apache_beam.io.restriction_trackers module
iobase.RestrictionTracker implementations provided with Apache Beam.
-
class apache_beam.io.restriction_trackers.OffsetRestrictionTracker(offset_range)
Bases: apache_beam.io.iobase.RestrictionTracker
An iobase.RestrictionTracker implementation for an offset range.
The offset range is represented as an OffsetRange.
-
class apache_beam.io.restriction_trackers.UnsplittableRestrictionTracker(underling_tracker)
Bases: apache_beam.io.iobase.RestrictionTracker
An iobase.RestrictionTracker that wraps another but does not split.
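The wrapping idea can be illustrated with a minimal, self-contained sketch. It does not use Apache Beam; ToyOffsetTracker and ToyUnsplittableTracker are hypothetical stand-ins, and signalling "no split" with None is an assumption of this sketch rather than a statement about Beam's behavior.

```python
class ToyOffsetTracker:
    """Hypothetical splittable tracker over the half-open range [start, stop)."""

    def __init__(self, start, stop):
        self.start, self.stop = start, stop
        self.last_claimed = None

    def try_claim(self, position):
        if position >= self.stop:
            return False
        self.last_claimed = position
        return True

    def try_split(self, fraction_of_remainder):
        # Split the unclaimed remainder and return (primary, residual) ranges.
        current = self.start - 1 if self.last_claimed is None else self.last_claimed
        split_at = current + 1 + int((self.stop - current - 1) * fraction_of_remainder)
        if split_at >= self.stop:
            return None
        residual = (split_at, self.stop)
        self.stop = split_at
        return (self.start, split_at), residual


class ToyUnsplittableTracker:
    """Delegates everything to the wrapped tracker except splitting."""

    def __init__(self, underlying):
        self._underlying = underlying

    def try_split(self, fraction_of_remainder):
        # Assumption of this sketch: refusing to split is reported as None.
        return None

    def __getattr__(self, name):
        # Only called for attributes not defined on the wrapper itself.
        return getattr(self._underlying, name)
```

With this sketch, claims pass through to the wrapped tracker while split requests leave its range untouched.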
-
check_done()
Checks whether the restriction has been fully processed.
Called by the SDK harness after the iterator returned by DoFn.process() has been fully read.
This method must raise a ValueError if any unclaimed work remains in the restriction when it is invoked. The exception raised must have an informative error message.
Implementing this API is required so that no data is lost during SDK processing.
Returns: True if the current restriction has been fully processed.
Raises: ValueError – if there is still any unclaimed work remaining.
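As a rough illustration of this contract, the sketch below shows a toy offset tracker whose check_done() raises a descriptive ValueError while work remains. ToyOffsetTracker is a hypothetical stand-in that does not depend on Apache Beam, and treating a failed claim attempt at or past the end of the range as "done" is a design choice of this sketch.

```python
class ToyOffsetTracker:
    """Hypothetical tracker over the half-open range [start, stop)."""

    def __init__(self, start, stop):
        self.start, self.stop = start, stop
        self.last_attempted = None

    def try_claim(self, position):
        # Record every attempt, successful or not.
        self.last_attempted = position
        return position < self.stop

    def check_done(self):
        if self.start >= self.stop:
            return True  # An empty range has nothing left to claim.
        if self.last_attempted is None or self.last_attempted < self.stop - 1:
            first_unclaimed = (
                self.start if self.last_attempted is None
                else self.last_attempted + 1)
            raise ValueError(
                f"Unclaimed work remains in [{first_unclaimed}, {self.stop}) "
                f"of restriction [{self.start}, {self.stop})")
        return True
```

The error message names both the unclaimed span and the full restriction, which is one way to satisfy the "informative error message" requirement.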
-
current_progress()
Returns a RestrictionProgress object representing the current progress.
Implementing this API is recommended: with better progress signals, the runner can do a better job of parallelizing work.
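One way to picture a progress signal for an offset range is the sketch below. ToyProgress and offset_progress are hypothetical names introduced for illustration and are not Beam's RestrictionProgress; counting the position currently being processed as remaining work is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyProgress:
    """Hypothetical progress record: units of work done and left."""
    completed: int
    remaining: int

    @property
    def fraction(self):
        total = self.completed + self.remaining
        return self.completed / total if total else 1.0


def offset_progress(start, stop, last_claimed):
    """Progress through [start, stop) given the last claimed position."""
    if last_claimed is None:
        return ToyProgress(completed=0, remaining=stop - start)
    # The claimed position itself is still in flight, so it counts as remaining.
    return ToyProgress(completed=last_claimed - start,
                       remaining=stop - last_claimed)
```

A runner with access to such numbers can estimate how much work is left and decide whether splitting the remainder is worthwhile.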
-
current_restriction()
Returns the current restriction.
Returns a restriction accurately describing the full range of work the current DoFn.process() call will do, including already completed work.
The restriction returned by this method may be updated dynamically due to concurrent invocation of other methods of the RestrictionTracker, for example split().
This API is required to be implemented.
Returns: a restriction object.
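The following self-contained sketch shows how the current restriction can shrink after a split while still covering work that was already completed. ToyOffsetTracker and its split_at helper are hypothetical and do not mirror Beam's splitting API.

```python
class ToyOffsetTracker:
    """Hypothetical tracker over the half-open range [start, stop)."""

    def __init__(self, start, stop):
        self.start, self.stop = start, stop
        self.last_claimed = None

    def current_restriction(self):
        # Always the full range this tracker is responsible for,
        # including positions that were already claimed.
        return (self.start, self.stop)

    def try_claim(self, position):
        if position >= self.stop:
            return False
        self.last_claimed = position
        return True

    def split_at(self, position):
        """Hand off [position, stop) and keep [start, position)."""
        first_unclaimed = (
            self.start if self.last_claimed is None else self.last_claimed + 1)
        if not first_unclaimed <= position < self.stop:
            return None  # Cannot split inside work that was already claimed.
        residual = (position, self.stop)
        self.stop = position
        return residual
```

After a successful split, positions in the residual range can no longer be claimed through this tracker.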
-
is_bounded()
Returns whether the amount of work represented by the current restriction is bounded.
The boundedness of the restriction determines the default behavior for truncating restrictions when a pipeline is being drained. If the restriction is bounded, the entire restriction will be processed; otherwise, the restriction will be processed until a checkpoint is possible.
This API is required to be implemented.
Returns: True if the restriction represents a finite amount of work. Otherwise, returns False.
-
try_claim(position)
Attempts to claim the block of work in the current restriction identified by the given position. Each claimed position MUST be a valid split point.
If this succeeds, the DoFn MUST execute the entire block of work. If it fails, DoFn.process() MUST return None without performing any additional work or emitting output (note that emitting output or performing work from DoFn.process() is also not allowed before the first call of this method).
This API is required to be implemented.
Parameters: position – the position to be claimed.
Returns: True if the position can be claimed as current_position. Otherwise, returns False.
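The claim-before-work rule can be sketched as a processing loop that emits output only after a successful claim and stops as soon as a claim fails. This is a simplified illustration without Apache Beam: ToyOffsetTracker and the process function are hypothetical, and a real DoFn.process() receives its tracker from the SDK.

```python
class ToyOffsetTracker:
    """Hypothetical tracker over the half-open range [start, stop)."""

    def __init__(self, start, stop):
        self.start, self.stop = start, stop
        self.last_claimed = None

    def try_claim(self, position):
        if position >= self.stop:
            return False
        self.last_claimed = position
        return True


def process(records, tracker):
    """Yield the records whose offsets are successfully claimed."""
    position = tracker.start
    # No output is produced before the first try_claim call.
    while tracker.try_claim(position):
        # The claim succeeded, so this block of work must be completed.
        yield records[position]
        position += 1
    # The claim failed: return without doing any further work.


records = list("abcdef")
emitted = list(process(records, ToyOffsetTracker(1, 4)))
```

Here only offsets 1, 2, and 3 are claimed, so the loop emits three records and then returns.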
-