apache_beam.runners.direct.helper_transforms module¶
-
class
apache_beam.runners.direct.helper_transforms.LiftedCombinePerKey(combine_fn, args, kwargs)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransformAn implementation of CombinePerKey that does mapper-side pre-combining.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:valuepairs. The value might be an integer, float or string value; aDisplayDataItemfor values that have more data (e.g. short value, label, url); or aHasDisplayDatainstance that has more display data that should be picked up. For example:{ 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a
PTransformwith a type-hint.Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.Raises: TypeError– If input_type_hint is not a valid type-hint. Seeapache_beam.typehints.typehints.validate_composite_type_param()for further details.Returns: A reference to the instance of this particular PTransformobject. This allows chaining type-hinting related methods.Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a
PTransformwith a type-hint.Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.Raises: TypeError– If type_hint is not a valid type-hint. Seevalidate_composite_type_param()for further details.Returns: A reference to the instance of this particular PTransformobject. This allows chaining type-hinting related methods.Return type: PTransform
-
-
class
apache_beam.runners.direct.helper_transforms.PartialGroupByKeyCombiningValues(combine_fn)[source]¶ Bases:
apache_beam.transforms.core.DoFnAggregates values into a per-key-window cache.
As bundles are in-memory-sized, we don’t bother flushing until the very end.
-
BundleFinalizerParam¶ alias of
apache_beam.transforms.core._BundleFinalizerParam
-
DoFnProcessParams= [ElementParam, SideInputParam, TimestampParam, WindowParam, <class 'apache_beam.transforms.core._WatermarkEstimatorParam'>, PaneInfoParam, <class 'apache_beam.transforms.core._BundleFinalizerParam'>, KeyParam, <class 'apache_beam.transforms.core._StateDoFnParam'>, <class 'apache_beam.transforms.core._TimerDoFnParam'>]¶
-
DynamicTimerTagParam= DynamicTimerTagParam¶
-
ElementParam= ElementParam¶
-
KeyParam= KeyParam¶
-
PaneInfoParam= PaneInfoParam¶
-
RestrictionParam¶ alias of
apache_beam.transforms.core._RestrictionDoFnParam
-
SideInputParam= SideInputParam¶
-
StateParam¶ alias of
apache_beam.transforms.core._StateDoFnParam
-
TimerParam¶ alias of
apache_beam.transforms.core._TimerDoFnParam
-
TimestampParam= TimestampParam¶
-
WatermarkEstimatorParam¶ alias of
apache_beam.transforms.core._WatermarkEstimatorParam
-
WindowParam= WindowParam¶
-
default_label()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:valuepairs. The value might be an integer, float or string value; aDisplayDataItemfor values that have more data (e.g. short value, label, url); or aHasDisplayDatainstance that has more display data that should be picked up. For example:{ 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
static
from_callable(fn)¶
-
classmethod
from_runner_api(fn_proto, context)¶ Converts from an FunctionSpec to a Fn object.
Prefer registering a urn with its parameter type and constructor.
-
get_function_arguments(func)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
infer_output_type(input_type)¶
-
classmethod
register_pickle_urn(pickle_urn)¶ Registers and implements the given urn via pickling.
-
classmethod
register_urn(urn, parameter_type, fn=None)¶ Registers a urn with a constructor.
For example, if ‘beam:fn:foo’ had parameter type FooPayload, one could write RunnerApiFn.register_urn(‘bean:fn:foo’, FooPayload, foo_from_proto) where foo_from_proto took as arguments a FooPayload and a PipelineContext. This function can also be used as a decorator rather than passing the callable in as the final parameter.
A corresponding to_runner_api_parameter method would be expected that returns the tuple (‘beam:fn:foo’, FooPayload)
-
to_runner_api(context)¶ Returns an FunctionSpec encoding this Fn.
Prefer overriding self.to_runner_api_parameter.
-
to_runner_api_parameter(context)¶
-
static
unbounded_per_element()¶ A decorator on process fn specifying that the fn performs an unbounded amount of work per input element.
-
with_input_types(*arg_hints, **kwarg_hints)¶
-
with_output_types(*arg_hints, **kwarg_hints)¶
-
-
class
apache_beam.runners.direct.helper_transforms.FinishCombine(combine_fn)[source]¶ Bases:
apache_beam.transforms.core.DoFnMerges partially combined results.
-
BundleFinalizerParam¶ alias of
apache_beam.transforms.core._BundleFinalizerParam
-
DoFnProcessParams= [ElementParam, SideInputParam, TimestampParam, WindowParam, <class 'apache_beam.transforms.core._WatermarkEstimatorParam'>, PaneInfoParam, <class 'apache_beam.transforms.core._BundleFinalizerParam'>, KeyParam, <class 'apache_beam.transforms.core._StateDoFnParam'>, <class 'apache_beam.transforms.core._TimerDoFnParam'>]¶
-
DynamicTimerTagParam= DynamicTimerTagParam¶
-
ElementParam= ElementParam¶
-
KeyParam= KeyParam¶
-
PaneInfoParam= PaneInfoParam¶
-
RestrictionParam¶ alias of
apache_beam.transforms.core._RestrictionDoFnParam
-
SideInputParam= SideInputParam¶
-
StateParam¶ alias of
apache_beam.transforms.core._StateDoFnParam
-
TimerParam¶ alias of
apache_beam.transforms.core._TimerDoFnParam
-
TimestampParam= TimestampParam¶
-
WatermarkEstimatorParam¶ alias of
apache_beam.transforms.core._WatermarkEstimatorParam
-
WindowParam= WindowParam¶
-
default_label()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:valuepairs. The value might be an integer, float or string value; aDisplayDataItemfor values that have more data (e.g. short value, label, url); or aHasDisplayDatainstance that has more display data that should be picked up. For example:{ 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
finish_bundle()¶ Called after a bundle of elements is processed on a worker.
-
static
from_callable(fn)¶
-
classmethod
from_runner_api(fn_proto, context)¶ Converts from an FunctionSpec to a Fn object.
Prefer registering a urn with its parameter type and constructor.
-
get_function_arguments(func)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
infer_output_type(input_type)¶
-
classmethod
register_pickle_urn(pickle_urn)¶ Registers and implements the given urn via pickling.
-
classmethod
register_urn(urn, parameter_type, fn=None)¶ Registers a urn with a constructor.
For example, if ‘beam:fn:foo’ had parameter type FooPayload, one could write RunnerApiFn.register_urn(‘bean:fn:foo’, FooPayload, foo_from_proto) where foo_from_proto took as arguments a FooPayload and a PipelineContext. This function can also be used as a decorator rather than passing the callable in as the final parameter.
A corresponding to_runner_api_parameter method would be expected that returns the tuple (‘beam:fn:foo’, FooPayload)
-
start_bundle()¶ Called before a bundle of elements is processed on a worker.
Elements to be processed are split into bundles and distributed to workers. Before a worker calls process() on the first element of its bundle, it calls this method.
-
to_runner_api(context)¶ Returns an FunctionSpec encoding this Fn.
Prefer overriding self.to_runner_api_parameter.
-
to_runner_api_parameter(context)¶
-
static
unbounded_per_element()¶ A decorator on process fn specifying that the fn performs an unbounded amount of work per input element.
-
with_input_types(*arg_hints, **kwarg_hints)¶
-
with_output_types(*arg_hints, **kwarg_hints)¶
-