apache_beam.transforms.combiners module¶
A library of basic combiner PTransform subclasses.
-
class
apache_beam.transforms.combiners.Mean[source]¶ Bases:
object
Combiners for computing arithmetic means of elements.
-
class
Globally(has_defaults=True)[source]¶ Bases:
apache_beam.transforms.combiners.CombinerWithoutDefaults
combiners.Mean.Globally computes the arithmetic mean of the elements.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
-
class
PerKey(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Mean.PerKey finds the means of the values for each key.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
-
class
apache_beam.transforms.combiners.Count[source]¶ Bases:
object
Combiners for counting elements.
-
class
Globally(has_defaults=True)[source]¶ Bases:
apache_beam.transforms.combiners.CombinerWithoutDefaults
combiners.Count.Globally counts the total number of elements.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
-
class
PerKey(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Count.PerKey counts how many elements each unique key has.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
-
class
PerElement(label=None)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
combiners.Count.PerElement counts how many times each element occurs.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
-
class
apache_beam.transforms.combiners.Top[source]¶ Bases:
object
Combiners for obtaining extremal elements.
-
class
Of(n, **kwargs)[source]¶ Bases:
apache_beam.transforms.combiners.CombinerWithoutDefaults
Obtain a list of the compare-most N elements in a PCollection.
This transform will retrieve the n greatest elements in the PCollection to which it is applied, where “greatest” is determined by the optional ‘key’ and ‘reverse’ keyword arguments.
Creates a global Top operation.
The arguments ‘key’ and ‘reverse’ may be passed as keyword arguments, and have the same meaning as for Python’s sort functions.
Parameters: - pcoll – PCollection to process.
- n – number of elements to extract from pcoll.
- **kwargs – may contain ‘key’ and/or ‘reverse’
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
class
PerKey(n, **kwargs)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
Identifies the compare-most N elements associated with each key.
This transform will produce a PCollection mapping unique keys in the input PCollection to the n greatest elements with which they are associated, where “greatest” is determined by the optional ‘key’ and ‘reverse’ keyword arguments passed to the initializer.
Creates a per-key Top operation.
The arguments ‘key’ and ‘reverse’ may be passed as keyword arguments, and have the same meaning as for Python’s sort functions.
Parameters: - pcoll – PCollection to process.
- n – number of elements to extract from pcoll.
- **kwargs – may contain ‘key’ and/or ‘reverse’
-
expand(pcoll)[source]¶ Expands the transform.
Raises TypeCheckError: If the output type of the input PCollection is not compatible with Tuple[A, B].
Parameters: pcoll – PCollection to process Returns: the PCollection containing the result.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
static
Largest(pcoll, n, has_defaults=True)[source]¶ Obtain a list of the greatest N elements in a PCollection.
-
static
Smallest(pcoll, n, has_defaults=True)[source]¶ Obtain a list of the least N elements in a PCollection.
-
class
apache_beam.transforms.combiners.Sample[source]¶ Bases:
object
Combiners for sampling n elements without replacement.
-
class
FixedSizeGlobally(n)[source]¶ Bases:
apache_beam.transforms.combiners.CombinerWithoutDefaults
Sample n elements from the input PCollection without replacement.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_type_hints()¶
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
-
class
FixedSizePerKey(n)[source]¶ Bases:
apache_beam.transforms.ptransform.PTransform
Sample n elements associated with each key without replacement.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_type_hints()¶
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
-
class
apache_beam.transforms.combiners.ToList(has_defaults=True)[source]¶ Bases:
apache_beam.transforms.combiners.CombinerWithoutDefaults
A global CombineFn that condenses a PCollection into a single list.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
-
class
apache_beam.transforms.combiners.ToDict(has_defaults=True)[source]¶ Bases:
apache_beam.transforms.combiners.CombinerWithoutDefaults
A global CombineFn that condenses a PCollection into a single dict.
PCollections should consist of 2-tuples, notionally (key, value) pairs. If multiple values are associated with the same key, only one of the values will be present in the resulting dict.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize type hints in this order: - Using self.default_type_hints(). - Using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
-
class
apache_beam.transforms.combiners.ToSet(has_defaults=True)[source]¶ Bases: apache_beam.transforms.combiners.CombinerWithoutDefaults
A global CombineFn that condenses a PCollection into a set.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize them in this order: first using self.default_type_hints(), then using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with the transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
-
class
apache_beam.transforms.combiners.Latest[source]¶ Bases: object
Combiners for computing the latest element.
-
class
Globally(has_defaults=True)[source]¶ Bases: apache_beam.transforms.combiners.CombinerWithoutDefaults
Compute the element with the latest timestamp from a PCollection.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize them in this order: first using self.default_type_hints(), then using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with the transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_defaults(has_defaults=True)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
without_defaults()¶
-
-
class
PerKey(label=None)[source]¶ Bases: apache_beam.transforms.ptransform.PTransform
Compute elements with the latest timestamp for each key from a keyed PCollection.
-
annotations() → Dict[str, Union[bytes, str, google.protobuf.message.Message]]¶
-
default_label()¶
-
default_type_hints()¶
-
display_data()¶ Returns the display data associated to a pipeline component.
It should be reimplemented in pipeline components that wish to have static display data.
Returns: A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example: { 'key1': 'string_value', 'key2': 1234, 'key3': 3.14159265, 'key4': DisplayDataItem('apache.org', url='http://apache.org'), 'key5': subComponent }
Return type: Dict[str, Any]
-
classmethod
from_runner_api(proto, context)¶
-
get_type_hints()¶ Gets and/or initializes type hints for this object.
If type hints have not been set, attempts to initialize them in this order: first using self.default_type_hints(), then using self.__class__ type hints.
-
get_windowing(inputs)¶ Returns the window function to be associated with the transform’s output.
By default most transforms just return the windowing function associated with the input PCollection (or the first input if several).
-
infer_output_type(unused_input_type)¶
-
label¶
-
pipeline= None¶
-
classmethod
register_urn(urn, parameter_type, constructor=None)¶
-
runner_api_requires_keyed_input()¶
-
side_inputs= ()¶
-
to_runner_api(context, has_parts=False, **extra_kwargs)¶
-
to_runner_api_parameter(unused_context)¶
-
to_runner_api_pickled(unused_context)¶
-
type_check_inputs(pvalueish)¶
-
type_check_inputs_or_outputs(pvalueish, input_or_output)¶
-
type_check_outputs(pvalueish)¶
-
with_input_types(input_type_hint)¶ Annotates the input type of a PTransform with a type-hint.
Parameters: input_type_hint (type) – An instance of an allowed built-in type, a custom class, or an instance of a TypeConstraint.
Raises: TypeError – If input_type_hint is not a valid type-hint. See apache_beam.typehints.typehints.validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
with_output_types(type_hint)¶ Annotates the output type of a PTransform with a type-hint.
Parameters: type_hint (type) – An instance of an allowed built-in type, a custom class, or a TypeConstraint.
Raises: TypeError – If type_hint is not a valid type-hint. See validate_composite_type_param() for further details.
Returns: A reference to the instance of this particular PTransform object. This allows chaining type-hinting related methods.
Return type: PTransform
-
-
class