apache_beam.io.utils module

Utils for the io library. * CountingSource: Subclass of iobase.BoundedSource. Used on transforms.ptransform_test.test_read_metrics.

class apache_beam.io.utils.CountingSource(count)[source]

Bases: apache_beam.io.iobase.BoundedSource

estimate_size()[source]
get_range_tracker(start_position, stop_position)[source]
read(range_tracker)[source]
split(desired_bundle_size, start_position=None, stop_position=None)[source]
default_output_coder()

Coder that should be used for the records returned by the source.

Should be overridden by sources that produce objects that can be encoded more efficiently than pickling.

display_data()

Returns the display data associated to a pipeline component.

It should be reimplemented in pipeline components that wish to have static display data.

Returns:A dictionary containing key:value pairs. The value might be an integer, float or string value; a DisplayDataItem for values that have more data (e.g. short value, label, url); or a HasDisplayData instance that has more display data that should be picked up. For example:
{
  'key1': 'string_value',
  'key2': 1234,
  'key3': 3.14159265,
  'key4': DisplayDataItem('apache.org', url='http://apache.org'),
  'key5': subComponent
}
Return type:Dict[str, Any]
classmethod from_runner_api(fn_proto, context)

Converts from an FunctionSpec to a Fn object.

Prefer registering a urn with its parameter type and constructor.

is_bounded()
classmethod register_pickle_urn(pickle_urn)

Registers and implements the given urn via pickling.

classmethod register_urn(urn, parameter_type, fn=None)

Registers a urn with a constructor.

For example, if ‘beam:fn:foo’ had parameter type FooPayload, one could write RunnerApiFn.register_urn(‘bean:fn:foo’, FooPayload, foo_from_proto) where foo_from_proto took as arguments a FooPayload and a PipelineContext. This function can also be used as a decorator rather than passing the callable in as the final parameter.

A corresponding to_runner_api_parameter method would be expected that returns the tuple (‘beam:fn:foo’, FooPayload)

to_runner_api(context)

Returns an FunctionSpec encoding this Fn.

Prefer overriding self.to_runner_api_parameter.

to_runner_api_parameter(context)