apache_beam.runners.dataflow.test_dataflow_runner module¶
Wrapper of Beam runners that’s built for running and verifying e2e tests.
-
class
apache_beam.runners.dataflow.test_dataflow_runner.TestDataflowRunner(cache=None)[source]¶ Bases:
apache_beam.runners.dataflow.dataflow_runner.DataflowRunner-
wait_until_in_state(expected_state, timeout=600)[source]¶ Wait until Dataflow pipeline enters a certain state.
-
class
CombineValuesPTransformOverride¶ Bases:
apache_beam.pipeline.PTransformOverrideA
PTransformOverrideforCombineValues.The DataflowRunner expects that the CombineValues PTransform acts as a primitive. So this override replaces the CombineValues with a primitive.
-
get_replacement_inputs(applied_ptransform)¶ Provides inputs that will be passed to the replacement PTransform.
Parameters: applied_ptransform – Original AppliedPTransform containing the PTransform to be replaced. Returns: An iterable of PValues that will be passed to the expand() method of the replacement PTransform.
-
get_replacement_transform(ptransform)¶
-
get_replacement_transform_for_applied_ptransform(applied_ptransform)¶ Provides a runner specific override for a given AppliedPTransform.
Parameters: applied_ptransform – AppliedPTransform containing the PTransform to be replaced. Returns: A PTransform that will be the replacement for the PTransform inside the AppliedPTransform given as an argument.
-
matches(applied_ptransform)¶
-
-
class
CreatePTransformOverride¶ Bases:
apache_beam.pipeline.PTransformOverrideA
PTransformOverrideforCreatein streaming mode.-
get_replacement_inputs(applied_ptransform)¶ Provides inputs that will be passed to the replacement PTransform.
Parameters: applied_ptransform – Original AppliedPTransform containing the PTransform to be replaced. Returns: An iterable of PValues that will be passed to the expand() method of the replacement PTransform.
-
get_replacement_transform(ptransform)¶ Provides a runner specific override for a given PTransform.
Parameters: ptransform – PTransform to be replaced. Returns: A PTransform that will be the replacement for the PTransform given as an argument.
-
get_replacement_transform_for_applied_ptransform(applied_ptransform)¶
-
matches(applied_ptransform)¶
-
-
class
JrhReadPTransformOverride¶ Bases:
apache_beam.pipeline.PTransformOverrideA
PTransformOverrideforRead(BoundedSource)-
get_replacement_inputs(applied_ptransform)¶ Provides inputs that will be passed to the replacement PTransform.
Parameters: applied_ptransform – Original AppliedPTransform containing the PTransform to be replaced. Returns: An iterable of PValues that will be passed to the expand() method of the replacement PTransform.
-
get_replacement_transform(ptransform)¶ Provides a runner specific override for a given PTransform.
Parameters: ptransform – PTransform to be replaced. Returns: A PTransform that will be the replacement for the PTransform given as an argument.
-
get_replacement_transform_for_applied_ptransform(applied_ptransform)¶
-
matches(applied_ptransform)¶
-
-
class
NativeReadPTransformOverride¶ Bases:
apache_beam.pipeline.PTransformOverrideA
PTransformOverrideforReadusing native sources.The DataflowRunner expects that the Read PTransform using native sources act as a primitive. So this override replaces the Read with a primitive.
-
get_replacement_inputs(applied_ptransform)¶ Provides inputs that will be passed to the replacement PTransform.
Parameters: applied_ptransform – Original AppliedPTransform containing the PTransform to be replaced. Returns: An iterable of PValues that will be passed to the expand() method of the replacement PTransform.
-
get_replacement_transform(ptransform)¶
-
get_replacement_transform_for_applied_ptransform(applied_ptransform)¶ Provides a runner specific override for a given AppliedPTransform.
Parameters: applied_ptransform – AppliedPTransform containing the PTransform to be replaced. Returns: A PTransform that will be the replacement for the PTransform inside the AppliedPTransform given as an argument.
-
matches(applied_ptransform)¶
-
-
class
ReadPTransformOverride¶ Bases:
apache_beam.pipeline.PTransformOverrideA
PTransformOverrideforRead(BoundedSource)-
get_replacement_inputs(applied_ptransform)¶ Provides inputs that will be passed to the replacement PTransform.
Parameters: applied_ptransform – Original AppliedPTransform containing the PTransform to be replaced. Returns: An iterable of PValues that will be passed to the expand() method of the replacement PTransform.
-
get_replacement_transform(ptransform)¶ Provides a runner specific override for a given PTransform.
Parameters: ptransform – PTransform to be replaced. Returns: A PTransform that will be the replacement for the PTransform given as an argument.
-
get_replacement_transform_for_applied_ptransform(applied_ptransform)¶
-
matches(applied_ptransform)¶
-
-
add_pcoll_with_auto_sharding(applied_ptransform)¶
-
apply(transform, input, options)¶
-
apply_GroupByKey(transform, pcoll, options)¶
-
apply_PTransform(transform, input, options)¶
-
static
byte_array_to_json_string(raw_bytes)¶ Implements org.apache.beam.sdk.util.StringUtils.byteArrayToJsonString.
-
static
combinefn_visitor()¶
-
classmethod
deserialize_windowing_strategy(serialized_data)¶
-
static
flatten_input_visitor()¶
-
get_default_gcp_region()¶ Get a default value for Google Cloud region according to https://cloud.google.com/compute/docs/gcloud-compute/#default-properties. If no default can be found, returns None.
-
get_pcoll_with_auto_sharding()¶
-
is_fnapi_compatible()¶
-
static
json_string_to_byte_array(encoded_string)¶ Implements org.apache.beam.sdk.util.StringUtils.jsonStringToByteArray.
-
static
poll_for_job_completion(runner, result, duration)¶ Polls for the specified job to finish running (successfully or not).
Updates the result with the new job information before returning.
Parameters:
-
run(transform, options=None)¶ Run the given transform or callable with this runner.
Blocks until the pipeline is complete. See also PipelineRunner.run_async.
-
run_CombineValuesReplacement(transform_node, options)¶
-
run_ExternalTransform(transform_node, options)¶
-
run_Flatten(transform_node, options)¶
-
run_GroupByKey(transform_node, options)¶
-
run_Impulse(transform_node, options)¶
-
run_ParDo(transform_node, options)¶
-
run_Read(transform_node, options)¶
-
run_TestStream(transform_node, options)¶
-
run__NativeWrite(transform_node, options)¶
-
run_async(transform, options=None)¶ Run the given transform or callable with this runner.
May return immediately, executing the pipeline in the background. The returned result object can be queried for progress, and wait_until_finish may be called to block until completion.
-
run_transform(transform_node, options)¶ Runner callback for a pipeline.run call.
Parameters: transform_node – transform node for the transform to run. A concrete implementation of the Runner class must implement run_Abc for some class Abc in the method resolution order for every non-composite transform Xyz in the pipeline.
-
classmethod
serialize_windowing_strategy(windowing, default_environment)¶
-
static
side_input_visitor(use_unified_worker=False, use_fn_api=False, deterministic_key_coders=True)¶
-
visit_transforms(pipeline, options)¶
-