apache_beam.dataframe.partitionings module
-
class apache_beam.dataframe.partitionings.Partitioning
Bases: object
A class representing a (consistent) partitioning of dataframe objects.
-
is_subpartitioning_of(other)
Returns whether self is a sub-partition of other.
Specifically, returns whether something partitioned by self is necessarily also partitioned by other.
-
-
class apache_beam.dataframe.partitionings.Index(levels=None)
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning by index (either fully or partially).
If the set of “levels” of the index to consider is not specified, the entire index is used.
These form a partial order, given by
Singleton() < Index([i]) < Index([i, j]) < … < Index() < Arbitrary()
The ordering is implemented via the is_subpartitioning_of method, where the examples on the right are subpartitionings of the examples on the left above.
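The chain above can be made concrete with a small, self-contained sketch. The `Toy` class below is a hypothetical stand-in, not the Beam implementation; in particular, the rule that one index partitioning is a subpartitioning of another when its levels include all of the other's levels is an assumption inferred from `Index([i]) < Index([i, j])`.

```python
class Toy:
    """Hypothetical stand-in for a partitioning; NOT a Beam class."""

    def __init__(self, kind, levels=None):
        self.kind = kind      # "singleton", "index", or "arbitrary"
        self.levels = levels  # None means "the entire index"

    def is_subpartitioning_of(self, other):
        if other.kind == "singleton":
            return True   # every element sits at or to the right of Singleton()
        if self.kind == "arbitrary":
            return True   # Arbitrary() is the right-most element of the chain
        if self.kind == "singleton" or other.kind == "arbitrary":
            return False
        # Both are index partitionings (assumed rule: compare level sets).
        if self.levels is None:
            return True
        if other.levels is None:
            return False
        return set(other.levels) <= set(self.levels)


singleton = Toy("singleton")
by_i = Toy("index", ["i"])
by_j = Toy("index", ["j"])
by_i_j = Toy("index", ["i", "j"])
full_index = Toy("index")
arbitrary = Toy("arbitrary")

# Elements on the right are subpartitionings of elements on the left.
chain = [singleton, by_i, by_i_j, full_index, arbitrary]
chain_holds = all(
    right.is_subpartitioning_of(left)
    for pos, left in enumerate(chain)
    for right in chain[pos:]
)
```

Under these assumptions the order is only partial: `by_i` and `by_j` are not comparable in either direction, which is why the text speaks of a partial order rather than a total one.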
-
test_partition_fn(df)
-
-
class apache_beam.dataframe.partitionings.Singleton
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning of all the data into a single partition.
-
test_partition_fn(df)
-
-
class apache_beam.dataframe.partitionings.Arbitrary
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning imposing no constraints on the actual partitioning.
-
partition_fn(df, num_partitions)
A callable that actually performs the partitioning of a Frame df.
This will be invoked via a FlatMap in conjunction with a GroupByKey to achieve the desired partitioning.
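The flat-map-then-group pattern described here can be imitated in plain Python. This is a minimal sketch under stated assumptions, not Beam code: a "frame" is modeled as a list of `(index, row)` pairs, `toy_partition_fn` and `flat_map_then_group` are hypothetical names, and assigning rows by `index % num_partitions` is just one simple way to pick a partition.

```python
from collections import defaultdict


def toy_partition_fn(frame, num_partitions):
    """Hypothetical partition_fn: yields (partition key, piece of frame)."""
    pieces = defaultdict(list)
    for index, row in frame:
        # Assumed rule: rows with the same index always get the same key.
        pieces[index % num_partitions].append((index, row))
    yield from pieces.items()


def flat_map_then_group(frames, num_partitions):
    # "FlatMap" step: apply the function to each input and flatten the results.
    keyed = [
        pair
        for frame in frames
        for pair in toy_partition_fn(frame, num_partitions)
    ]
    # Grouping step: gather all pieces that share a partition key.
    grouped = defaultdict(list)
    for key, piece in keyed:
        grouped[key].extend(piece)
    return dict(grouped)


frames = [[(0, "a"), (1, "b")], [(2, "c"), (1, "d")]]
partitions = flat_map_then_group(frames, num_partitions=2)
```

In this toy run, both rows with index 1 end up in the same partition even though they arrived in different input frames, which is the property an index-based partitioning is meant to guarantee.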
-