apache_beam.dataframe.partitionings module
-
class apache_beam.dataframe.partitionings.Partitioning
Bases: object
A class representing a (consistent) partitioning of dataframe objects.
-
is_subpartitioning_of(other)
Returns whether self is a sub-partition of other.
Specifically, returns whether something partitioned by self is necessarily also partitioned by other.
-
-
class apache_beam.dataframe.partitionings.Index(levels=None)
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning by index (either fully or partially).
If the set of “levels” of the index to consider is not specified, the entire index is used.
These form a partial order, given by
Singleton() < Index([i]) < Index([i, j]) < … < Index() < Arbitrary()
The ordering is implemented via the is_subpartitioning_of method, where the examples on the right are subpartitionings of the examples on the left above.
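The ordering described above can be illustrated with a small stand-alone model. This is only a sketch built from the stated ordering, not Beam's implementation: the classes ToySingleton, ToyIndex and ToyArbitrary are hypothetical stand-ins, and the rule that an index partitioning on more levels is a subpartitioning of one on fewer levels is an assumption read off the chain Index([i]) < Index([i, j]).

```python
# Toy model of the partial order; these classes are hypothetical stand-ins,
# not the real apache_beam.dataframe.partitionings classes.

class ToyPartitioning:
    def is_subpartitioning_of(self, other):
        raise NotImplementedError


class ToySingleton(ToyPartitioning):
    # Leftmost element of the chain: a subpartitioning only of itself.
    def is_subpartitioning_of(self, other):
        return isinstance(other, ToySingleton)


class ToyIndex(ToyPartitioning):
    def __init__(self, levels=None):
        # levels=None stands for "the entire index".
        self.levels = None if levels is None else frozenset(levels)

    def is_subpartitioning_of(self, other):
        if isinstance(other, ToySingleton):
            return True
        if isinstance(other, ToyIndex):
            if self.levels is None:
                return True   # the entire index refines any subset of levels
            if other.levels is None:
                return False  # a subset of levels does not refine the entire index
            # Assumed rule: self refines other when it uses all of other's levels.
            return other.levels <= self.levels
        return False


class ToyArbitrary(ToyPartitioning):
    # Rightmost element of the chain: a subpartitioning of everything.
    def is_subpartitioning_of(self, other):
        return True


# Walk the chain left to right and check each neighbour pair.
chain = [
    ToySingleton(),
    ToyIndex(["i"]),
    ToyIndex(["i", "j"]),
    ToyIndex(),
    ToyArbitrary(),
]
chain_ok = all(
    right.is_subpartitioning_of(left) for left, right in zip(chain, chain[1:])
)
```

In this model every element of the chain is a subpartitioning of its left neighbour, while the reverse direction fails, for example ToyIndex(["i"]).is_subpartitioning_of(ToyIndex(["i", "j"])) is False.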
-
test_partition_fn(df)
-
-
class apache_beam.dataframe.partitionings.Singleton
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning of all the data into a single partition.
-
test_partition_fn(df)
-
-
class apache_beam.dataframe.partitionings.Arbitrary
Bases: apache_beam.dataframe.partitionings.Partitioning
A partitioning imposing no constraints on the actual partitioning.
-
partition_fn(df, num_partitions)
A callable that actually performs the partitioning of a Frame df.
This will be invoked via a FlatMap in conjunction with a GroupByKey to achieve the desired partitioning.
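To make the flat-map-then-group pattern concrete, here is a minimal pure-Python sketch. It rests on several assumptions: a frame is modelled as a list of (index_value, payload) tuples rather than a real DataFrame, toy_partition_fn is a hypothetical function that hashes the index value with zlib.crc32, and the FlatMap and grouping steps are simulated with plain loops instead of Beam transforms.

```python
import zlib
from collections import defaultdict


def toy_partition_fn(rows, num_partitions):
    """Yield (partition_id, rows_in_that_partition) pairs for one chunk.

    Hypothetical stand-in for a partition_fn: rows is a list of
    (index_value, payload) tuples, not a real DataFrame.
    """
    buckets = defaultdict(list)
    for index_value, payload in rows:
        # crc32 gives a hash that is stable across processes,
        # unlike the built-in hash() for strings.
        digest = zlib.crc32(repr(index_value).encode("utf-8"))
        buckets[digest % num_partitions].append((index_value, payload))
    for partition_id in range(num_partitions):
        yield partition_id, buckets.get(partition_id, [])


# Two input chunks, as if the data arrived already split across workers.
chunks = [
    [("a", 1), ("b", 2)],
    [("a", 3), ("c", 4)],
]
num_partitions = 3

# "FlatMap" step: each chunk emits one keyed piece per partition.
keyed = [
    pair for chunk in chunks for pair in toy_partition_fn(chunk, num_partitions)
]

# Grouping step: collect all pieces that share a partition id.
grouped = defaultdict(list)
for partition_id, piece in keyed:
    grouped[partition_id].extend(piece)
```

After grouping, rows that share an index value (the two "a" rows here) end up under the same partition id, which is the property a partitioning by index is meant to provide.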
-