apache_beam.typehints.schemas module

Support for mapping python types to proto Schemas and back again.

Imposes a mapping between common Python types and Beam portable schemas (https://s.apache.org/beam-schemas):

Python              Schema
np.int8     <-----> BYTE
np.int16    <-----> INT16
np.int32    <-----> INT32
np.int64    <-----> INT64
int         ------> INT64
np.float32  <-----> FLOAT
np.float64  <-----> DOUBLE
float       ------> DOUBLE
bool        <-----> BOOLEAN
str/unicode <-----> STRING
bytes       <-----> BYTES
ByteString  ------> BYTES
Timestamp   <-----> LogicalType(urn="beam:logical_type:micros_instant:v1")
Mapping     <-----> MapType
Sequence    <-----> ArrayType
NamedTuple  <-----> RowType
beam.Row    ------> RowType

Note that some of these mappings are provided as conveniences, but they are lossy and will not survive a roundtrip from python to Beam schemas and back. For example, the Python type int will map to INT64 in Beam schemas but converting that back to a Python type will yield np.int64.

nullable=True on a Beam FieldType is represented in Python by wrapping the type in Optional.

class apache_beam.typehints.schemas.SchemaTypeRegistry[source]

Bases: object

add(typing, schema)[source]
get_typing_by_id(unique_id)[source]
get_schema_by_id(unique_id)[source]
apache_beam.typehints.schemas.named_fields_to_schema(names_and_types)[source]
apache_beam.typehints.schemas.named_fields_from_schema(schema)[source]
apache_beam.typehints.schemas.typing_to_runner_api(type_)[source]
apache_beam.typehints.schemas.typing_from_runner_api(fieldtype_proto)[source]
apache_beam.typehints.schemas.named_tuple_from_schema(schema)[source]
apache_beam.typehints.schemas.named_tuple_to_schema(named_tuple)[source]
apache_beam.typehints.schemas.schema_from_element_type(element_type)[source]

Get a schema for the given PCollection element_type.

Returns schema as a list of (name, python_type) tuples

apache_beam.typehints.schemas.named_fields_from_element_type(element_type)[source]
class apache_beam.typehints.schemas.LogicalTypeRegistry[source]

Bases: object

add(urn, logical_type)[source]
get_logical_type_by_urn(urn)[source]
get_urn_by_logial_type(logical_type)[source]
get_logical_type_by_language_type(representation_type)[source]
class apache_beam.typehints.schemas.LogicalType[source]

Bases: typing.Generic

classmethod urn()[source]

Return the URN used to identify this logical type

classmethod language_type()[source]

Return the language type this LogicalType encodes.

The returned type should match LanguageT

classmethod representation_type()[source]

Return the type of the representation this LogicalType uses to encode the language type.

The returned type should match RepresentationT

classmethod argument_type()[source]

Return the type of the argument used for variations of this LogicalType.

The returned type should match ArgT

argument()[source]

Return the argument for this instance of the LogicalType.

to_representation_type()[source]

Convert an instance of LanguageT to RepresentationT.

to_language_type()[source]

Convert an instance of RepresentationT to LanguageT.

classmethod register_logical_type(logical_type_cls)[source]

Register an implementation of LogicalType.

classmethod from_typing(typ)[source]

Construct an instance of a registered LogicalType implementation given a typing.

Raises ValueError if no registered LogicalType implementation can encode the given typing.

classmethod from_runner_api(logical_type_proto)[source]

Construct an instance of a registered LogicalType implementation given a proto LogicalType.

Raises ValueError if no LogicalType registered for the given URN.

class apache_beam.typehints.schemas.NoArgumentLogicalType[source]

Bases: apache_beam.typehints.schemas.LogicalType

classmethod argument_type()[source]
argument()[source]
classmethod from_runner_api(logical_type_proto)

Construct an instance of a registered LogicalType implementation given a proto LogicalType.

Raises ValueError if no LogicalType registered for the given URN.

classmethod from_typing(typ)

Construct an instance of a registered LogicalType implementation given a typing.

Raises ValueError if no registered LogicalType implementation can encode the given typing.

classmethod language_type()

Return the language type this LogicalType encodes.

The returned type should match LanguageT

classmethod register_logical_type(logical_type_cls)

Register an implementation of LogicalType.

classmethod representation_type()

Return the type of the representation this LogicalType uses to encode the language type.

The returned type should match RepresentationT

to_language_type()

Convert an instance of RepresentationT to LanguageT.

to_representation_type()

Convert an instance of LanguageT to RepresentationT.

classmethod urn()

Return the URN used to identify this logical type

class apache_beam.typehints.schemas.MicrosInstantRepresentation(seconds, micros)

Bases: tuple

Create new instance of MicrosInstantRepresentation(seconds, micros)

count()

Return number of occurrences of value.

index()

Return first index of value.

Raises ValueError if the value is not present.

micros

Alias for field number 1

seconds

Alias for field number 0