apache_beam.io.localfilesystem module
Local File system implementation for accessing files on disk.
-
class apache_beam.io.localfilesystem.LocalFileSystem(pipeline_options)
Bases: apache_beam.io.filesystem.FileSystem
A local FileSystem implementation for accessing files on disk.
Parameters: pipeline_options – Instance of PipelineOptions or dict of options and values (like RuntimeValueProvider.runtime_options).
-
join(basepath, *paths)
Join two or more pathname components for the filesystem.
Parameters:
- basepath – string path of the first component of the path
- paths – path components to be added
Returns: full path after combining all the passed components
-
split(path)
Splits the given path into two parts.
Splits the path into a pair (head, tail) such that tail contains the last component of the path and head contains everything up to that.
Parameters: path – path as a string
Returns: a pair of path components as strings.
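As an illustration of the (head, tail) contract, and of how split() undoes the last step of join(), the sketch below uses the standard library's posixpath module as a stand-in. It assumes POSIX-style path rules and does not call LocalFileSystem itself; the example paths are made up.

```python
import posixpath

# Join a base path with further components (mirrors join(basepath, *paths)).
full = posixpath.join('/tmp', 'beam', 'out.txt')

# Split into (head, tail): tail is the last component, head is everything before it.
head, tail = posixpath.split(full)

# Joining the two halves again gives back the original path.
rejoined = posixpath.join(head, tail)
```
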
-
mkdirs(path)
Recursively create directories for the provided path.
Parameters: path – string path of the directory structure that should be created
Raises: IOError – if leaf directory already exists.
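The documented behavior (create every missing parent, fail if the leaf already exists) can be sketched with os.makedirs. The helper name mkdirs_sketch is hypothetical, and this is an assumption about how a local implementation might behave, not Beam's actual code.

```python
import os
import tempfile


def mkdirs_sketch(path):
    """Create path and any missing parents; raise IOError if the leaf exists."""
    try:
        os.makedirs(path)
    except OSError as err:
        # Surface the failure under the exception type the reference names.
        raise IOError('could not create %r: %s' % (path, err))


with tempfile.TemporaryDirectory() as tmp:
    leaf = os.path.join(tmp, 'a', 'b', 'c')
    mkdirs_sketch(leaf)
    created = os.path.isdir(leaf)
    try:
        mkdirs_sketch(leaf)  # the leaf now exists
        second_call_failed = False
    except IOError:
        second_call_failed = True
```
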
-
create(path, mime_type='application/octet-stream', compression_type='auto')
Returns a write channel for the given file path.
Parameters:
- path – string path of the file object to be written to the system
- mime_type – MIME type to specify the type of content in the file object
- compression_type – Type of compression to be used for this object
Returns: file handle with a close function for the user to use
-
open(path, mime_type='application/octet-stream', compression_type='auto')
Returns a read channel for the given file path.
Parameters:
- path – string path of the file object to be read from the system
- mime_type – MIME type to specify the type of content in the file object
- compression_type – Type of compression to be used for this object
Returns: file handle with a close function for the user to use
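A write channel from create() and a read channel from open() are meant to round-trip. The sketch below shows that pairing with plain files and gzip. It rests on two assumptions that this page does not state: that 'auto' chooses the codec from the file extension, and that handles are binary. The names create_sketch, open_sketch and _channel, and the labels 'gzip' and 'uncompressed', are hypothetical; mime_type is left out for brevity.

```python
import gzip
import os
import tempfile


def _channel(path, mode, compression_type):
    # Assumption: 'auto' picks the codec from the file extension.
    if compression_type == 'auto':
        compression_type = 'gzip' if path.endswith('.gz') else 'uncompressed'
    if compression_type == 'gzip':
        return gzip.open(path, mode)
    return open(path, mode)


def create_sketch(path, compression_type='auto'):
    """Return a binary write channel for path."""
    return _channel(path, 'wb', compression_type)


def open_sketch(path, compression_type='auto'):
    """Return a binary read channel for path."""
    return _channel(path, 'rb', compression_type)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'records.txt.gz')
    with create_sketch(path) as writer:
        writer.write(b'first line\n')
    with open_sketch(path) as reader:
        round_trip = reader.read()
    with open(path, 'rb') as raw:
        magic = raw.read(2)  # gzip streams start with bytes 0x1f 0x8b
```

Both channels are used as context managers here, which relies only on the documented guarantee that the returned handle can be closed.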
-
copy(source_file_names, destination_file_names)
Recursively copy the file tree from the source to the destination.
Parameters:
- source_file_names – list of source file objects that need to be copied
- destination_file_names – list of destination paths for the copied objects
Raises: BeamIOError – if any of the copy operations fail
-
rename(source_file_names, destination_file_names)
Rename the files at the source list to the destination list. Source and destination lists should be of the same size.
Parameters:
- source_file_names – List of file paths that need to be moved
- destination_file_names – List of destination paths for the files
Raises: BeamIOError – if any of the rename operations fail
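copy() and rename() share one shape: two parallel lists, one operation per (source, destination) pair, and a single error if any pair fails. A minimal sketch of that shape follows, using shutil and os.rename. BatchIOError is a hypothetical stand-in for BeamIOError, and apply_pairwise and copy_one are illustrative helpers, not part of Beam.

```python
import os
import shutil
import tempfile


class BatchIOError(IOError):
    """Hypothetical stand-in for BeamIOError; keeps one exception per failed pair."""

    def __init__(self, msg, details):
        super().__init__(msg)
        self.details = details


def apply_pairwise(operation, sources, destinations):
    """Run operation(src, dst) for each pair and report all failures together."""
    if len(sources) != len(destinations):
        raise ValueError('source and destination lists must be the same size')
    failures = {}
    for src, dst in zip(sources, destinations):
        try:
            operation(src, dst)
        except OSError as err:
            failures[(src, dst)] = err
    if failures:
        raise BatchIOError('%d operation(s) failed' % len(failures), failures)


def copy_one(src, dst):
    # Directories are copied recursively, plain files individually.
    if os.path.isdir(src):
        shutil.copytree(src, dst)
    else:
        shutil.copy2(src, dst)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'a.txt')
    with open(src, 'w') as handle:
        handle.write('hello')
    copied = os.path.join(tmp, 'b.txt')
    moved = os.path.join(tmp, 'c.txt')
    apply_pairwise(copy_one, [src], [copied])
    apply_pairwise(os.rename, [copied], [moved])
    both_present = os.path.exists(src) and os.path.exists(moved)
    old_name_gone = not os.path.exists(copied)
    try:
        apply_pairwise(copy_one, [os.path.join(tmp, 'missing')],
                       [os.path.join(tmp, 'x')])
        failed_pairs = 0
    except BatchIOError as err:
        failed_pairs = len(err.details)
```

Collecting failures instead of stopping at the first one lets a caller see exactly which pairs need retrying.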
-
exists(path)
Check if the provided path exists on the FileSystem.
Parameters: path – string path that needs to be checked.
Returns: boolean flag indicating if path exists
-
size(path)
Get size of path on the FileSystem.
Parameters: path – string path in question.
Returns: int size of path according to the FileSystem.
Raises: BeamIOError – if path doesn’t exist.
-
last_updated(path)
Get the time the path was last updated on the FileSystem, as UNIX Epoch time in seconds.
Parameters: path – string path of file.
Returns: float UNIX Epoch time
Raises: BeamIOError – if path doesn’t exist.
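For a local disk, the size and last-updated values described above correspond naturally to os.path.getsize and os.path.getmtime. The sketch below assumes that mapping; stat_sketch is a hypothetical helper, and IOError stands in for BeamIOError.

```python
import os
import tempfile


def stat_sketch(path):
    """Return (size in bytes, last-modified epoch seconds) for an existing path."""
    if not os.path.exists(path):
        raise IOError('path does not exist: %r' % path)
    return os.path.getsize(path), os.path.getmtime(path)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'data.bin')
    with open(path, 'wb') as handle:
        handle.write(b'12345')
    size, mtime = stat_sketch(path)
    try:
        stat_sketch(os.path.join(tmp, 'missing.bin'))
        missing_failed = False
    except IOError:
        missing_failed = True
```
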
-
checksum(path)
Fetch checksum metadata of a file on the FileSystem.
Parameters: path – string path of a file.
Returns: string containing file size.
Raises: BeamIOError – if path isn’t a file or doesn’t exist.
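Because the returned checksum is the file size as a string, two files of equal length are indistinguishable by it. The sketch below makes that consequence concrete; size_checksum is a hypothetical helper written from the description above, with IOError standing in for BeamIOError.

```python
import os
import tempfile


def size_checksum(path):
    """Return the file size as a string; reject directories and missing paths."""
    if not os.path.isfile(path):
        raise IOError('not a file: %r' % path)
    return str(os.path.getsize(path))


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, 'first.txt')
    second = os.path.join(tmp, 'second.txt')
    with open(first, 'wb') as handle:
        handle.write(b'abc')
    with open(second, 'wb') as handle:
        handle.write(b'xyz')  # different content, same length
    first_sum = size_checksum(first)
    second_sum = size_checksum(second)
    try:
        size_checksum(tmp)  # a directory is not a file
        directory_rejected = False
    except IOError:
        directory_rejected = True
```

A size-based value is cheap to compute but only detects changes that alter the length of the file.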
-
delete(paths)
Deletes files or directories at the provided paths. Directories will be deleted recursively.
Parameters: paths – list of paths that give the file objects to be deleted
Raises: BeamIOError – if any of the delete operations fail
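A rough sketch of the delete semantics (files removed individually, directories removed recursively, one error raised after all paths are attempted) is shown below. delete_sketch is a hypothetical helper built on shutil.rmtree and os.remove, and IOError stands in for BeamIOError.

```python
import os
import shutil
import tempfile


def delete_sketch(paths):
    """Delete each path, recursing into directories; raise if any deletion fails."""
    failures = {}
    for path in paths:
        try:
            if os.path.isdir(path):
                shutil.rmtree(path)
            else:
                os.remove(path)
        except OSError as err:
            failures[path] = err
    if failures:
        raise IOError('delete failed for: %s' % sorted(failures))


with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, 'out', 'nested'))
    single = os.path.join(tmp, 'single.txt')
    open(single, 'w').close()
    delete_sketch([os.path.join(tmp, 'out'), single])
    remaining = os.listdir(tmp)
    try:
        delete_sketch([single])  # already gone
        missing_path_failed = False
    except IOError:
        missing_path_failed = True
```
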
-
CHUNK_SIZE = 1
-
classmethod get_all_plugin_paths()
Get full import paths of the BeamPlugin subclass.
-
classmethod get_all_subclasses()
Get all the subclasses of the BeamPlugin class.
-
match(patterns, limits=None)
Find all matching paths to the patterns provided.
See also
Patterns ending with ‘/’ or ‘\’ will be appended with ‘*’.
Parameters:
- patterns – list of string for the file path pattern to match against
- limits – list of maximum number of responses that need to be fetched
Returns: list of MatchResult objects.
Raises: BeamIOError – if any of the pattern match operations fail
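The interplay of patterns, limits, and the trailing-separator rule can be sketched with the standard glob module. This is an approximation under stated assumptions: match_sketch is a hypothetical helper, it returns plain lists of paths rather than MatchResult objects, and glob's wildcard rules are not identical to those of translate_pattern().

```python
import glob
import os
import tempfile


def match_sketch(patterns, limits=None):
    """Return one list of matching paths per pattern, honoring per-pattern limits."""
    if limits is None:
        limits = [None] * len(patterns)
    if len(patterns) != len(limits):
        raise ValueError('patterns and limits must have the same length')
    results = []
    for pattern, limit in zip(patterns, limits):
        # A pattern ending in a separator means "everything directly inside".
        if pattern.endswith(('/', '\\')):
            pattern += '*'
        matches = sorted(glob.glob(pattern))
        results.append(matches[:limit])  # a limit of None keeps every match
    return results


with tempfile.TemporaryDirectory() as tmp:
    for name in ('a.txt', 'b.txt', 'c.log'):
        open(os.path.join(tmp, name), 'w').close()
    all_files, limited = match_sketch(
        [tmp + '/', os.path.join(tmp, '*.txt')], limits=[None, 1])
    first_txt = os.path.join(tmp, 'a.txt')
```
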
-
match_files(file_metas, pattern)
Filter FileMetadata objects by pattern.
Parameters:
- file_metas (list of FileMetadata) – Files to consider when matching
- pattern (str) – File pattern
See also
Returns: Generator of matching FileMetadata
-
static translate_pattern(pattern)
Translate a pattern to a regular expression. There is no way to quote meta-characters.
Pattern syntax:
The pattern syntax is based on the fnmatch syntax, with the following differences:
- * is equivalent to [^/\]* rather than .*.
- ** is equivalent to .*.
See also
match() uses this method.
This method is based on Python 2.7’s fnmatch.translate. The code in this method is licensed under PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2.
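To make the two wildcard rules concrete, here is a simplified translator written from the description above. It is not Beam's implementation: translate_pattern_sketch is a hypothetical name, character classes such as [abc] are deliberately left out, and ? is mapped to a single character as in fnmatch.

```python
import re


def translate_pattern_sketch(pattern):
    """Translate a glob-style pattern into a regular expression string."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        ch = pattern[i]
        i += 1
        if ch == '*':
            if i < n and pattern[i] == '*':
                # '**' may cross directory separators.
                out.append('.*')
                i += 1
            else:
                # A single '*' stops at '/' or '\'.
                out.append(r'[^/\\]*')
        elif ch == '?':
            out.append('.')
        else:
            out.append(re.escape(ch))
    # Anchor at the end so the whole path must match.
    return ''.join(out) + r'\Z'


single = re.compile(translate_pattern_sketch('data/*.txt'))
double = re.compile(translate_pattern_sketch('data/**.txt'))

same_dir = single.match('data/a.txt') is not None
single_crosses_dirs = single.match('data/sub/a.txt') is not None
double_crosses_dirs = double.match('data/sub/a.txt') is not None
```

With these patterns, the single-star form matches only files directly under data/, while the double-star form also reaches into subdirectories.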
-