apache_beam.io.filesystemio module
Utilities for FileSystem implementations.
-
class apache_beam.io.filesystemio.Downloader
Bases: object
Download interface for a single file.
Implementations should support random access reads.
-
size
Size of file to download.
-
get_range(start, end)
Retrieve a given byte range [start, end) from this download.
Range must be in this form:
- 0 <= start < end: Fetch the bytes from start to end.
Parameters:
- start – (int) Initial byte offset.
- end – (int) Final byte offset, exclusive.
Returns: (string) A buffer containing the requested data.
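A minimal sketch of the interface described above, assuming a hypothetical in-memory backend. The name `InMemoryDownloader` is illustrative, and the class deliberately does not subclass the Beam class so that the snippet runs without Beam installed. Rejecting an invalid range with `ValueError` is also an assumption, not documented behavior.

```python
class InMemoryDownloader:
    """Hypothetical stand-in exposing the documented `size` and `get_range`."""

    def __init__(self, data):
        self._data = data

    @property
    def size(self):
        # Size of the "file" to download.
        return len(self._data)

    def get_range(self, start, end):
        # The documented contract is 0 <= start < end, with `end` exclusive.
        # Raising ValueError on a bad range is an assumption of this sketch.
        if not 0 <= start < end:
            raise ValueError("expected 0 <= start < end")
        return self._data[start:end]


downloader = InMemoryDownloader(b"hello world")
first_word = downloader.get_range(0, 5)
```

Because `end` is exclusive, consecutive calls such as `get_range(0, 5)` and `get_range(5, 11)` cover the data without overlap.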
-
-
class apache_beam.io.filesystemio.Uploader
Bases: object
Upload interface for a single file.
-
class apache_beam.io.filesystemio.DownloaderStream(downloader, read_buffer_size=8192, mode='rb')
Bases: io.RawIOBase
Provides a stream interface for Downloader objects.
Initializes the stream.
Parameters:
- downloader – (Downloader) Filesystem dependent implementation.
- read_buffer_size – (int) Buffer size to use during read operations.
- mode – (string) Python mode attribute for this stream.
-
readinto(b)
Read up to len(b) bytes into b.
Returns number of bytes read (0 for EOF).
Parameters: b – (bytearray/memoryview) Buffer to read into.
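The following sketch illustrates how a `readinto` with these semantics can be built on top of a ranged reader. It assumes hypothetical stand-in classes (`RangeSource`, `RangeStream`) rather than the Beam classes, and it is not the Beam implementation. Only standard `io.RawIOBase` behavior is relied on: `read()` and `readall()` are served by `readinto`.

```python
import io


class RangeSource:
    """Hypothetical stand-in for a Downloader (size plus get_range)."""

    def __init__(self, data):
        self._data = data

    @property
    def size(self):
        return len(self._data)

    def get_range(self, start, end):
        return self._data[start:end]


class RangeStream(io.RawIOBase):
    """Illustrative raw stream that fills buffers from a RangeSource."""

    def __init__(self, source):
        super().__init__()
        self._source = source
        self._position = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Read up to len(b) bytes, never past the end of the source.
        start = self._position
        end = min(start + len(b), self._source.size)
        if start >= end:
            return 0  # EOF
        chunk = self._source.get_range(start, end)
        b[:len(chunk)] = chunk
        self._position += len(chunk)
        return len(chunk)


stream = RangeStream(RangeSource(b"abcdefghij"))
buf = bytearray(4)
count = stream.readinto(buf)   # fills buf with the first four bytes
middle = stream.read(3)        # RawIOBase.read() delegates to readinto()
tail = stream.readall()        # reads until readinto() reports EOF
at_eof = stream.readinto(bytearray(4))
```

Implementing only `readinto` keeps the subclass small, since `io.RawIOBase` derives the other read methods from it.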
-
seek(offset, whence=0)
Set the stream’s current offset.
Note that if the new offset is out of bounds, it is adjusted to either 0 or EOF.
Parameters:
- offset – seek offset as number.
- whence – seek mode. Supported modes are os.SEEK_SET (absolute seek), os.SEEK_CUR (seek relative to the current position), and os.SEEK_END (seek relative to the end, offset should be negative).
Raises: ValueError – When this stream is closed or if whence is invalid.
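The clamping rule described for `seek` can be modeled as a small pure function. This is a sketch of the documented behavior only; `clamped_seek` and its `position` and `size` arguments are hypothetical names and not part of the Beam API.

```python
import os


def clamped_seek(position, size, offset, whence=os.SEEK_SET):
    """Return the new offset, adjusted into the range [0, size]."""
    if whence == os.SEEK_SET:
        target = offset                 # absolute seek
    elif whence == os.SEEK_CUR:
        target = position + offset      # relative to the current position
    elif whence == os.SEEK_END:
        target = size + offset          # relative to the end
    else:
        raise ValueError("Invalid whence: %r" % (whence,))
    # Out-of-bounds targets are adjusted to either 0 or EOF.
    return max(0, min(target, size))


absolute = clamped_seek(position=0, size=100, offset=40)
before_start = clamped_seek(position=40, size=100, offset=-50, whence=os.SEEK_CUR)
from_end = clamped_seek(position=0, size=100, offset=-10, whence=os.SEEK_END)
past_end = clamped_seek(position=0, size=100, offset=500)
```

Here `before_start` is adjusted to 0 and `past_end` to 100, rather than either call raising an error.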
-
tell()
Tell the stream’s current offset.
Returns: current offset in reading this stream.
Raises: ValueError – When this stream is closed.
-
close()
Flush and close the IO object.
This method has no effect if the file is already closed.
-
closed
-
fileno()
Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
-
flush()
Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
-
isatty()
Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
-
read()
-
readline()
Read and return a line from the stream.
If size is specified, at most size bytes will be read.
The line terminator is always b'\n' for binary files; for text files, the newlines argument to open can be used to select the line terminator(s) recognized.
-
readlines()
Return a list of lines from the stream.
hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
-
truncate()
Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
-
writable()
Return whether object was opened for writing.
If False, write() will raise OSError.
-
write()
-
writelines()
Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
-
class apache_beam.io.filesystemio.UploaderStream(uploader, mode='wb')
Bases: io.RawIOBase
Provides a stream interface for Uploader objects.
Initializes the stream.
Parameters:
- uploader – (Uploader) Filesystem dependent implementation.
- mode – (string) Python mode attribute for this stream.
-
write(b)
Write bytes from b.
Returns number of bytes written (<= len(b)).
Parameters: b – (memoryview) Buffer with data to write.
-
close()
Complete the upload and close this stream.
This method has no effect if the stream is already closed.
Raises: Any error encountered by the uploader.
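A rough sketch of the `write` and `close` semantics above, using hypothetical stand-ins instead of the Beam classes. The uploader method names `put` and `finish` are assumptions, since this page does not list the methods of `Uploader`. `ListUploader` and `SketchUploadStream` are likewise illustrative names.

```python
import io


class ListUploader:
    """Hypothetical uploader that records chunks in memory."""

    def __init__(self):
        self.chunks = []
        self.finished = False

    def put(self, data):       # assumed method name
        self.chunks.append(bytes(data))

    def finish(self):          # assumed method name
        self.finished = True


class SketchUploadStream(io.RawIOBase):
    """Illustrative write-only stream that forwards data to an uploader."""

    def __init__(self, uploader):
        super().__init__()
        self._uploader = uploader

    def writable(self):
        return True

    def write(self, b):
        if self.closed:
            raise ValueError("write to closed stream")
        self._uploader.put(b)
        return len(b)          # number of bytes written

    def close(self):
        # Complete the upload once; later calls have no effect.
        if not self.closed:
            self._uploader.finish()
        super().close()


uploader = ListUploader()
stream = SketchUploadStream(uploader)
written = stream.write(b"hello ")
stream.write(memoryview(b"world"))
stream.close()
stream.close()  # no effect: the stream is already closed
```

In this sketch any exception raised by `finish()` propagates out of `close()`, which matches the documented "Raises" note in spirit.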
-
closed
-
fileno()
Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
-
flush()
Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
-
isatty()
Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
-
read()
-
readable()
Return whether object was opened for reading.
If False, read() will raise OSError.
-
readall()
Read until EOF, using multiple read() calls.
-
readinto()
-
readline()
Read and return a line from the stream.
If size is specified, at most size bytes will be read.
The line terminator is always b'\n' for binary files; for text files, the newlines argument to open can be used to select the line terminator(s) recognized.
-
readlines()
Return a list of lines from the stream.
hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
-
seek()
Change stream position.
Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:
- 0 – start of stream (the default); offset should be zero or positive
- 1 – current stream position; offset may be negative
- 2 – end of stream; offset is usually negative
Return the new absolute position.
-
seekable()
Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
-
truncate()
Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
-
writelines()
Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
-
class apache_beam.io.filesystemio.PipeStream(recv_pipe)
Bases: object
A class that presents a pipe connection as a readable stream.
Not thread-safe.
Remembers the last size bytes read and allows rewinding the stream by that amount exactly. See BEAM-6380 for more.
-
read(size)
Read data from the wrapped pipe connection.
Parameters: size – Number of bytes to read. Actual number of bytes read is always equal to size unless EOF is reached.
Returns: data read as str.
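The "always equal to size unless EOF is reached" guarantee can be sketched as a loop that keeps receiving chunks until enough bytes have arrived. This is an illustration under assumptions, not the Beam implementation: `FakeConnection` is a hypothetical stand-in that mimics the `recv_bytes()` method of a `multiprocessing` connection (raising `EOFError` when exhausted), and `read_exactly` is an illustrative helper name. Rewinding is not modeled.

```python
class FakeConnection:
    """Hypothetical stand-in for the receiving end of a pipe."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def recv_bytes(self):
        # Mimics a connection whose sending end has been closed.
        if not self._chunks:
            raise EOFError
        return self._chunks.pop(0)


def read_exactly(conn, size, leftover=b""):
    """Return (data, leftover); data is shorter than size only at EOF."""
    parts = []
    remaining = leftover
    needed = size
    while needed > 0:
        if not remaining:
            try:
                remaining = conn.recv_bytes()
            except EOFError:
                break  # no more data: return what was collected
        take = remaining[:needed]
        parts.append(take)
        remaining = remaining[len(take):]
        needed -= len(take)
    return b"".join(parts), remaining


conn = FakeConnection([b"abc", b"defgh"])
first, rest = read_exactly(conn, 5)          # spans two received chunks
second, rest = read_exactly(conn, 10, rest)  # hits EOF before 10 bytes
```

The first call returns exactly five bytes even though they arrive in two chunks; the second returns only the three bytes left before EOF.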
-