apache_beam.io.filesystemio module¶
Utilities for FileSystem
implementations.
-
class
apache_beam.io.filesystemio.
Downloader
[source]¶ Bases:
object
Download interface for a single file.
Implementations should support random access reads.
-
size
¶ Size of file to download.
-
get_range
(start, end)[source]¶ Retrieve a given byte range [start, end) from this download.
- Range must be in this form:
- 0 <= start < end: Fetch the bytes from start to end.
Parameters: - start – (int) Initial byte offset.
- end – (int) Final byte offset, exclusive.
Returns: (string) A buffer containing the requested data.
-
-
class
apache_beam.io.filesystemio.
Uploader
[source]¶ Bases:
object
Upload interface for a single file.
-
class
apache_beam.io.filesystemio.
DownloaderStream
(downloader, read_buffer_size=8192, mode='rb')[source]¶ Bases:
io.RawIOBase
Provides a stream interface for Downloader objects.
Initializes the stream.
Parameters: - downloader – (Downloader) Filesystem dependent implementation.
- read_buffer_size – (int) Buffer size to use during read operations.
- mode – (string) Python mode attribute for this stream.
-
readinto
(b)[source]¶ Read up to len(b) bytes into b.
Returns number of bytes read (0 for EOF).
Parameters: b – (bytearray/memoryview) Buffer to read into.
-
seek
(offset, whence=0)[source]¶ Set the stream’s current offset.
Note if the new offset is out of bound, it is adjusted to either 0 or EOF.
Parameters: - offset – seek offset as number.
- whence – seek mode. Supported modes are os.SEEK_SET (absolute seek), os.SEEK_CUR (seek relative to the current position), and os.SEEK_END (seek relative to the end, offset should be negative).
Raises: ValueError
– When this stream is closed or if whence is invalid.
-
tell
()[source]¶ Tell the stream’s current offset.
Returns: current offset in reading this stream. Raises: ValueError
– When this stream is closed.
-
close
()¶ Flush and close the IO object.
This method has no effect if the file is already closed.
-
closed
¶
-
fileno
()¶ Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
-
flush
()¶ Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
-
isatty
()¶ Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
-
read
()¶
-
readline
()¶ Read and return a line from the stream.
If size is specified, at most size bytes will be read.
The line terminator is always b’n’ for binary files; for text files, the newlines argument to open can be used to select the line terminator(s) recognized.
-
readlines
()¶ Return a list of lines from the stream.
hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
-
truncate
()¶ Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
-
writable
()¶ Return whether object was opened for writing.
If False, write() will raise OSError.
-
write
()¶
-
writelines
()¶ Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
-
class
apache_beam.io.filesystemio.
UploaderStream
(uploader, mode='wb')[source]¶ Bases:
io.RawIOBase
Provides a stream interface for Uploader objects.
Initializes the stream.
Parameters: - uploader – (Uploader) Filesystem dependent implementation.
- mode – (string) Python mode attribute for this stream.
-
write
(b)[source]¶ Write bytes from b.
Returns number of bytes written (<= len(b)).
Parameters: b – (memoryview) Buffer with data to write.
-
close
()[source]¶ Complete the upload and close this stream.
This method has no effect if the stream is already closed.
Raises: Any error encountered by the uploader.
-
closed
¶
-
fileno
()¶ Returns underlying file descriptor if one exists.
OSError is raised if the IO object does not use a file descriptor.
-
flush
()¶ Flush write buffers, if applicable.
This is not implemented for read-only and non-blocking streams.
-
isatty
()¶ Return whether this is an ‘interactive’ stream.
Return False if it can’t be determined.
-
read
()¶
-
readable
()¶ Return whether object was opened for reading.
If False, read() will raise OSError.
-
readall
()¶ Read until EOF, using multiple read() call.
-
readinto
()¶
-
readline
()¶ Read and return a line from the stream.
If size is specified, at most size bytes will be read.
The line terminator is always b’n’ for binary files; for text files, the newlines argument to open can be used to select the line terminator(s) recognized.
-
readlines
()¶ Return a list of lines from the stream.
hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.
-
seek
()¶ Change stream position.
Change the stream position to the given byte offset. The offset is interpreted relative to the position indicated by whence. Values for whence are:
- 0 – start of stream (the default); offset should be zero or positive
- 1 – current stream position; offset may be negative
- 2 – end of stream; offset is usually negative
Return the new absolute position.
-
seekable
()¶ Return whether object supports random access.
If False, seek(), tell() and truncate() will raise OSError. This method may need to do a test seek().
-
truncate
()¶ Truncate file to size bytes.
File pointer is left unchanged. Size defaults to the current IO position as reported by tell(). Returns the new size.
-
writelines
()¶ Write a list of lines to stream.
Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.
-
class
apache_beam.io.filesystemio.
PipeStream
(recv_pipe)[source]¶ Bases:
object
A class that presents a pipe connection as a readable stream.
Not thread-safe.
Remembers the last
size
bytes read and allows rewinding the stream by that amount exactly. See BEAM-6380 for more.-
read
(size)[source]¶ Read data from the wrapped pipe connection.
Parameters: size – Number of bytes to read. Actual number of bytes read is always equal to size unless EOF is reached. Returns: data read as str.
-