Turi Create
4.0
#include <core/storage/sframe_data/sarray.hpp>
Public Types | |
typedef sarray_reader< T > | reader_type |
The reader type. | |
typedef swriter_impl::output_iterator< T > | iterator |
The iterator type which get_output_iterator returns. | |
typedef T | value_type |
The type contained in the sarray. | |
Public Member Functions | |
sarray ()=default | |
sarray (sarray &&other) | |
Move constructor. | |
sarray (const sarray &other) | |
Copy constructor. | |
sarray & | operator= (const sarray &other) |
Assignment operator. | |
sarray & | operator= (sarray &&other) |
sarray (std::string sidx_or_directory) | |
sarray (const flexible_type &value, size_t size, size_t num_segments=SFRAME_DEFAULT_NUM_SEGMENTS, flex_type_enum type=flex_type_enum::UNDEFINED) | |
void | open_for_read (index_file_information info) |
void | open_for_read (std::string sidx_file) |
void | open_for_write (size_t num_segments=SFRAME_DEFAULT_NUM_SEGMENTS, bool disable_padding=false) |
void | open_for_write (std::string sidx_file, size_t num_segments=SFRAME_DEFAULT_NUM_SEGMENTS) |
bool | is_opened_for_read () const |
bool | is_opened_for_write () const |
std::string | get_index_file () const |
sarray_group_format_writer< T > * | get_writer () |
bool | get_metadata (std::string key, std::string &val) const |
std::pair< bool, std::string > | get_metadata (std::string key) const |
size_t | size () const |
std::unique_ptr< reader_type > | get_reader () const |
std::unique_ptr< reader_type > | get_reader (size_t num_segments) const |
std::unique_ptr< reader_type > | get_reader (const std::vector< size_t > &segment_lengths) const |
size_t | num_segments () const |
size_t | segment_length (size_t i) const |
const index_file_information | get_index_info () const |
sarray | append (const sarray &other) const |
std::shared_ptr< sarray > | clone (size_t nsegments=0) const |
void | save (oarchive &oarc) const |
void | load (iarchive &iarc) |
void | try_compact () |
bool | set_num_segments (size_t numseg) |
iterator | get_output_iterator (size_t segmentid) |
void | close () |
bool | set_metadata (std::string key, std::string val) |
flex_type_enum | get_type () const |
void | set_type (flex_type_enum type) |
void | set_segment (size_t segmentid, const std::string &segment_file, size_t segment_size) |
void | save (std::string index_file) const |
template<> | |
sarray< flexible_type >::iterator | get_output_iterator (size_t segmentid) |
Gets an output iterator to the specified segment. | |
The SArray represents an immutable, on-disk sequence of objects of type T.
The SArray is an immutable sequence of objects of type T, and is internally represented as a collection of files. The sequence is cut up into a collection of segments (not necessarily of equal length), where each segment covers a disjoint subset of the sequence. Each segment can then be read in parallel. An SArray is referenced on disk by a single ".sidx" file, which lists the file names, one file for each segment.
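To make the segment layout concrete, here is a minimal sketch of how a global row index maps to a segment and an offset within it when segments have unequal lengths. The function `locate_row` is a hypothetical helper written for this illustration and is not part of Turi Create.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not a Turi Create function): given the length of each
// segment, find which segment holds global row `row` and the offset inside
// that segment. Returns {segment_id, offset}. If `row` is past the end of the
// sequence, segment_id equals segment_lengths.size().
std::pair<std::size_t, std::size_t>
locate_row(const std::vector<std::size_t>& segment_lengths, std::size_t row) {
  for (std::size_t seg = 0; seg < segment_lengths.size(); ++seg) {
    if (row < segment_lengths[seg]) return {seg, row};
    row -= segment_lengths[seg];  // skip past this segment
  }
  return {segment_lengths.size(), 0};
}
```

Because the segments are disjoint and ordered, each row belongs to exactly one segment, which is what allows segments to be read independently of each other.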
The SArray is write-once, read-many. The SArray can be opened for writing once, after which it is read-only.
To open an existing sarray on disk for reading, call open_for_read() with the path to its ".sidx" index file, or pass that path to the constructor.
Note that the type of the array on disk is NOT checked (though it probably should be).
To open an sarray for writing, call open_for_write(), optionally giving the location of the index file and the number of segments.
When the array is opened for writing, it can be written into using get_output_iterator(), which returns an output iterator into a given segment.
The get_output_iterator() function can be called concurrently, but each individual output iterator is not safe for concurrent use. After close() is called, the sarray becomes a read-only array, and is equivalent to having called array.open_for_read(...).
To read from the sarray, get_reader() is used.
Each reader provides read access to the SArray. Multiple readers can be obtained, as each has its own distinct file handles which are closed as the reader goes out of scope. See the documentation for sarray_reader for details.
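The write-once, read-many lifecycle described above can be sketched with a small in-memory model. This is not `turi::sarray`: `toy_sarray` is a hypothetical stand-in that borrows the method names documented on this page so that the example compiles without the library, and `read_segment()` is an invented substitute for `get_reader()`, whose reader interface is documented separately under `sarray_reader`.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <stdexcept>
#include <vector>

// Toy in-memory stand-in (NOT turi::sarray) that mimics the documented
// lifecycle: unopened -> open for writing -> closed and read-only.
template <typename T>
class toy_sarray {
 public:
  using iterator = std::back_insert_iterator<std::vector<T>>;

  void open_for_write(std::size_t num_segments) {
    if (state_ != state::unopened) throw std::logic_error("already initialized");
    segments_.assign(num_segments, std::vector<T>());
    state_ = state::writing;
  }

  // One output iterator per segment; at() throws on an invalid segment id.
  iterator get_output_iterator(std::size_t segmentid) {
    if (state_ != state::writing) throw std::logic_error("not open for writing");
    return std::back_inserter(segments_.at(segmentid));
  }

  void close() {
    if (state_ != state::writing) throw std::logic_error("not open for writing");
    state_ = state::reading;  // from here on the array is read-only
  }

  bool is_opened_for_read() const { return state_ == state::reading; }
  bool is_opened_for_write() const { return state_ == state::writing; }

  std::size_t size() const {
    std::size_t n = 0;
    for (const auto& seg : segments_) n += seg.size();
    return n;
  }

  // Hypothetical replacement for get_reader(): read access to one segment.
  const std::vector<T>& read_segment(std::size_t segmentid) const {
    if (state_ != state::reading) throw std::logic_error("not open for reading");
    return segments_.at(segmentid);
  }

 private:
  enum class state { unopened, writing, reading };
  state state_ = state::unopened;
  std::vector<std::vector<T>> segments_;
};

// Writes 0..4 into segment 0 and 5..9 into segment 1, then closes the array.
inline toy_sarray<int> make_example() {
  toy_sarray<int> array;
  array.open_for_write(2);
  auto out0 = array.get_output_iterator(0);
  auto out1 = array.get_output_iterator(1);
  for (int i = 0; i < 5; ++i) { *out0 = i; ++out0; }
  for (int i = 5; i < 10; ++i) { *out1 = i; ++out1; }
  array.close();
  return array;
}
```

With the real class the same sequence of calls would be open_for_write(), get_output_iterator() per segment, close(), then get_reader(). In the toy model, asking for an output iterator after close() throws, which mirrors the read-only state described above.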
The sarray<flexible_type> has additional capabilities.
Definition at line 30 of file gl_sarray.hpp.
|
default |
Default constructor. Does nothing; use open_for_read or open_for_write after construction to read or create an sarray.
|
inline explicit |
Attempts to construct an sarray which reads from the given index file. If the index cannot be opened, an exception is thrown.
Definition at line 198 of file sarray.hpp.
|
inline |
Create an sarray of given value and size.
Definition at line 205 of file sarray.hpp.
|
inline |
Appends another SArray of the same type to the current SArray, returning a new sarray without destroying the other array. Both SArrays can be empty, but neither can be opened for writing.
Definition at line 458 of file sarray.hpp.
|
inline |
Return a new sarray that contains a copy of the data in the current array.
Definition at line 488 of file sarray.hpp.
|
inline virtual |
Closes the array. The array must first be opened for writing. close() also implicitly closes all segments. After the writer is closed, no segments can be written. The SArray becomes readable with the get_reader() function only once the array is closed.
Implements turi::swriter_base< swriter_impl::output_iterator< T > >.
Definition at line 605 of file sarray.hpp.
|
inline |
Returns the location of the index file of the sarray.
Definition at line 340 of file sarray.hpp.
|
inline |
Returns all the index information of the array.
Definition at line 448 of file sarray.hpp.
|
inline |
Reads the value of a key associated with the sarray. Returns true on success, false on failure.
Definition at line 359 of file sarray.hpp.
|
inline |
Reads the value of a key associated with the sarray. Returns a pair of (true, value) on success, and (false, empty_string) on failure.
Definition at line 370 of file sarray.hpp.
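The two get_metadata overloads report the same lookup in two shapes. The sketch below shows one way the pair-returning form can be layered on the out-parameter form. It uses a plain `std::map` as a hypothetical stand-in for the array's metadata store and free functions instead of member functions, so it is an illustration of the calling convention and not the library's implementation.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the array's metadata store.
using metadata_map = std::map<std::string, std::string>;

// Out-parameter form: returns true and fills `val` on success,
// returns false and leaves `val` untouched on failure.
bool get_metadata(const metadata_map& meta, const std::string& key,
                  std::string& val) {
  auto it = meta.find(key);
  if (it == meta.end()) return false;
  val = it->second;
  return true;
}

// Pair form: (true, value) on success, (false, empty string) on failure.
std::pair<bool, std::string> get_metadata(const metadata_map& meta,
                                          const std::string& key) {
  std::string val;
  const bool ok = get_metadata(meta, key, val);
  return {ok, ok ? val : std::string()};
}
```

The pair form is convenient when the result is consumed in a single expression; the out-parameter form avoids constructing a string when the key is missing.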
|
inline virtual |
Return an output iterator which can be used to write data to the segment. Array must be first opened for writing. The iterator (iterator) is of the output iterator type and has value_type T.
The iterator is invalid once the segment is closed (See close). Accessing the iterator after the writer is destroyed is undefined behavior.
Throws an exception if the array is invalid (there is an error opening or writing files). segmentid must also be a valid segment ID; an exception is thrown otherwise.
When T is a flexible_type, the output iterator performs type checking.
Implements turi::swriter_base< swriter_impl::output_iterator< T > >.
Definition at line 756 of file sarray.hpp.
|
inline |
Gets an sarray reader object using the segmentation produced by the actual file segments on disk.
Definition at line 396 of file sarray.hpp.
|
inline |
Gets an sarray reader object with num_segments number of logical segments.
Definition at line 407 of file sarray.hpp.
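As an illustration of what a logical segmentation into num_segments pieces can look like, the sketch below divides a row count as evenly as possible. The function `even_segment_lengths` is hypothetical, and the even split is an assumption made for the example; this page does not state how the library chooses the logical segment boundaries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a Turi Create function): split `total` rows into
// `num_segments` logical segments whose lengths differ by at most one.
std::vector<std::size_t> even_segment_lengths(std::size_t total,
                                              std::size_t num_segments) {
  std::vector<std::size_t> lengths(num_segments, 0);
  if (num_segments == 0) return lengths;
  const std::size_t base = total / num_segments;
  const std::size_t extra = total % num_segments;  // first `extra` segments get one more row
  for (std::size_t i = 0; i < num_segments; ++i) {
    lengths[i] = base + (i < extra ? 1 : 0);
  }
  return lengths;
}
```

For example, 10 rows over 3 logical segments gives lengths 4, 3 and 3 under this scheme, independent of how many physical segment files exist on disk.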
|
inline |
Gets an sarray reader object with a custom segment layout. segment_lengths must sum up to the same length as the original array.
Definition at line 419 of file sarray.hpp.
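The requirement that segment_lengths sum to the length of the array can be checked before requesting a reader. The following sketch shows that check and the starting row of each custom segment. Both functions are hypothetical helpers for illustration and are not part of the sarray interface.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: true if the custom layout covers exactly `array_size` rows.
bool is_valid_layout(const std::vector<std::size_t>& segment_lengths,
                     std::size_t array_size) {
  return std::accumulate(segment_lengths.begin(), segment_lengths.end(),
                         std::size_t{0}) == array_size;
}

// Hypothetical helper: the global index of the first row of each segment
// (an exclusive prefix sum of the lengths).
std::vector<std::size_t> segment_offsets(
    const std::vector<std::size_t>& segment_lengths) {
  std::vector<std::size_t> offsets;
  offsets.reserve(segment_lengths.size());
  std::size_t running = 0;
  for (std::size_t len : segment_lengths) {
    offsets.push_back(running);
    running += len;
  }
  return offsets;
}
```

A caller could compare the layout against size() first, for instance `is_valid_layout(lengths, array.size())`, and only then pass `lengths` to get_reader().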
|
inline |
Returns the type of the SArray (as set by swriter<flexible_type>::set_type). If the type of the SArray was not set, this returns flex_type_enum::UNDEFINED, in which case each row can be of arbitrary type.
This function should only be used for sarray<flexible_type> and will fail fatally otherwise.
Definition at line 643 of file sarray.hpp.
|
inline |
Returns the underlying writer of the sarray.
Definition at line 348 of file sarray.hpp.
|
inline |
Returns true if the array is opened for reading, i.e. get_reader() will succeed.
Definition at line 324 of file sarray.hpp.
|
inline |
Returns true if the array is opened for writing, i.e. get_output_iterator() will succeed.
Definition at line 333 of file sarray.hpp.
|
inline |
SArray deserializer. iarc must be associated with a directory. Loads from the next prefix inside the directory.
Definition at line 523 of file sarray.hpp.
|
inline virtual |
Return the number of segments in the array.
Implements turi::swriter_base< swriter_impl::output_iterator< T > >.
Definition at line 431 of file sarray.hpp.
|
inline |
Initializes the SArray with an index info. If the SArray is already initialized, this will throw an exception.
Definition at line 232 of file sarray.hpp.
|
inline |
Initializes the SArray with an index file. If the SArray is already initialized, this will throw an exception.
Definition at line 253 of file sarray.hpp.
|
inline |
Opens the array for writing with an arbitrary temporary file. The array must not already have been initialized.
num_segments | The number of segments in the array |
Definition at line 280 of file sarray.hpp.
|
inline |
Opens the array for writing with a location on disk. The array must not already have been initialized.
sidx_file | If not specified, an arbitrary temporary file will be created. Otherwise, all files will be written to the same location as sidx_file. Must end in ".sidx" |
num_segments | The number of segments in the array |
Definition at line 307 of file sarray.hpp.
|
inline |
Move assignment. Moves other into this. Other will be cleared as if it were a newly constructed sarray object.
Definition at line 173 of file sarray.hpp.
|
inline |
SArray serializer. oarc must be associated with a directory. Saves into a prefix inside the directory.
Definition at line 514 of file sarray.hpp.
|
inline |
Saves a copy of the current sarray into a different location. Does not modify the current sarray.
Definition at line 693 of file sarray.hpp.
|
inline |
Return the length of segment i in the array.
Definition at line 440 of file sarray.hpp.
|
inline |
Adds metadata to the array. The array must first be opened for writing.
Definition at line 619 of file sarray.hpp.
|
inline virtual |
Sets the number of segments in the output. The array must first be opened for writing. If any writes have occurred prior to this, those writes will be lost. Returns true on success, false on failure.
Implements turi::swriter_base< swriter_impl::output_iterator< T > >.
Definition at line 552 of file sarray.hpp.
|
inline |
Sets the writer index_info for a given segment. This function can be called when the actual segment writing is done by other code.
Definition at line 681 of file sarray.hpp.
|
inline |
Sets the internal type of the flexible_type when written. All writes will cast to this type.
This function should only be used for sarray<flexible_type> and will fail fatally otherwise.
Definition at line 664 of file sarray.hpp.
|
inline |
Returns the number of elements in the SArray.
Definition at line 382 of file sarray.hpp.
|
inline |
Attempts to compact if the number of segments in the SArray exceeds SFRAME_COMPACTION_THRESHOLD.
Definition at line 532 of file sarray.hpp.