Turi Create
4.0
#include <core/storage/sframe_data/sarray_reader.hpp>
Public Types | |
typedef sarray_iterator< T > | iterator |
The iterator type which begin and end returns. | |
typedef iterator::value_type | value_type |
The value type the sarray stores. | |
Public Member Functions | |
sarray_reader ()=default | |
sarray_reader (const sarray_reader &other)=delete | |
Deleted Copy constructor. | |
sarray_reader & | operator= (const sarray_reader &other)=delete |
Deleted assignment operator. | |
void | init (const sarray< T > &array, size_t num_segments=(size_t)(-1)) |
void | init (const sarray< T > &array, const std::vector< size_t > &segment_lengths) |
size_t | num_segments () const |
size_t | segment_length (size_t segment) const |
std::string | get_index_file () const |
std::vector< std::string > | get_file_names () const |
bool | get_metadata (std::string key, std::string &val) const |
std::pair< bool, std::string > | get_metadata (std::string key) const |
size_t | size () const |
iterator | begin (size_t segmentid) const |
iterator | end (size_t segmentid) const |
size_t | read_rows (size_t row_start, size_t row_end, std::vector< T > &out_obj) |
size_t | read_rows (size_t row_start, size_t row_end, sframe_rows &out_obj) |
void | reset_iterators () |
flex_type_enum | get_type () const |
virtual size_t | read_rows (size_t row_start, size_t row_end, std::vector< typename sarray_iterator< T > ::value_type > &out_obj)=0 |
The SArray reader provides a reading interface to an immutable, on-disk sequence of objects of type T.
The SArray is an immutable sequence of objects of type T, and is internally represented as a collection of files. The sequence is cut up into a collection of segments (not necessarily of equal length), where each segment covers a disjoint subset of the sequence. Each segment can then be read in parallel.
To read from an sarray<T>, obtain a reader with sarray::get_reader(). The reader will be of type sarray_reader<T>.
The reader can then provide input iterators over each segment via the begin() and end() functions.
Definition at line 33 of file gl_sarray.hpp.
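The segment-wise reading pattern described above can be sketched with a small in-memory stand-in. Everything named toy_segmented_reader and sum_all_rows below is hypothetical and is not part of Turi Create; the stand-in only mirrors the num_segments(), segment_length(), size(), begin() and end() calls listed in this reference, and it assumes the rows fit in a std::vector rather than living on disk.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for sarray_reader<T>. It is not Turi code;
// it only mirrors the segment-oriented calls named in this reference.
template <typename T>
class toy_segmented_reader {
 public:
  using iterator = typename std::vector<T>::const_iterator;

  // segment_lengths must sum to rows.size().
  toy_segmented_reader(std::vector<T> rows,
                       const std::vector<std::size_t>& segment_lengths)
      : rows_(std::move(rows)) {
    offsets_.push_back(0);
    for (std::size_t len : segment_lengths) {
      offsets_.push_back(offsets_.back() + len);
    }
    assert(offsets_.back() == rows_.size());
  }

  std::size_t num_segments() const { return offsets_.size() - 1; }
  std::size_t segment_length(std::size_t segment) const {
    return offsets_[segment + 1] - offsets_[segment];
  }
  std::size_t size() const { return rows_.size(); }

  // Iterators over one segment; each segment covers a disjoint row range.
  iterator begin(std::size_t segmentid) const {
    return rows_.begin() + static_cast<std::ptrdiff_t>(offsets_[segmentid]);
  }
  iterator end(std::size_t segmentid) const {
    return rows_.begin() +
           static_cast<std::ptrdiff_t>(offsets_[segmentid + 1]);
  }

 private:
  std::vector<T> rows_;
  std::vector<std::size_t> offsets_;  // offsets_[s] is the first row of segment s
};

// Visits every row by walking the segments one after another.
inline long sum_all_rows(const toy_segmented_reader<int>& reader) {
  long total = 0;
  for (std::size_t s = 0; s < reader.num_segments(); ++s) {
    for (auto it = reader.begin(s); it != reader.end(s); ++it) {
      total += *it;
    }
  }
  return total;
}
```

Because the segments are disjoint, the outer loop over s is the part that could be handed to separate threads, one segment per worker; the sketch keeps it sequential for brevity.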
sarray_reader() |
default |
Default constructor. Does nothing; call init() before using the reader.
begin() |
inline, virtual |
Return the begin iterator of the segment. The iterator (sarray_iterator) is of the input iterator type and has value_type T. See end() to get the end iterator of the segment.
The iterator is invalid once the originating sarray is destroyed. Accessing the iterator after the sarray is destroyed is undefined behavior.
Throws an exception if the sarray is invalid (there is an error reading files). segmentid must also be a valid segment ID; an exception is thrown otherwise.
Implements turi::siterable< sarray_iterator< T > >.
Definition at line 396 of file sarray_reader.hpp.
end() |
inline, virtual |
Return the end iterator of the segment. The iterator (sarray_iterator) is of the input iterator type and has value_type T. See begin() to get the begin iterator of the segment.
The iterator is invalid once the originating sarray is destroyed. Accessing the iterator after the sarray is destroyed is undefined behavior.
Throws an exception if the sarray is invalid (there is an error reading files). segmentid must also be a valid segment ID; an exception is thrown otherwise.
Implements turi::siterable< sarray_iterator< T > >.
Definition at line 429 of file sarray_reader.hpp.
get_file_names() |
inline |
Returns the collection of files storing the sarray. For instance: [file_prefix].sidx, [file_prefix].0001, etc.
Definition at line 327 of file sarray_reader.hpp.
get_index_file() |
inline |
Return the file prefix of the sarray (parameter on construction).
Definition at line 317 of file sarray_reader.hpp.
get_metadata() |
inline |
Reads the value of a key associated with the sarray. Returns true on success, false on failure.
Definition at line 336 of file sarray_reader.hpp.
get_metadata() |
inline |
Reads the value of a key associated with the sarray. Returns a pair of (true, value) on success, and (false, empty_string) on failure.
Definition at line 349 of file sarray_reader.hpp.
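The two get_metadata() calling conventions can be compared side by side in a minimal sketch. The toy_metadata alias and the toy_get_metadata functions are hypothetical and operate on a plain std::map; how sarray_reader actually stores its metadata is not stated in this reference.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical metadata store used only for this illustration.
using toy_metadata = std::map<std::string, std::string>;

// Out-parameter style: returns true and fills val on success,
// returns false and leaves val untouched on failure.
inline bool toy_get_metadata(const toy_metadata& meta,
                             const std::string& key, std::string& val) {
  auto it = meta.find(key);
  if (it == meta.end()) return false;
  val = it->second;
  return true;
}

// Pair style: (true, value) on success, (false, empty string) on failure.
inline std::pair<bool, std::string> toy_get_metadata(const toy_metadata& meta,
                                                     const std::string& key) {
  std::string val;
  bool found = toy_get_metadata(meta, key, val);
  return {found, val};
}
```

The pair form avoids declaring a variable up front, while the out-parameter form lets the caller keep a previous value when the key is missing.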
get_type() |
inline |
Returns the type of the SArray (as set by swriter<flexible_type>::set_type). If the type of the SArray was not set, this returns flex_type_enum::UNDEFINED, in which case each row can be of arbitrary type.
This function should only be used for sarray<flexible_type> and will fail fatally otherwise.
Definition at line 497 of file sarray_reader.hpp.
init() |
inline |
Attempts to construct an sarray_iterator which reads from an existing sarray. If the index file cannot be opened, an exception is thrown.
array | The array to read |
num_segments | If num_segments == (size_t)(-1), the original file segmentation is used. Otherwise, the array is cut into num_segments logical segments, with the rows distributed uniformly among them. |
Definition at line 227 of file sarray_reader.hpp.
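One way to distribute rows uniformly over a requested number of logical segments is sketched below. The function uniform_segment_lengths is hypothetical, and the boundary arithmetic is an assumption chosen for illustration; it is not necessarily the formula sarray_reader uses internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: splits num_rows into num_segments contiguous pieces
// whose lengths differ by at most one. Segment i covers rows
// [i * num_rows / num_segments, (i + 1) * num_rows / num_segments).
// Assumes (num_segments * num_rows) does not overflow std::size_t.
inline std::vector<std::size_t> uniform_segment_lengths(
    std::size_t num_rows, std::size_t num_segments) {
  std::vector<std::size_t> lengths;
  lengths.reserve(num_segments);
  for (std::size_t i = 0; i < num_segments; ++i) {
    std::size_t first = (i * num_rows) / num_segments;
    std::size_t last = ((i + 1) * num_rows) / num_segments;
    lengths.push_back(last - first);
  }
  return lengths;
}
```

With this scheme the lengths always sum to num_rows, and asking for more segments than rows simply yields some empty segments.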
init() |
inline |
Attempts to construct an sarray_iterator which reads from an existing sarray and uses a segmentation defined by an argument. If the index file cannot be opened, an exception is thrown. If the segment lengths do not sum to the length of the sarray, an exception is thrown.
array | The array to read |
segment_lengths | An array describing the lengths of each segment. This must sum up to the length of the array. |
Definition at line 271 of file sarray_reader.hpp.
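The precondition on segment_lengths can be checked before calling init(). The sketch below is a hypothetical pre-check, not Turi code; the helper names are invented, and std::invalid_argument is an assumed exception type, since this reference does not say which exception init() throws.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <vector>

// Hypothetical helper: true when the segment lengths sum to array_size.
inline bool segment_lengths_cover(
    std::size_t array_size, const std::vector<std::size_t>& segment_lengths) {
  std::size_t total = std::accumulate(segment_lengths.begin(),
                                      segment_lengths.end(), std::size_t{0});
  return total == array_size;
}

// Hypothetical helper: throws when the segmentation does not cover the array.
// The exception type is an assumption made for this sketch.
inline void require_segment_lengths_cover(
    std::size_t array_size, const std::vector<std::size_t>& segment_lengths) {
  if (!segment_lengths_cover(array_size, segment_lengths)) {
    throw std::invalid_argument(
        "segment lengths do not sum to the array length");
  }
}
```

Running such a check first lets a caller report a bad segmentation in its own terms instead of relying on the exception raised inside init().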
num_segments() |
inline, virtual |
Return the number of segments in the collection. Throws an exception if the sarray is invalid (there is an error reading files).
Implements turi::siterable< sarray_iterator< T > >.
Definition at line 299 of file sarray_reader.hpp.
read_rows() |
pure virtual, inherited |
Reads a collection of rows, storing the result in out_obj. This function is independent of the begin/end iterator functions, and can be called anytime. This function is also fully concurrent.
row_start | First row to read |
row_end | One past the last row to read (i.e. exclusive). row_end can be beyond the end of the array, in which case fewer rows will be read. |
out_obj | The output array |
read_rows() |
inline |
Reads a collection of rows, storing the result in out_obj. This function is independent of the begin/end iterator functions, and can be called anytime. This function is also fully concurrent.
row_start | First row to read |
row_end | One past the last row to read (i.e. exclusive). row_end can be beyond the end of the array, in which case fewer rows will be read. |
out_obj | The output array |
Definition at line 451 of file sarray_reader.hpp.
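The half-open row range and the clamping of row_end can be illustrated on an in-memory vector. The function toy_read_rows is hypothetical, and the sketch assumes the size_t return value is the number of rows actually read, which this reference implies but does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for read_rows(): copies rows [row_start, row_end)
// from data into out_obj. row_end is clamped to data.size(), so asking for
// rows past the end yields fewer rows rather than an error.
// Assumed return value: the number of rows placed in out_obj.
template <typename T>
std::size_t toy_read_rows(const std::vector<T>& data, std::size_t row_start,
                          std::size_t row_end, std::vector<T>& out_obj) {
  out_obj.clear();
  row_end = std::min(row_end, data.size());
  if (row_start >= row_end) return 0;
  out_obj.assign(data.begin() + static_cast<std::ptrdiff_t>(row_start),
                 data.begin() + static_cast<std::ptrdiff_t>(row_end));
  return out_obj.size();
}
```

A caller that reads fixed-size batches can therefore loop until the returned count is smaller than the batch size, without computing the final partial range by hand.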
reset_iterators() |
inline, virtual |
Resets all the file handles. All existing iterators are invalidated.
Implements turi::siterable< sarray_iterator< T > >.
Definition at line 481 of file sarray_reader.hpp.
segment_length() |
inline, virtual |
Return the number of rows in the segment. Throws an exception if the sarray is invalid (there is an error reading files).
Implements turi::siterable< sarray_iterator< T > >.
Definition at line 309 of file sarray_reader.hpp.
size() |
inline |
Returns the number of elements in the SArray.
Definition at line 363 of file sarray_reader.hpp.