Turi Create  4.0
turi::sframe_reader Class Reference (abstract)

#include <core/storage/sframe_data/sframe_reader.hpp>

Public Types

typedef sframe_iterator iterator
 The iterator type which begin and end returns.
 
typedef sframe_iterator::value_type value_type
 The value type the sframe stores.
 

Public Member Functions

 sframe_reader ()=default
 
 sframe_reader (const sframe_reader &other)=delete
 Deleted Copy constructor.
 
sframe_reader & operator= (const sframe_reader &other)=delete
 Deleted Assignment operator.
 
void init (const sframe &array, size_t num_segments=(size_t)(-1))
 
void init (const sframe &array, const std::vector< size_t > &segment_lengths)
 
iterator begin (size_t segmentid) const
 Return the begin iterator of the segment.
 
iterator end (size_t segmentid) const
 Return the end iterator of the segment.
 
size_t read_rows (size_t row_start, size_t row_end, std::vector< std::vector< flexible_type > > &out_obj)
 
size_t read_rows (size_t row_start, size_t row_end, sframe_rows &out_obj)
 
void reset_iterators ()
 
size_t num_columns () const
 Returns the number of columns in the SFrame. Does not throw.
 
size_t num_rows () const
 Returns the length of each sarray.
 
size_t size () const
 Returns the length of each sarray.
 
std::string column_name (size_t i) const
 
flex_type_enum column_type (size_t i) const
 
size_t num_segments () const
 Returns the number of segments in the SFrame. Does not throw.
 
size_t segment_length (size_t segment) const
 
bool contains_column (const std::string &column_name) const
 
size_t column_index (const std::string &column_name) const
 
virtual size_t read_rows (size_t row_start, size_t row_end, std::vector< typename sframe_iterator ::value_type > &out_obj)=0
 

Detailed Description

The sframe_reader provides a reading interface to an sframe: an immutable on-disk set of columns, each with its own type. These types are represented as a flexible_type.

The SFrame is represented as an ordered set of SArrays, each with an enforceable name and type. Each SArray in an SFrame must have the same number of segments as all other SArrays in the SFrame, and each segment must have the same number of elements as the corresponding segment of every other SArray. A segment of an SFrame is a disjoint subset of rows with an entry from each column. Segments can be read in parallel.

To read from an sframe use sframe::get_reader():

auto reader = frame.get_reader();

reader will be of type sframe_reader

reader can then provide input iterators from segments via the begin() and end() functions.
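The per-segment iteration pattern can be sketched without the library. The block below is a minimal, self-contained stand-in: toy_segmented_reader and per_segment_sums are hypothetical names, not part of Turi Create, and the toy stores plain long values rather than rows of flexible_type. It only mirrors the begin(segmentid)/end(segmentid) shape described above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a segmented reader; NOT the Turi Create class.
struct toy_segmented_reader {
  using iterator = std::vector<long>::const_iterator;

  std::vector<long> rows;            // all rows, stored contiguously
  std::vector<std::size_t> offsets;  // offsets[i] is the first row of segment i;
                                     // the final entry equals rows.size()

  std::size_t num_segments() const {
    assert(!offsets.empty());
    return offsets.size() - 1;
  }
  iterator begin(std::size_t segmentid) const {
    assert(segmentid < num_segments());
    return rows.begin() + static_cast<std::ptrdiff_t>(offsets[segmentid]);
  }
  iterator end(std::size_t segmentid) const {
    assert(segmentid < num_segments());
    return rows.begin() + static_cast<std::ptrdiff_t>(offsets[segmentid + 1]);
  }
};

// Walks each segment with its own begin/end pair and sums its rows.
// Every pass of the outer loop touches a disjoint range of rows, so the
// outer loop body is the unit that could be handed to a separate worker.
inline std::vector<long> per_segment_sums(const toy_segmented_reader& reader) {
  std::vector<long> sums(reader.num_segments(), 0);
  for (std::size_t seg = 0; seg < reader.num_segments(); ++seg) {
    for (auto it = reader.begin(seg); it != reader.end(seg); ++it) {
      sums[seg] += *it;
    }
  }
  return sums;
}
```

With a real sframe_reader the loop would have the same shape, except that each dereferenced iterator yields a row of flexible_type values.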

Definition at line 204 of file sframe_reader.hpp.

Constructor & Destructor Documentation

◆ sframe_reader()

turi::sframe_reader::sframe_reader ( )
default

Constructs an empty sframe_reader.

Member Function Documentation

◆ column_index()

size_t turi::sframe_reader::column_index ( const std::string &  column_name) const
inline

Returns the column index of column_name.

Throws an exception if the column does not exist.

Definition at line 364 of file sframe_reader.hpp.

◆ column_name()

std::string turi::sframe_reader::column_name ( size_t  i) const
inline

Returns the name of the given column. Throws an exception if the column id is out of range.

Definition at line 318 of file sframe_reader.hpp.

◆ column_type()

flex_type_enum turi::sframe_reader::column_type ( size_t  i) const
inline

Returns the type of the given column. Throws an exception if the column id is out of range.

Definition at line 328 of file sframe_reader.hpp.

◆ contains_column()

bool turi::sframe_reader::contains_column ( const std::string &  column_name) const
inline

Returns true if the sframe contains the given column.

Definition at line 353 of file sframe_reader.hpp.
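
Because column_index() throws for an unknown name while contains_column() does not, a caller can test for the column first. The sketch below illustrates that guard on a hypothetical toy_schema type; toy_schema and column_index_or are illustrative names, and the exception type thrown by the real class is not stated here, so std::out_of_range is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in holding only column names; NOT the Turi Create class.
struct toy_schema {
  std::vector<std::string> names;

  bool contains_column(const std::string& column_name) const {
    return std::find(names.begin(), names.end(), column_name) != names.end();
  }

  // Throws if the column does not exist (exception type is an assumption).
  std::size_t column_index(const std::string& column_name) const {
    auto it = std::find(names.begin(), names.end(), column_name);
    if (it == names.end()) {
      throw std::out_of_range("no such column: " + column_name);
    }
    return static_cast<std::size_t>(it - names.begin());
  }
};

// Returns the column index, or `fallback` when the column is absent,
// so the caller never has to handle the exception.
inline std::size_t column_index_or(const toy_schema& schema,
                                   const std::string& column_name,
                                   std::size_t fallback) {
  return schema.contains_column(column_name) ? schema.column_index(column_name)
                                             : fallback;
}
```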

◆ init() [1/2]

void turi::sframe_reader::init ( const sframe &  array,
size_t  num_segments = (size_t)(-1) 
)

Attempts to construct an sframe_iterator which reads from an existing sframe. If the index file cannot be opened, an exception is thrown.

Parameters
array         The array to read
num_segments  If num_segments == (size_t)(-1), the segmentation of the first column is used. Otherwise, the array is cut into num_segments logical segments which distribute the rows uniformly.
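
As an illustration of what "distribute the rows uniformly" can mean, the sketch below cuts num_rows rows into num_segments logical segments whose lengths differ by at most one. The boundary formula is an assumption chosen for the example, not necessarily the rule the library applies, and uniform_segment_lengths is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed scheme: segment i covers rows [num_rows*i/k, num_rows*(i+1)/k),
// where k = num_segments. The lengths always sum to num_rows.
// Note: num_rows * (i + 1) can overflow for extremely large inputs.
inline std::vector<std::size_t> uniform_segment_lengths(std::size_t num_rows,
                                                        std::size_t num_segments) {
  assert(num_segments > 0);
  std::vector<std::size_t> lengths(num_segments);
  for (std::size_t i = 0; i < num_segments; ++i) {
    const std::size_t first = num_rows * i / num_segments;
    const std::size_t last = num_rows * (i + 1) / num_segments;
    lengths[i] = last - first;
  }
  return lengths;
}
```

Under this scheme 10 rows in 3 segments give lengths 3, 3 and 4. When there are fewer rows than segments, some segments are empty.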

◆ init() [2/2]

void turi::sframe_reader::init ( const sframe &  array,
const std::vector< size_t > &  segment_lengths 
)

Attempts to construct an sframe_iterator which reads from an existing sframe and uses a segmentation defined by an argument. If the index file cannot be opened, an exception is thrown. If the lengths of the segments do not add up to the length of the sframe, an exception is thrown.

Parameters
array            The frame to read
segment_lengths  An array describing the lengths of each segment. This must sum up to the length of the array.
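
The sum requirement on segment_lengths can be checked by the caller before calling init(). The following is a minimal sketch of such a check. check_segment_lengths is a hypothetical helper, and std::invalid_argument is an assumed exception type, since the type the library throws is not stated here.

```cpp
#include <cstddef>
#include <numeric>
#include <stdexcept>
#include <string>
#include <vector>

// Throws unless the segment lengths add up to exactly num_rows.
inline void check_segment_lengths(const std::vector<std::size_t>& segment_lengths,
                                  std::size_t num_rows) {
  const std::size_t total = std::accumulate(
      segment_lengths.begin(), segment_lengths.end(), std::size_t{0});
  if (total != num_rows) {
    throw std::invalid_argument(
        "segment lengths sum to " + std::to_string(total) +
        " but the frame has " + std::to_string(num_rows) + " rows");
  }
}
```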

◆ read_rows() [1/3]

virtual size_t turi::siterable< sframe_iterator >::read_rows ( size_t  row_start,
size_t  row_end,
std::vector< typename sframe_iterator ::value_type > &  out_obj 
)
pure virtual, inherited

Reads a collection of rows, storing the result in out_obj. This function is independent of the begin/end iterator functions, and can be called anytime. This function is also fully concurrent.

Parameters
row_start  First row to read
row_end    One past the last row to read (i.e. EXCLUSIVE). row_end can be beyond the end of the array, in which case fewer rows will be read.
out_obj    The output array
Returns
Actual number of rows read. Returns (size_t)(-1) on failure.
Note
This function is not always efficient. Different file format implementations will have different performance characteristics.

◆ read_rows() [2/3]

size_t turi::sframe_reader::read_rows ( size_t  row_start,
size_t  row_end,
std::vector< std::vector< flexible_type > > &  out_obj 
)

Reads a collection of rows, storing the result in out_obj. This function is independent of the begin/end iterator functions, and can be called anytime. This function is also fully concurrent.

Parameters
row_start  First row to read
row_end    One past the last row to read (i.e. EXCLUSIVE). row_end can be beyond the end of the array, in which case fewer rows will be read.
out_obj    The output array
Returns
Actual number of rows read. Returns (size_t)(-1) on failure.
Note
This function is not always efficient. Different file format implementations will have different performance characteristics.
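
The range semantics (exclusive row_end, clamping at the end of the data, and the (size_t)(-1) failure value) can be illustrated with an in-memory stand-in. Everything in the sketch below is hypothetical: toy_row, toy_read_rows, kReadFailed and count_rows_in_chunks are illustrative names, rows hold int instead of flexible_type, and the toy itself never fails. The failure check only shows how a caller would test the return value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using toy_row = std::vector<int>;  // stand-in for a row of flexible_type values

constexpr std::size_t kReadFailed = static_cast<std::size_t>(-1);

// Copies rows [row_start, row_end) into out_obj and returns how many were
// read. A row_end past the end of the table is clamped, so fewer rows come back.
inline std::size_t toy_read_rows(const std::vector<toy_row>& table,
                                 std::size_t row_start, std::size_t row_end,
                                 std::vector<toy_row>& out_obj) {
  out_obj.clear();
  row_end = std::min(row_end, table.size());
  if (row_start >= row_end) return 0;  // nothing in range
  out_obj.assign(table.begin() + static_cast<std::ptrdiff_t>(row_start),
                 table.begin() + static_cast<std::ptrdiff_t>(row_end));
  return out_obj.size();
}

// Reads the table in fixed-size chunks and stops at the first short read.
inline std::size_t count_rows_in_chunks(const std::vector<toy_row>& table,
                                        std::size_t chunk) {
  assert(chunk > 0);
  std::vector<toy_row> buffer;
  std::size_t total = 0;
  for (std::size_t start = 0;; start += chunk) {
    const std::size_t got = toy_read_rows(table, start, start + chunk, buffer);
    if (got == kReadFailed) return kReadFailed;  // propagate failure
    total += got;
    if (got < chunk) break;  // short read: reached the end
  }
  return total;
}
```

Comparing the returned count with the requested count is what lets the loop detect the end of the data without calling num_rows() first.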

◆ read_rows() [3/3]

size_t turi::sframe_reader::read_rows ( size_t  row_start,
size_t  row_end,
sframe_rows &  out_obj 
)

Reads a collection of rows, storing the result in out_obj. This function is independent of the begin/end iterator functions, and can be called anytime. This function is also fully concurrent.

Parameters
row_start  First row to read
row_end    One past the last row to read (i.e. EXCLUSIVE). row_end can be beyond the end of the array, in which case fewer rows will be read.
out_obj    The output array
Returns
Actual number of rows read. Returns (size_t)(-1) on failure.
Note
This function is not always efficient. Different file format implementations will have different performance characteristics.

◆ reset_iterators()

void turi::sframe_reader::reset_iterators ( )
virtual

Resets all the file handles. All existing iterators are invalidated.

Implements turi::siterable< sframe_iterator >.

◆ segment_length()

size_t turi::sframe_reader::segment_length ( size_t  segment) const
inline, virtual

Returns the length of the given segment. Throws an exception if the segment id is out of range.

Implements turi::siterable< sframe_iterator >.

Definition at line 343 of file sframe_reader.hpp.

