Turi Create  4.0
turi::v2::ml_data_row_reference Class Reference

#include <toolkits/ml_data_2/iterators/row_reference.hpp>

Public Member Functions

template<typename Entry >
GL_HOT_INLINE void fill (std::vector< Entry > &x) const
 
void fill_untranslated_values (std::vector< flexible_type > &x) const GL_HOT_INLINE_FLATTEN
 
void fill (SparseVector &x) const GL_HOT_INLINE_FLATTEN
 
void fill (DenseVector &x) const GL_HOT_INLINE_FLATTEN
 
template<typename DenseRowXpr >
GL_HOT_INLINE_FLATTEN void fill_eigen_row (DenseRowXpr &&x) const
 
double target_value () const GL_HOT_INLINE_FLATTEN
 
size_t target_index () const GL_HOT_INLINE_FLATTEN
 
const std::shared_ptr< ml_metadata > & metadata () const
 

Detailed Description

A class containing a reference to the row of an ml_data instance. The row can then be used to fill any sort of data row that an iterator can be used to fill.

In other words,

it.fill_observation(x);

can be replaced with

auto row_ref = it.get_reference();

// do stuff ...

row_ref.fill(x);

The data block pointed to by this reference is kept alive as long as this reference class exists.
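The lifetime guarantee above can be illustrated with a minimal, self-contained sketch. It assumes the block is held through shared ownership; data_block, row_reference and make_detached_references below are hypothetical stand-ins written for this illustration, not the Turi Create types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a block of rows owned by a data container.
struct data_block {
  std::vector<std::vector<double>> rows;
};

// Hypothetical stand-in for a row reference. It shares ownership of the
// block, so the block stays alive for as long as any reference exists.
class row_reference {
 public:
  row_reference() = default;
  row_reference(std::shared_ptr<const data_block> block, std::size_t row)
      : block_(std::move(block)), row_(row) {}

  // Copy this row's values into x, replacing its previous contents.
  void fill(std::vector<double>& x) const {
    assert(block_ && row_ < block_->rows.size());
    x = block_->rows[row_];
  }

 private:
  std::shared_ptr<const data_block> block_;
  std::size_t row_ = 0;
};

// Build one reference per row, then let the original owner go out of scope.
inline std::vector<row_reference> make_detached_references() {
  auto block = std::make_shared<data_block>();
  block->rows = {{0.0, 0.0}, {1.0, 1.0}, {2.0, 2.0}};

  std::vector<row_reference> refs;
  for (std::size_t i = 0; i < block->rows.size(); ++i) {
    refs.emplace_back(block, i);
  }
  return refs;  // The local `block` is released here; refs keep the data alive.
}
```

Calling fill on any element of the returned vector still reads valid data, because each reference holds its own share of the block.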

Another example of how it is used is below:

sframe X = make_integer_testing_sframe( {"C1", "C2"}, { {0, 0}, {1, 1}, {2, 2}, {3, 3}, {4, 4} } );

v2::ml_data data;

data.fill(X);

// Get row references

std::vector<v2::ml_data_row_reference> rows(data.num_rows());

for(auto it = data.get_iterator(); !it.done(); ++it) {
  rows[it.row_index()] = it.get_reference();
}

// Now go through and make sure that each of these holds the
// correct answers.

std::vector<v2::ml_data_entry> x;

for(size_t i = 0; i < rows.size(); ++i) {

  // The metadata for the row is the same as that in the data.
  ASSERT_TRUE(rows[i].metadata().get() == data.metadata().get());

  rows[i].fill(x);

  ASSERT_EQ(x.size(), 2);

  ASSERT_EQ(x[0].column_index, 0);
  ASSERT_EQ(x[0].index, 0);
  ASSERT_EQ(x[0].value, i);

  ASSERT_EQ(x[1].column_index, 1);
  ASSERT_EQ(x[1].index, 0);
  ASSERT_EQ(x[1].value, i);
}

Definition at line 87 of file row_reference.hpp.

Member Function Documentation

◆ fill() [1/3]

template<typename Entry >
GL_HOT_INLINE void turi::v2::ml_data_row_reference::fill ( std::vector< Entry > &  x) const
inline

Fill an observation vector, represented as a vector of ml_data_entry structs, i.e. (column_index, index, value) triples, from this row reference. For each column:

Categorical: Returns (col_id, v, 1)
Numeric    : Returns (col_id, 0, v)
Vector     : Returns (col_id, i, v) for each (i,v) in vector.

Example use is given by the following code:

std::vector<ml_data_entry> x;

row_ref.fill(x);

double y = row_ref.target_value();

...
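The per-column rules for this overload can be sketched without the library. The code below is an illustration under stated assumptions: entry, column_kind, cell, the three *_cell helpers and encode_row are hypothetical names, and entry only mirrors the (column_index, index, value) layout described above rather than the real ml_data_entry.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in mirroring the (column_index, index, value) layout.
struct entry {
  std::size_t column_index;
  std::size_t index;
  double value;
};

enum class column_kind { categorical, numeric, vector };

// Hypothetical description of one cell of a row.
struct cell {
  column_kind kind;
  std::size_t category;        // used when kind == categorical
  double number;               // used when kind == numeric
  std::vector<double> values;  // used when kind == vector
};

inline cell categorical_cell(std::size_t c) {
  return cell{column_kind::categorical, c, 0.0, {}};
}
inline cell numeric_cell(double v) {
  return cell{column_kind::numeric, 0, v, {}};
}
inline cell vector_cell(std::vector<double> v) {
  return cell{column_kind::vector, 0, 0.0, std::move(v)};
}

// Apply the documented rules column by column.
inline std::vector<entry> encode_row(const std::vector<cell>& row) {
  std::vector<entry> out;
  for (std::size_t col = 0; col < row.size(); ++col) {
    const cell& c = row[col];
    switch (c.kind) {
      case column_kind::categorical:
        assert(c.values.empty());  // scalar cells carry no vector payload
        out.push_back(entry{col, c.category, 1.0});  // (col_id, v, 1)
        break;
      case column_kind::numeric:
        assert(c.values.empty());
        out.push_back(entry{col, 0, c.number});  // (col_id, 0, v)
        break;
      case column_kind::vector:
        for (std::size_t i = 0; i < c.values.size(); ++i) {
          out.push_back(entry{col, i, c.values[i]});  // (col_id, i, v)
        }
        break;
    }
  }
  return out;
}
```

A row made of a numeric cell, a categorical cell and a two-element vector cell yields four entries: one for each scalar column and one per vector element.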

Definition at line 109 of file row_reference.hpp.

◆ fill() [2/3]

void turi::v2::ml_data_row_reference::fill ( SparseVector &  x) const
inline

Fill an observation vector, represented as an Eigen sparse vector, from this row reference.

Note
A reference category is used in this version of the function.
For performance reasons, this function does not check for new categories during predict time. That must be checked externally.

This function returns a flattened version of the vector provided by the std::vector< Entry > version of fill.

Example

Warning
This only works when the SFrame is "mapped" to integer keys.

For a dataset with a 3 column SFrame

Row 1: 1.0 0(categorical) <9.1, 2.4>
Row 2: 2.0 1(categorical) <1.0, 4.5>

with index = {1,2,2}

the SparseVector format would return

Row 1: < (0, 1.0), (1, 1), (3, 9.1), (4, 2.4) >
Row 2: < (0, 2.0), (2, 1), (3, 1.0), (4, 4.5) >
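One way to read the example is that each column occupies a contiguous range of positions whose width is its index size, so the flattened position is the column's starting offset plus the per-column index. The sketch below reproduces the sparse example as printed, under that assumption; entry, column_offsets and flatten are hypothetical names, and the sketch does not model any special handling of the reference category.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in mirroring the (column_index, index, value) layout.
struct entry {
  std::size_t column_index;
  std::size_t index;
  double value;
};

// offsets[c] is the sum of the index sizes of all columns before c.
inline std::vector<std::size_t> column_offsets(
    const std::vector<std::size_t>& index_sizes) {
  std::vector<std::size_t> offsets(index_sizes.size());
  std::size_t running = 0;
  for (std::size_t c = 0; c < index_sizes.size(); ++c) {
    offsets[c] = running;
    running += index_sizes[c];
  }
  return offsets;
}

// Map each entry to a (flattened position, value) pair.
inline std::vector<std::pair<std::size_t, double>> flatten(
    const std::vector<entry>& row,
    const std::vector<std::size_t>& index_sizes) {
  const std::vector<std::size_t> offsets = column_offsets(index_sizes);
  std::vector<std::pair<std::size_t, double>> out;
  for (const entry& e : row) {
    assert(e.column_index < index_sizes.size());
    assert(e.index < index_sizes[e.column_index]);
    out.emplace_back(offsets[e.column_index] + e.index, e.value);
  }
  return out;
}
```

With index sizes {1, 2, 2} the offsets are {0, 1, 3}, so the categorical value 1 in column 1 lands at position 2 and the two vector elements land at positions 3 and 4, as in Row 2 above.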

Note
The '0'th category is used as reference.
Parameters
[in,out]  x  Data containing everything!

Definition at line 176 of file row_reference.hpp.

◆ fill() [3/3]

void turi::v2::ml_data_row_reference::fill ( DenseVector &  x) const
inline

Fill an observation vector, represented as an Eigen dense vector, from this row reference.

Note
The 0th category is used as a reference category.
For performance reasons, this function does not check for new categories during predict time. That must be checked externally.

This function returns a flattened version of the vector provided by the std::vector< Entry > version of fill.

Example

Warning
This only works when the SFrame is "mapped" to integer keys.

For a dataset with a 3 column SFrame

Row 1: 1.0 0(categorical) <9.1, 2.4>
Row 2: 2.0 1(categorical) <1.0, 4.5>

with index = {1,2,2}

the DenseVector format would return

Row 1: <1.0, 0, 1, 9.1, 2.4>
Row 2: <2.0, 1, 0, 1.0, 4.5>

Parameters
[in,out]  x  Data containing everything!

Definition at line 222 of file row_reference.hpp.

◆ fill_eigen_row()

template<typename DenseRowXpr >
GL_HOT_INLINE_FLATTEN void turi::v2::ml_data_row_reference::fill_eigen_row ( DenseRowXpr &&  x) const
inline

Fill a row of an Eigen dense matrix from this row reference.

Note
The 0th category is used as a reference category.

Example:

Eigen::MatrixXd X;

...

row_ref.fill_eigen_row(X.row(row_idx));


Parameters
[in,out]  x  An Eigen row expression.

Definition at line 258 of file row_reference.hpp.

◆ fill_untranslated_values()

void turi::v2::ml_data_row_reference::fill_untranslated_values ( std::vector< flexible_type > &  x) const
inline

Fill an observation vector with the untranslated columns, if any have been specified at setup time. These columns are simply mapped back to their sarray counterparts.

Definition at line 126 of file row_reference.hpp.

◆ metadata()

const std::shared_ptr<ml_metadata>& turi::v2::ml_data_row_reference::metadata ( ) const
inline

Returns a pointer to the metadata class that describes the data that this row reference refers to.

Definition at line 287 of file row_reference.hpp.

◆ target_index()

size_t turi::v2::ml_data_row_reference::target_index ( ) const
inline

Returns the current categorical target index, if present, or 0 if not present.

Definition at line 280 of file row_reference.hpp.

◆ target_value()

double turi::v2::ml_data_row_reference::target_value ( ) const
inline

Returns the current target value, if present, or 1 if not present. If the target column is categorical, use target_index() instead.

Definition at line 273 of file row_reference.hpp.


The documentation for this class was generated from the following file: