Turi Create
4.0
turi::v2::ml_data_internal::column_indexer Class Reference
#include <toolkits/ml_data_2/indexing/column_indexer.hpp>
Public Member Functions | |
column_indexer () | |
virtual void | initialize ()=0 |
virtual size_t | map_value_to_index (size_t thread_idx, const flexible_type &feature)=0 |
virtual size_t | immutable_map_value_to_index (const flexible_type &feature) const =0 |
virtual void | insert_values_into_index (const std::vector< flexible_type > &features) |
virtual void | finalize ()=0 |
virtual flexible_type | map_index_to_value (size_t idx) const |
virtual std::set< flex_type_enum > | extract_key_types () const |
virtual size_t | indexed_column_size () const =0 |
virtual size_t | get_version () const =0 |
virtual void | save_impl (turi::oarchive &oarc) const =0 |
virtual void | load_version (turi::iarchive &iarc, size_t version)=0 |
virtual std::function< flexible_type(const flexible_type &)> | deindexing_lambda () const =0 |
virtual std::function< flexible_type(const flexible_type &)> | indexing_lambda () const =0 |
virtual std::shared_ptr< column_indexer > | create_cleared_copy () const =0 |
virtual void | set_values (std::vector< flexible_type > &&values)=0 |
Static Public Member Functions | |
static std::shared_ptr< column_indexer > | factory_create (const std::map< std::string, variant_type > &creation_options) |
Public Attributes | |
std::string | column_name |
ml_column_mode | mode |
flex_type_enum | original_column_type |
std::map< std::string, flexible_type > | options |
This class holds the metadata concerning the indexing of a single column of an SFrame. A collection of these per-column metadata objects makes up all the metadata required by the ml_data container.
Definition at line 27 of file column_indexer.hpp.
turi::v2::ml_data_internal::column_indexer::column_indexer () |
inline |
Default constructor.
Definition at line 33 of file column_indexer.hpp.
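virtual std::shared_ptr<column_indexer> turi::v2::ml_data_internal::column_indexer::create_cleared_copy () const |
pure virtual |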
Create a copy with the index cleared.
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
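virtual std::function<flexible_type(const flexible_type &)> turi::v2::ml_data_internal::column_indexer::deindexing_lambda () const |
pure virtual |
Returns a lambda function that can be used to deindex a column.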
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
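virtual std::set<flex_type_enum> turi::v2::ml_data_internal::column_indexer::extract_key_types () const |
inline virtual |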
Calculates the type of the values held in the index. This may be different from original_column_type – if the original_column_type is a DICT or LIST, this will return the actual type of the values. If the values are inconsistent, then an error is raised.
This method is useful when metadata built from a dictionary column is also used to map simple categorical variables.
Reimplemented in turi::v2::ml_data_internal::column_unique_indexer.
Definition at line 108 of file column_indexer.hpp.
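static std::shared_ptr<column_indexer> turi::v2::ml_data_internal::column_indexer::factory_create (const std::map<std::string, variant_type> &creation_options) |
static |
The factory method for loading and instantiating the proper class.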
virtual void turi::v2::ml_data_internal::column_indexer::finalize () |
pure virtual |
Call this when all calls to map_value_to_index are completed.
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
virtual size_t turi::v2::ml_data_internal::column_indexer::get_version () const |
pure virtual |
Returns the current version used for the serialization.
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
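virtual size_t turi::v2::ml_data_internal::column_indexer::immutable_map_value_to_index (const flexible_type &feature) const |
pure virtual |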
Returns the index associated with the "feature" value.
If the value in the feature column was already seen, then the index already associated with that value is returned. If not, size_t(-1) is returned.
[in] | feature | The value in the feature column to map to the index. |
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
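The difference between the read-only lookup and the mutating lookup can be illustrated with a minimal, self-contained sketch. It does not use the Turi Create headers: toy_indexer and kNotIndexed are hypothetical names, and std::string stands in for flexible_type. Only the size_t(-1) return value for unseen values is taken from the description above.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Sentinel for "value not in the index", matching the size_t(-1) described above.
constexpr std::size_t kNotIndexed = static_cast<std::size_t>(-1);

// Hypothetical stand-in for a column indexer; std::string replaces flexible_type.
class toy_indexer {
 public:
  // Mutating lookup: an unseen value is assigned the next free index.
  std::size_t map_value_to_index(const std::string& feature) {
    // emplace leaves an existing entry untouched and returns it.
    return index_.emplace(feature, index_.size()).first->second;
  }

  // Read-only lookup: never grows the index.
  std::size_t immutable_map_value_to_index(const std::string& feature) const {
    auto it = index_.find(feature);
    return it == index_.end() ? kNotIndexed : it->second;
  }

 private:
  std::unordered_map<std::string, std::size_t> index_;
};
```

Because the read-only variant is const and never inserts, it is the natural choice once the index has been built and must stay fixed.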
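virtual size_t turi::v2::ml_data_internal::column_indexer::indexed_column_size () const |
pure virtual |
Returns the size of the column, e.g. the number of distinct categories or the size of the hash space. Only called if the column is indeed indexed, i.e. if mode_is_indexed(mode) is true.
For a categorical column, this is the number of unique categories.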
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
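virtual std::function<flexible_type(const flexible_type &)> turi::v2::ml_data_internal::column_indexer::indexing_lambda () const |
pure virtual |
Returns a lambda function that can be used to index a column.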
Does not add any new index values.
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
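As a rough illustration of the lambda-returning members, the sketch below builds a pair of std::function objects over a frozen index. It is not the library's implementation: frozen_index and make_frozen_index are hypothetical, std::string stands in for flexible_type, and returning size_t(-1) for an unseen value is an assumption borrowed from immutable_map_value_to_index.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical frozen index: values[i] is the value that has index i.
struct frozen_index {
  std::vector<std::string> values;
  std::unordered_map<std::string, std::size_t> lookup;
};

std::shared_ptr<const frozen_index> make_frozen_index(std::vector<std::string> values) {
  auto idx = std::make_shared<frozen_index>();
  for (std::size_t i = 0; i < values.size(); ++i) idx->lookup.emplace(values[i], i);
  idx->values = std::move(values);
  return idx;
}

// Value -> index. Read-only: an unseen value yields size_t(-1) (an assumption).
std::function<std::size_t(const std::string&)> indexing_lambda(
    std::shared_ptr<const frozen_index> idx) {
  return [idx](const std::string& v) {
    auto it = idx->lookup.find(v);
    return it == idx->lookup.end() ? static_cast<std::size_t>(-1) : it->second;
  };
}

// Index -> value. Throws std::out_of_range for an index that was never assigned.
std::function<std::string(std::size_t)> deindexing_lambda(
    std::shared_ptr<const frozen_index> idx) {
  return [idx](std::size_t i) { return idx->values.at(i); };
}
```

Each lambda captures the shared pointer by value, so it stays valid even after the caller drops its own reference to the index.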
virtual void turi::v2::ml_data_internal::column_indexer::initialize () |
pure virtual |
Initialize the index mapping and setup. Certain internal structures used for parallel indexing must be set up before map_value_to_index works. Call this before looping over map_value_to_index, then call finalize() when done.
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
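virtual void turi::v2::ml_data_internal::column_indexer::insert_values_into_index (const std::vector<flexible_type> &features) |
inline virtual |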
Some of the ml_data tests currently depend on the order of insertion into the index, which is now done in parallel and thus not deterministic. This function allows the user to remove that randomness by inserting all indices in a specified order.
NOTE: This function is not thread safe; only call it from one thread.
Reimplemented in turi::v2::ml_data_internal::column_unique_indexer.
Definition at line 81 of file column_indexer.hpp.
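virtual void turi::v2::ml_data_internal::column_indexer::load_version (turi::iarchive &iarc, size_t version) |
pure virtual |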
Load the object.
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
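virtual flexible_type turi::v2::ml_data_internal::column_indexer::map_index_to_value (size_t idx) const |
inline virtual |
Returns the feature "value" associated with an index.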
Reimplemented in turi::v2::ml_data_internal::column_unique_indexer.
Definition at line 94 of file column_indexer.hpp.
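virtual size_t turi::v2::ml_data_internal::column_indexer::map_value_to_index (size_t thread_idx, const flexible_type &feature) |
pure virtual |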
Returns the index associated with the "feature" value.
If the value in the feature column was already seen, then the index already associated with that value is returned. If not, a new unique index is added and associated with this feature value.
This method is fully thread-safe and is meant to be called concurrently by multiple threads.
[in] | feature | The value in the feature column to map to the index. |
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
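One simple way to obtain the get-or-insert behaviour described above under concurrent callers is to guard the map with a mutex. The sketch below is only that: an illustration with hypothetical names (locked_toy_indexer, index_in_parallel) and std::string in place of flexible_type. The real implementation may use a different and faster scheme, and the thread_idx parameter is kept only to mirror the documented signature.

```cpp
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical thread-safe indexer: one mutex protects the whole map.
class locked_toy_indexer {
 public:
  // Get-or-insert under a lock, so all threads agree on one index per value.
  std::size_t map_value_to_index(std::size_t /*thread_idx*/, const std::string& feature) {
    std::lock_guard<std::mutex> guard(lock_);
    return index_.emplace(feature, index_.size()).first->second;
  }

  std::size_t indexed_column_size() const {
    std::lock_guard<std::mutex> guard(lock_);
    return index_.size();
  }

 private:
  mutable std::mutex lock_;
  std::unordered_map<std::string, std::size_t> index_;
};

// Index the same column from several threads at once.
// Returns the number of distinct values seen.
std::size_t index_in_parallel(locked_toy_indexer& indexer,
                              const std::vector<std::string>& column,
                              std::size_t n_threads) {
  std::vector<std::thread> workers;
  for (std::size_t t = 0; t < n_threads; ++t) {
    workers.emplace_back([&indexer, &column, t] {
      for (const std::string& v : column) indexer.map_value_to_index(t, v);
    });
  }
  for (std::thread& w : workers) w.join();
  return indexer.indexed_column_size();
}
```

Which value receives which index depends on thread scheduling, so only the number of distinct indices is predictable, not their assignment.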
virtual void turi::v2::ml_data_internal::column_indexer::save_impl (turi::oarchive &oarc) const |
pure virtual |
Serialize the object (save).
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
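The interplay of get_version, save_impl, and load_version can be sketched with standard streams. This is a hypothetical illustration, not the Turi Create archive format: std::ostream and std::istream replace turi::oarchive and turi::iarchive, the free functions save and load are invented for the example, and writing the version ahead of the payload is an assumption about how a caller would frame the data. Values are assumed to contain no newlines.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical indexer whose state is just the list of indexed values.
class serializable_toy_indexer {
 public:
  std::vector<std::string> values;

  std::size_t get_version() const { return 1; }

  // Payload only: a count line, then one value per line.
  void save_impl(std::ostream& out) const {
    out << values.size() << '\n';
    for (const std::string& v : values) out << v << '\n';
  }

  // Reads a payload that was written with the given format version.
  void load_version(std::istream& in, std::size_t version) {
    if (version != 1) throw std::runtime_error("unsupported indexer version");
    std::size_t n = 0;
    in >> n;
    in.ignore();  // skip the newline after the count
    values.assign(n, std::string());
    for (std::string& v : values) std::getline(in, v);
  }
};

// Caller-side framing (assumed): version first, then the payload.
void save(const serializable_toy_indexer& idx, std::ostream& out) {
  out << idx.get_version() << '\n';
  idx.save_impl(out);
}

void load(serializable_toy_indexer& idx, std::istream& in) {
  std::size_t version = 0;
  in >> version;
  in.ignore();  // skip the newline after the version
  idx.load_version(in, version);
}
```

Passing the stored version into load_version lets a newer build keep reading files written by an older one.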
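virtual void turi::v2::ml_data_internal::column_indexer::set_values (std::vector<flexible_type> &&values) |
pure virtual |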
Set data directly.
Implemented in turi::v2::ml_data_internal::column_unique_indexer.
std::string turi::v2::ml_data_internal::column_indexer::column_name |
The name of the column.
Definition at line 174 of file column_indexer.hpp.
ml_column_mode turi::v2::ml_data_internal::column_indexer::mode |
The mode of the column.
Definition at line 178 of file column_indexer.hpp.
std::map<std::string, flexible_type> turi::v2::ml_data_internal::column_indexer::options |
A map of the options passed in to ml_data. May include options for the indexers.
Definition at line 187 of file column_indexer.hpp.
flex_type_enum turi::v2::ml_data_internal::column_indexer::original_column_type |
The original column type.
Definition at line 182 of file column_indexer.hpp.