Turi Create  4.0
turi::sparse_similarity_lookup Class Reference (abstract)

#include <toolkits/sparse_similarity/sparse_similarity_lookup.hpp>

Public Member Functions

virtual std::string similarity_name () const =0
 
virtual std::map< std::string, flexible_type > train_from_sparse_matrix_sarray (size_t num_items, const std::shared_ptr< sarray< std::vector< std::pair< size_t, double > > > > &data)=0
 
virtual void setup_by_raw_similarity (size_t num_items, const flex_list &item_data, const sframe &_interaction_data, const std::string &item_column, const std::string &similar_item_column, const std::string &similarity, bool add_reverse=false)=0
 
virtual size_t score_items (std::vector< std::pair< size_t, double > > &item_predictions, const std::vector< std::pair< size_t, double > > &user_item_data) const =0
 
virtual void get_similar_items (std::vector< std::pair< size_t, flexible_type > > &similar_items_dest, size_t item, size_t top_k) const =0
 
virtual void save (turi::oarchive &oarc) const =0
 
const std::map< std::string, flexible_type > & current_options () const
 
virtual bool _debug_check_equal (const sparse_similarity_lookup &other) const =0
 

Static Public Member Functions

static void add_options (option_manager &options)
 
static std::shared_ptr< sparse_similarity_lookup > create (const std::string &similarity, const std::map< std::string, flexible_type > &options)
 

Protected Attributes

std::map< std::string, flexible_type > options
 

Detailed Description

A model that can be used for sparse similarity lookup.

A trained version of this model contains a lookup table of the nearest items to each item, along with a similarity score for each. This allows both retrieving the items most similar to a given item and generating a list of the items most similar to a collection of items. The latter is used to recommend items, e.g. by providing a list of the items and ratings for a particular user.

Each similarity metric is implemented as a class whose methods dictate the math used in the accumulation. See similarities.hpp for details.

This model is created using the create(...) method below, which takes the name of the similarity and the current options. The options given in add_options(...) must be present in the options map.

The model can be trained either by providing the similarities of the items directly, or by training the model on an sarray of user-item ratings. See the functions below for details.

Code Structure:

  • This model is intended to be encapsulated by other user-facing models such as item similarity. In this case, item similarity provides the user-facing API, creates this model, and then uses it. Some features specific to item-based collaborative filtering, such as how to handle new users, are left to that model; this one can only be queried with a list of items and ratings, from which it produces the output.
  • The similarity class defines the metric used, and then how the averaging at prediction time is done.
  • The similarity class is given as a template parameter to the implementation part of this class, sparse_similarity_lookup_impl, which inherits from this one. The implementation class does the training (where it is not farmed out to helper functions) and takes care of prediction and item scoring. The core of the algorithm is in sparse_similarity_lookup_impl.hpp.
  • A number of accompanying utilities are also provided, for example nearest neighbors functions and utilities to generate the per-item statistics.
  • A dense matrix class that stores only the upper diagonal part of a matrix is provided in sliced_itemitem_matrix.hpp. That header also includes tools for estimating the number of row-slices and passes through the data needed to meet given constraints on memory usage.
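
The idea of storing only the upper part of a symmetric item-item matrix can be sketched as follows. This is an illustration only: the class name upper_triangular_sketch, its methods, and the choice to include the diagonal are assumptions, and it is not the class defined in sliced_itemitem_matrix.hpp.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical packed store for a symmetric n x n matrix: only entries with
// i <= j are kept, so n * (n + 1) / 2 doubles are allocated instead of n * n.
class upper_triangular_sketch {
 public:
  explicit upper_triangular_sketch(std::size_t n)
      : n_(n), data_(n * (n + 1) / 2, 0.0) {}

  // Access entry (i, j); (j, i) maps to the same storage slot.
  double& at(std::size_t i, std::size_t j) { return data_[index(i, j)]; }

  std::size_t stored_entries() const { return data_.size(); }

 private:
  std::size_t index(std::size_t i, std::size_t j) const {
    if (i > j) std::swap(i, j);
    assert(j < n_);
    // Rows 0..i-1 hold n, n-1, ..., n-i+1 entries, i.e. i*(2n-i+1)/2 in total.
    return i * (2 * n_ - i + 1) / 2 + (j - i);
  }

  std::size_t n_;
  std::vector<double> data_;
};
```

For 4 items this allocates 10 entries rather than 16, and a write to (1, 3) is visible when reading (3, 1).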

Definition at line 70 of file sparse_similarity_lookup.hpp.

Member Function Documentation

◆ _debug_check_equal()

virtual bool turi::sparse_similarity_lookup::_debug_check_equal ( const sparse_similarity_lookup &  other ) const
pure virtual

Checks whether two sparse_similarity_lookup instances are essentially the same.

◆ add_options()

static void turi::sparse_similarity_lookup::add_options ( option_manager &  options )
static

Adds in all of the options needed for this class to the option manager.

◆ create()

static std::shared_ptr<sparse_similarity_lookup> turi::sparse_similarity_lookup::create ( const std::string &  similarity,
const std::map< std::string, flexible_type > &  options 
)
static

Factory method: Call this to create or load a model from one of the existing similarities by name.
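
The pattern described under Code Structure (an abstract base, an implementation templated on the similarity class, and a factory that selects the instantiation by name) can be sketched in plain C++ as follows. Every name here (lookup_base_sketch, lookup_impl_sketch, jaccard_policy, cosine_policy, create_sketch, finalize) is hypothetical, as is the set of similarity names accepted; the real classes live in sparse_similarity_lookup_impl.hpp and similarities.hpp and have richer interfaces.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical abstract interface, standing in for sparse_similarity_lookup.
class lookup_base_sketch {
 public:
  virtual ~lookup_base_sketch() = default;
  virtual std::string similarity_name() const = 0;
  virtual double finalize(double overlap, double stat_a, double stat_b) const = 0;
};

// Hypothetical similarity policies; each one supplies the math.
struct jaccard_policy {
  static std::string name() { return "jaccard"; }
  // overlap: users who rated both items; count_*: users who rated each item.
  static double finalize(double overlap, double count_a, double count_b) {
    const double uni = count_a + count_b - overlap;
    return uni > 0.0 ? overlap / uni : 0.0;
  }
};

struct cosine_policy {
  static std::string name() { return "cosine"; }
  // dot: dot product of the two rating vectors; sq_*: their squared norms.
  static double finalize(double dot, double sq_a, double sq_b) {
    const double denom = std::sqrt(sq_a * sq_b);
    return denom > 0.0 ? dot / denom : 0.0;
  }
};

// The implementation is templated on the policy and inherits from the base.
template <typename Similarity>
class lookup_impl_sketch final : public lookup_base_sketch {
 public:
  std::string similarity_name() const override { return Similarity::name(); }
  double finalize(double overlap, double stat_a, double stat_b) const override {
    assert(stat_a >= 0.0 && stat_b >= 0.0);
    return Similarity::finalize(overlap, stat_a, stat_b);
  }
};

// Factory: picks the template instantiation from the similarity name.
std::shared_ptr<lookup_base_sketch> create_sketch(const std::string& similarity) {
  if (similarity == "jaccard") return std::make_shared<lookup_impl_sketch<jaccard_policy>>();
  if (similarity == "cosine") return std::make_shared<lookup_impl_sketch<cosine_policy>>();
  throw std::invalid_argument("unknown similarity: " + similarity);
}
```

Callers hold only the base pointer, so the user-facing model does not need to know which similarity class was instantiated.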

◆ current_options()

const std::map<std::string, flexible_type>& turi::sparse_similarity_lookup::current_options ( ) const
inline

The current options.

Definition at line 166 of file sparse_similarity_lookup.hpp.

◆ get_similar_items()

virtual void turi::sparse_similarity_lookup::get_similar_items ( std::vector< std::pair< size_t, flexible_type > > &  similar_items_dest,
size_t  item,
size_t  top_k 
) const
pure virtual

Fills an array with the most similar items to a given item. This is read directly from the lookup table. If fewer than top_k items are present in the lookup table, then only those in the lookup table are returned.
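
The truncation behavior described above can be illustrated with a small stand-alone sketch. The lookup_table alias and the function get_similar_items_sketch are hypothetical; the sketch assumes each row of the table is already sorted by descending similarity and uses double scores rather than flexible_type.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the trained lookup table: row i holds the
// (neighbor index, similarity) pairs for item i, best match first.
using neighbor_list = std::vector<std::pair<std::size_t, double>>;
using lookup_table = std::vector<neighbor_list>;

// Copies at most top_k neighbors of `item` into dest. If the row holds fewer
// than top_k entries, only those entries are returned.
void get_similar_items_sketch(const lookup_table& table, neighbor_list& dest,
                              std::size_t item, std::size_t top_k) {
  assert(item < table.size());
  const neighbor_list& row = table[item];
  const std::size_t n = std::min(top_k, row.size());
  dest.assign(row.begin(), row.begin() + static_cast<std::ptrdiff_t>(n));
}
```

With a row of two neighbors, a request for top_k = 5 yields two results and a request for top_k = 1 yields only the first.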

◆ save()

virtual void turi::sparse_similarity_lookup::save ( turi::oarchive &  oarc ) const
pure virtual

Serializes the model to the given archive. The serialization is implemented in sparse_similarity_lookup_impl.

◆ score_items()

virtual size_t turi::sparse_similarity_lookup::score_items ( std::vector< std::pair< size_t, double > > &  item_predictions,
const std::vector< std::pair< size_t, double > > &  user_item_data 
) const
pure virtual

Scores all items in a list of item predictions given a list of user-item interactions. This method fills in the second element of each pair in the item_predictions container.

This is also the way to generate values for predict.

Returns the number of item similarity pairs that were considered. (In some corner cases, such as when an item had no users that also rated other items, we want to recommend by popularity or some other metric).
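
A minimal sketch of this scoring contract follows. It assumes a plain rating-weighted sum of similarities, whereas in the real class the averaging is defined by the similarity class; the lookup_table alias and score_items_sketch are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical lookup table: row i holds (neighbor index, similarity) pairs.
using neighbor_list = std::vector<std::pair<std::size_t, double>>;
using lookup_table = std::vector<neighbor_list>;

// For each candidate in item_predictions (first = item index), overwrites
// second with the sum of similarity * rating over the user's rated items.
// Returns how many (rated item, candidate) similarity pairs contributed.
std::size_t score_items_sketch(
    const lookup_table& table,
    std::vector<std::pair<std::size_t, double>>& item_predictions,
    const std::vector<std::pair<std::size_t, double>>& user_item_data) {
  std::size_t pairs_used = 0;
  for (auto& pred : item_predictions) {
    double total = 0.0;
    for (const auto& rated : user_item_data) {
      assert(rated.first < table.size());
      for (const auto& nb : table[rated.first]) {
        if (nb.first == pred.first) {
          total += nb.second * rated.second;
          ++pairs_used;
        }
      }
    }
    pred.second = total;
  }
  return pairs_used;
}
```

A return value of zero tells the caller that no similarity information was available, which is the situation in which a fallback such as popularity would apply.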

◆ setup_by_raw_similarity()

virtual void turi::sparse_similarity_lookup::setup_by_raw_similarity ( size_t  num_items,
const flex_list &  item_data,
const sframe &  _interaction_data,
const std::string &  item_column,
const std::string &  similar_item_column,
const std::string &  similarity,
bool  add_reverse = false 
)
pure virtual

Sets the lookup tables directly from an sframe of interaction data. The interaction data is an sframe containing columns item_column, similar_item_column, and similarity. The items and similar items must be indices in {0, ..., num_items-1}.

An error is raised if the similarity value does not conform to the similarity chosen – e.g. jaccard similarity must be between 0 and 1.

If add_reverse is true, then all (i, j, rating) entries are also interpreted as (j, i, rating). Note that no duplicates can be present.
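
The rules above (index range, similarity range, add_reverse, no duplicates) can be sketched over a plain vector of rows instead of an sframe. The struct similarity_row and the function build_lookup_sketch are hypothetical; the sketch assumes a Jaccard-style range of [0, 1] and chooses to throw on duplicates, since the documentation does not say how duplicates are handled.

```cpp
#include <cstddef>
#include <set>
#include <stdexcept>
#include <utility>
#include <vector>

using neighbor_list = std::vector<std::pair<std::size_t, double>>;
using lookup_table = std::vector<neighbor_list>;

// Hypothetical row type: one (item, similar_item, similarity) record.
struct similarity_row {
  std::size_t item;
  std::size_t similar_item;
  double similarity;
};

lookup_table build_lookup_sketch(std::size_t num_items,
                                 const std::vector<similarity_row>& rows,
                                 bool add_reverse = false) {
  lookup_table table(num_items);
  std::set<std::pair<std::size_t, std::size_t>> seen;

  // Records one directed entry, rejecting duplicates (an assumed policy).
  auto add = [&](std::size_t i, std::size_t j, double s) {
    if (!seen.emplace(i, j).second) {
      throw std::invalid_argument("duplicate item pair");
    }
    table[i].emplace_back(j, s);
  };

  for (const auto& r : rows) {
    if (r.item >= num_items || r.similar_item >= num_items) {
      throw std::out_of_range("item index out of range");
    }
    // Assumed Jaccard-style check: the value must lie in [0, 1].
    if (r.similarity < 0.0 || r.similarity > 1.0) {
      throw std::invalid_argument("similarity must be in [0, 1]");
    }
    add(r.item, r.similar_item, r.similarity);
    if (add_reverse && r.item != r.similar_item) {
      add(r.similar_item, r.item, r.similarity);
    }
  }
  return table;
}
```

Rows are stored in input order here; a real lookup table would presumably keep each row ordered by similarity.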

◆ similarity_name()

virtual std::string turi::sparse_similarity_lookup::similarity_name ( ) const
pure virtual

Returns the name of the similarity this version uses.

Implemented in turi::sparse_sim::sparse_similarity_lookup_impl< SimilarityType >.

◆ train_from_sparse_matrix_sarray()

virtual std::map<std::string, flexible_type> turi::sparse_similarity_lookup::train_from_sparse_matrix_sarray ( size_t  num_items,
const std::shared_ptr< sarray< std::vector< std::pair< size_t, double > > > > &  data 
)
pure virtual

Trains the model from an sarray of vectors of (index, score) pairs. Each row is assumed to correspond to one user, and the index in each (index, score) pair is an item that the user rated. The similarity of a given item to another item is computed by treating the user ratings of each item as a sparse vector and then measuring the similarity between the two vectors. This calculation is done as efficiently as possible using a combination of nearest neighbors search and lookup tables.
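
To make the input layout concrete, here is a brute-force sketch that derives Jaccard similarities from per-user rows of (item index, score) pairs. It is not the library's algorithm, which the text says relies on nearest neighbors search and lookup tables; the function jaccard_from_rows_sketch is hypothetical, and the sketch assumes each item appears at most once per row. Jaccard ignores the score values and only uses which items were rated.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <utility>
#include <vector>

// One row per user: the (item index, score) pairs that user rated.
using sparse_row = std::vector<std::pair<std::size_t, double>>;
using item_pair = std::pair<std::size_t, std::size_t>;

// Returns the Jaccard similarity for every item pair (i < j) that at least
// one user rated together: |users(i) & users(j)| / |users(i) | users(j)|.
std::map<item_pair, double> jaccard_from_rows_sketch(
    std::size_t num_items, const std::vector<sparse_row>& users) {
  std::vector<std::size_t> item_count(num_items, 0);
  std::map<item_pair, std::size_t> co_count;

  for (const auto& row : users) {
    for (std::size_t a = 0; a < row.size(); ++a) {
      assert(row[a].first < num_items);
      ++item_count[row[a].first];
      for (std::size_t b = a + 1; b < row.size(); ++b) {
        const std::size_t i = std::min(row[a].first, row[b].first);
        const std::size_t j = std::max(row[a].first, row[b].first);
        if (i != j) ++co_count[std::make_pair(i, j)];
      }
    }
  }

  std::map<item_pair, double> similarity;
  for (const auto& entry : co_count) {
    const double both = static_cast<double>(entry.second);
    const double either = static_cast<double>(item_count[entry.first.first] +
                                              item_count[entry.first.second]) - both;
    similarity[entry.first] = both / either;
  }
  return similarity;
}
```

The pairwise loop is quadratic in the length of each row, which is the cost the real implementation is designed to avoid.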

Member Data Documentation

◆ options

std::map<std::string, flexible_type> turi::sparse_similarity_lookup::options
protected

The stored options.

Definition at line 180 of file sparse_similarity_lookup.hpp.


The documentation for this class was generated from the following file: