Turi Create  4.0
turi::nearest_neighbors::lsh_neighbors Class Reference

#include <toolkits/nearest_neighbors/lsh_neighbors.hpp>

Public Member Functions

 ~lsh_neighbors ()
 
void init_options (const std::map< std::string, flexible_type > &_opts) override
 
void train (const sframe &X, const std::vector< flexible_type > &ref_labels, const std::vector< dist_component_type > &composite_distance_params, const std::map< std::string, flexible_type > &opts) override
 
sframe query (const v2::ml_data &mld_queries, const std::vector< flexible_type > &query_labels, const size_t k, const double radius, const bool include_self_edges) const override
 
size_t get_version () const override
 
void save_impl (turi::oarchive &oarc) const override
 
void load_version (turi::iarchive &iarc, size_t version) override
 

Detailed Description

LSH nearest neighbor class.

The intuition behind LSH-based indexes is to hash data points into buckets, such that similar points are more likely to be hashed to the same bucket than dissimilar ones. We could then find the approximate nearest neighbors of any point, simply by finding the bucket that it is hashed to.

It works as follows:

  1. Choose k hash functions h_1, h_2, ..., h_k from a uniform of some family of LSH functions. For any data point v, place v in the bucket with key g(v) = (h_1(v), h_2(v), ..., h_k(v)).
  2. Independently perform step 1 l times to construct l separate hash tables, with hash functions g_1, g_2, ..., g_l

You can set k and l by setting num_projections_per_table and num_tables respectively.

Definition at line 38 of file lsh_neighbors.hpp.

Constructor & Destructor Documentation

◆ ~lsh_neighbors()

turi::nearest_neighbors::lsh_neighbors::~lsh_neighbors ( )

Destructor. Make sure bad things don't happen

Member Function Documentation

◆ get_version()

size_t turi::nearest_neighbors::lsh_neighbors::get_version ( ) const
inlineoverride

Gets the model version number

Definition at line 98 of file lsh_neighbors.hpp.

◆ init_options()

void turi::nearest_neighbors::lsh_neighbors::init_options ( const std::map< std::string, flexible_type > &  _opts)
override

Set the model options. Use the option manager to set these options. The option manager should throw errors if the options do not satisfy the option manager's conditions.

Parameters
[in]optsOptions to set

◆ load_version()

void turi::nearest_neighbors::lsh_neighbors::load_version ( turi::iarchive iarc,
size_t  version 
)
override

Turi serialization save

◆ query()

sframe turi::nearest_neighbors::lsh_neighbors::query ( const v2::ml_data &  mld_queries,
const std::vector< flexible_type > &  query_labels,
const size_t  k,
const double  radius,
const bool  include_self_edges 
) const
override

Find neighbors of queries in a created LSH model.

For each query, the method keeps track of the current k-nearest neighbors in the LSH. At each node, the closest possible point in each child node to the query is computed, and if this distance is further than the current k'th nearest neighbor, that child node (and its descendants) is skpped in the traversal.

Parameters
[in]mld_queriesquery data
[in]query_labelssframe query labels
[in]ksize_t max number of neighbors to return for each query
[in]radiusdouble max distance for returned neighbors to each query
[out]retsframe SFrame with four columns: query label, reference label, distance, and rank.
Note
Assumes that data is already in the right shape.

◆ save_impl()

void turi::nearest_neighbors::lsh_neighbors::save_impl ( turi::oarchive oarc) const
override

Turi serialization save

◆ train()

void turi::nearest_neighbors::lsh_neighbors::train ( const sframe X,
const std::vector< flexible_type > &  ref_labels,
const std::vector< dist_component_type > &  composite_distance_params,
const std::map< std::string, flexible_type > &  opts 
)
override

Create a LSH nearest neighbors model.

Parameters
[in]Xsframe input feature data
[in]ref_labelsrow labels for the reference dataset
[in]composite_distance_params
[in]ysframe input labels

The documentation for this class was generated from the following file: