Turi Create  4.0
turi::sketches::countmin< T > Class Template Reference

#include <ml/sketches/countmin.hpp>

Public Member Functions

 countmin (size_t bits=16, size_t depth=4, size_t seed=1000)
 Buckets. More...
 
void add (const T &t, size_t count=1)
 
void atomic_add (const T &t, size_t count=1)
 
size_t estimate (const T &t) const
 
void combine (const countmin &other)
 
void print () const
 
double density () const
 

Detailed Description

template<typename T>
class turi::sketches::countmin< T >

An implementation of the countmin sketch for estimating the frequency of each item in a stream.

For more information on the details of the sketch: http://dimacs.rutgers.edu/~graham/pubs/papers/cmsoft.pdf The implementation generally follows the pseudocode in Figure 2. The resulting probabilistic data structure

Usage is simple.

countmin<T> cm; // for approximate counting of a stream with type T
// repeatedly call
cm.add(x) // x can be anything that is hashable.
cm.estimate(x) // will return an estimate of the frequency for
// a given element

One can obtain guarantees on the error in answering a query is within a factor of ε with probability δ if one sets the width=ceiling(e / ε) depth=ceiling(log(1/δ) where e is Euler's constant.

Definition at line 44 of file countmin.hpp.

Constructor & Destructor Documentation

◆ countmin()

template<typename T >
turi::sketches::countmin< T >::countmin ( size_t  bits = 16,
size_t  depth = 4,
size_t  seed = 1000 
)
inlineexplicit

Buckets.

Constructs a countmin sketch having "width" 2^bits and "depth". The size of the matrix of counts will be depth x 2^bits.

Parameters
bitsThe number of bins will be 2^bits.
depthThe "depth" of the sketch is the number of hash functions that will be used on each item.

Definition at line 64 of file countmin.hpp.

Member Function Documentation

◆ add()

template<typename T >
void turi::sketches::countmin< T >::add ( const T &  t,
size_t  count = 1 
)
inline

Adds an arbitrary object to be counted. Any object type can be used, and there are no restrictions as long as std::hash<T> can be used to obtain a hash value.

Definition at line 82 of file countmin.hpp.

◆ atomic_add()

template<typename T >
void turi::sketches::countmin< T >::atomic_add ( const T &  t,
size_t  count = 1 
)
inline

Adds an arbitrary object to be counted. Any object type can be used, and there are no restrictions as long as std::hash<T> can be used to obtain a hash value.

Definition at line 96 of file countmin.hpp.

◆ combine()

template<typename T >
void turi::sketches::countmin< T >::combine ( const countmin< T > &  other)
inline

Merge two countmin datastructures. The two countmin objects must have the same width and depth.

Definition at line 125 of file countmin.hpp.

◆ density()

template<typename T >
double turi::sketches::countmin< T >::density ( ) const
inline

Computes the density of the internal counts matrix.

Definition at line 154 of file countmin.hpp.

◆ estimate()

template<typename T >
size_t turi::sketches::countmin< T >::estimate ( const T &  t) const
inline

Returns the estimate of the frequency for a given object.

Definition at line 107 of file countmin.hpp.

◆ print()

template<typename T >
void turi::sketches::countmin< T >::print ( ) const
inline

Prints the internal matrix containing the current counts.

Definition at line 141 of file countmin.hpp.


The documentation for this class was generated from the following file: