Turi Create
4.0
#include <toolkits/pattern_mining/fp_results_tree.hpp>
Public Member Functions | |
void | save (oarchive &oarc) const |
void | load (iarchive &iarc) |
void | add_itemset (const std::vector< size_t > &potential_itemset, const size_t &support) |
size_t | get_min_support_bound () |
void | insert_support (const size_t &support) |
gl_sframe | get_closed_itemsets (const std::shared_ptr< topk_indexer > &indexer=nullptr) const |
bool | is_itemset_redundant (const std::vector< size_t > &potential_itemset, const size_t &support) const |
void | build_tree (const gl_sframe &closed_itemsets) |
std::vector< dense_bitset > | get_top_k_closed_bitsets (const size_t &size, const size_t &top_k=TOP_K_MAX, const size_t &min_length=1) const |
gl_sframe | get_top_k_closed_itemsets (const size_t &top_k=TOP_K_MAX, const size_t &min_length=1, const std::shared_ptr< topk_indexer > &indexer=nullptr) const |
std::vector< size_t > | sort_itemset (const std::vector< size_t > &itemset) const |
size_t | get_support (const std::vector< size_t > &sorted_itemset, const size_t &lower_bound_on_support=0) const |
size_t | get_num_transactions () const |
void | prune_tree (const size_t &min_support) |
Tree data structure for keeping track of the top_k frequent 'closed' itemsets of length at least min_length. This is a compressed, memory-efficient data structure used to store 'closed' itemsets during mining.
Extends the fp_results_tree class.
Additional Vars:
  top_k (size_t) - maximum number of closed itemsets to mine; top_k should be less than TOP_K_MAX
  min_length (size_t) - minimum length of a closed itemset
  min_support_heap (min_heap) - min heap of top_k supports
Definition at line 244 of file fp_results_tree.hpp.
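void turi::pattern_mining::fp_top_k_results_tree::add_itemset (const std::vector< size_t > &potential_itemset, const size_t &support) | virtual
Add a potential closed itemset to the tree.
Args:
  potential_itemset (vector of size_t) - itemset to add
  support (size_t) - support of the itemset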
Reimplemented from turi::pattern_mining::fp_results_tree.
void turi::pattern_mining::fp_results_tree::build_tree (const gl_sframe &closed_itemsets) | inherited
Build the results tree from a collection of closed itemsets.
Args:
  closed_itemsets (gl_sframe) - sframe of closed itemset (ids), support
gl_sframe turi::pattern_mining::fp_top_k_results_tree::get_closed_itemsets (const std::shared_ptr< topk_indexer > &indexer=nullptr) const
Overrides fp_results_tree::get_closed_itemsets. Return the current collection of closed itemsets.
Args:
  indexer - An indexer to convert from id to the value.
Returns: closed_itemsets (gl_sframe) - sframe of closed itemset (ids), support
size_t turi::pattern_mining::fp_top_k_results_tree::get_min_support_bound ()
Get a bound for the min_support.
Returns: A current estimate of min support.
size_t turi::pattern_mining::fp_results_tree::get_num_transactions () const | inline, inherited
Get the number of transactions.
Definition at line 191 of file fp_results_tree.hpp.
size_t turi::pattern_mining::fp_results_tree::get_support (const std::vector< size_t > &sorted_itemset, const size_t &lower_bound_on_support=0) const | inherited
Get the support for a frequent itemset.
Note: the support for the empty set is the total number of transactions.
Args:
  sorted_itemset (vector of size_t) - a sorted frequent itemset (not necessarily closed) (see sort_itemset)
  lower_bound_on_support (size_t) - optional, for efficiency
Returns: support (size_t)
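The relationship between an arbitrary frequent itemset and the stored closed itemsets can be illustrated with a minimal sketch. It relies on the general property that the support of a frequent itemset equals the largest support among its closed supersets. The `closed_itemset` struct and `support_from_closed` function are hypothetical names for this illustration, the flat vector stands in for the tree, "sorted" is assumed to mean ascending id order, and the `lower_bound_on_support` shortcut is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical flat record; the real class stores itemsets in a tree.
struct closed_itemset {
  std::vector<std::size_t> items;  // item ids, sorted ascending (assumption)
  std::size_t support;             // number of transactions containing items
};

// Support of a frequent itemset = largest support among its closed supersets.
// The empty itemset is contained in every transaction.
std::size_t support_from_closed(const std::vector<closed_itemset>& closed,
                                std::size_t num_transactions,
                                const std::vector<std::size_t>& sorted_itemset) {
  if (sorted_itemset.empty()) return num_transactions;
  std::size_t best = 0;  // 0 means no closed superset was found
  for (const auto& entry : closed) {
    // std::includes requires both ranges to be sorted ascending.
    if (std::includes(entry.items.begin(), entry.items.end(),
                      sorted_itemset.begin(), sorted_itemset.end())) {
      best = std::max(best, entry.support);
    }
  }
  return best;
}
```

With closed itemsets {1}:5, {1,2}:3 and {1,2,3}:2, the itemset {2} is not closed itself, and its support is recovered from its closed superset {1,2}.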
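std::vector< dense_bitset > turi::pattern_mining::fp_results_tree::get_top_k_closed_bitsets (const size_t &size, const size_t &top_k=TOP_K_MAX, const size_t &min_length=1) const | inherited
Return the current collection of closed itemsets as bitsets.
Args:
  size (size_t) - size of each bitset returned
  top_k (size_t) - maximum number of closed itemsets to return
  min_length (size_t) - minimum required length for a closed itemset
Returns: closed_bitsets - vector of bitsets, one per closed itemset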
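As a rough illustration of the bitset representation, the sketch below encodes one itemset as a fixed-size bit vector. It assumes that each item id indexes directly into the bitset, which the documentation does not state. `std::vector<bool>` stands in for `dense_bitset`, and `itemset_to_bits` is a hypothetical helper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: set bit `id` for every item id in the itemset.
// Assumes ids map directly to bit positions (not confirmed by the source).
std::vector<bool> itemset_to_bits(const std::vector<std::size_t>& itemset,
                                  std::size_t size) {
  std::vector<bool> bits(size, false);
  for (std::size_t id : itemset) {
    if (id >= size) {
      throw std::out_of_range("item id does not fit in the bitset");
    }
    bits[id] = true;
  }
  return bits;
}
```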
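gl_sframe turi::pattern_mining::fp_results_tree::get_top_k_closed_itemsets (const size_t &top_k=TOP_K_MAX, const size_t &min_length=1, const std::shared_ptr< topk_indexer > &indexer=nullptr) const | inherited
Return the top_k closed itemsets in descending order.
Takes longer than get_closed_itemsets() due to sorting, but is faster than trying to sort a gl_sframe.
Args:
  top_k (size_t) - maximum number of closed itemsets to return
  min_length (size_t) - minimum required length for a closed itemset
  indexer - An indexer to convert from id to the value.
Returns: top_k_closed_itemsets (gl_sframe) - sframe of top_k itemsets, support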
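The selection described here (filter by min_length, order, keep at most top_k) can be sketched over a flat list. This is an illustration only: it assumes "descending order" means descending support, and `closed_itemset` and `top_k_by_support` are hypothetical names, not part of the library.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical flat record; the real class stores itemsets in a tree.
struct closed_itemset {
  std::vector<std::size_t> items;
  std::size_t support;
};

// Keep itemsets of length >= min_length, sort by support (largest first,
// an assumption about the ordering), and return at most top_k of them.
std::vector<closed_itemset> top_k_by_support(
    const std::vector<closed_itemset>& closed,
    std::size_t top_k,
    std::size_t min_length = 1) {
  std::vector<closed_itemset> kept;
  for (const auto& entry : closed) {
    if (entry.items.size() >= min_length) kept.push_back(entry);
  }
  // stable_sort keeps the input order among itemsets with equal support.
  std::stable_sort(kept.begin(), kept.end(),
                   [](const closed_itemset& a, const closed_itemset& b) {
                     return a.support > b.support;
                   });
  if (kept.size() > top_k) kept.resize(top_k);
  return kept;
}
```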
void turi::pattern_mining::fp_top_k_results_tree::insert_support (const size_t &support)
Insert support into the min_support heap.
Args: support (size_t) - support of the itemset
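One plausible way a min heap of top_k supports yields a running bound on min_support is sketched below. This is not the library's implementation: `top_k_support_sketch` is a hypothetical class, its method names only mirror the documented ones, and the fallback bound of 1 used before top_k supports have been seen is an assumption.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical illustration: a min heap holding the top_k largest supports.
class top_k_support_sketch {
 public:
  explicit top_k_support_sketch(std::size_t top_k) : top_k_(top_k) {}

  // Record a support; once the heap is full, a new support only enters
  // if it beats the smallest support currently kept.
  void insert_support(std::size_t support) {
    if (top_k_ == 0) return;
    if (heap_.size() < top_k_) {
      heap_.push(support);
    } else if (support > heap_.top()) {
      heap_.pop();
      heap_.push(support);
    }
  }

  // Until top_k supports are known, no itemset can be ruled out, so the
  // fallback is returned (assumed to be 1 here). Afterwards the bound is
  // the smallest of the top_k supports.
  std::size_t get_min_support_bound(std::size_t fallback = 1) const {
    if (top_k_ == 0 || heap_.size() < top_k_) return fallback;
    return heap_.top();
  }

 private:
  std::size_t top_k_;
  std::priority_queue<std::size_t, std::vector<std::size_t>,
                      std::greater<std::size_t>> heap_;
};
```

With top_k = 2, inserting supports 5, 3 and 8 leaves 5 and 8 in the heap, so any later itemset with support below 5 cannot be among the top two.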
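bool turi::pattern_mining::fp_results_tree::is_itemset_redundant (const std::vector< size_t > &potential_itemset, const size_t &support) const | inherited
Check if a frequent itemset cannot be closed.
Checks if the itemset is a subset of an existing itemset in the tree with equal support. Returns true if such a superset with equal support exists.
Args:
  potential_itemset (vector of size_t) - itemset to check
  support (size_t) - support of the itemset
Returns: is_redundant (bool) - true if the itemset is redundant (cannot be closed)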
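The redundancy test can be illustrated with a linear scan over a flat list of stored itemsets. The real class walks a tree, so this is only a sketch of the condition being checked. `closed_itemset` and `is_redundant_sketch` are hypothetical names, and stored itemsets are assumed to be sorted by ascending id.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical flat record; the real class stores itemsets in a tree.
struct closed_itemset {
  std::vector<std::size_t> items;  // item ids, sorted ascending (assumption)
  std::size_t support;
};

// An itemset cannot be closed if some stored itemset contains it
// and has exactly the same support.
bool is_redundant_sketch(const std::vector<closed_itemset>& stored,
                         std::vector<std::size_t> candidate,
                         std::size_t support) {
  // std::includes requires both ranges to be sorted ascending.
  std::sort(candidate.begin(), candidate.end());
  for (const auto& entry : stored) {
    if (entry.support == support &&
        std::includes(entry.items.begin(), entry.items.end(),
                      candidate.begin(), candidate.end())) {
      return true;
    }
  }
  return false;
}
```

For example, if {1, 2, 3} is stored with support 4, then {1, 3} with support 4 is redundant, while {1, 3} with support 5 is not.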
void turi::pattern_mining::fp_top_k_results_tree::load (iarchive &iarc) | inline
Load the fp_results_tree from an iarc.
Definition at line 276 of file fp_results_tree.hpp.
void turi::pattern_mining::fp_results_tree::prune_tree (const size_t &min_support) | inherited
Prune the tree.
Passes over the tree, removing all nodes (corresponding to itemsets) with support strictly less than the given (final) min support. This rebuilds the tree, so it should be called rarely (ideally only at the end).
Args:
  min_support (size_t)
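The pruning rule (drop everything with support strictly below min_support) is sketched here on a flat list rather than a tree. `closed_itemset` and `prune_below` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical flat record; the real class stores itemsets in a tree.
struct closed_itemset {
  std::vector<std::size_t> items;
  std::size_t support;
};

// Remove every itemset whose support is strictly less than min_support.
// Itemsets with support equal to min_support are kept.
void prune_below(std::vector<closed_itemset>& closed, std::size_t min_support) {
  closed.erase(
      std::remove_if(closed.begin(), closed.end(),
                     [min_support](const closed_itemset& entry) {
                       return entry.support < min_support;
                     }),
      closed.end());
}
```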
void turi::pattern_mining::fp_top_k_results_tree::save (oarchive &oarc) const | inline
Save the fp_results_tree into an oarc.
Definition at line 259 of file fp_results_tree.hpp.
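std::vector< size_t > turi::pattern_mining::fp_results_tree::sort_itemset (const std::vector< size_t > &itemset) const | inherited
Sort itemset by id_order_map (last element first).
Args:
  itemset (vector of size_t)
Returns: sorted_itemset (vector of size_t)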