Turi Create 4.0
#include <toolkits/util/data_generators.hpp>
Public Member Functions
sframe generate (size_t n_observations, const std::string &target_column_name, size_t random_seed, double noise_sd) const
std::pair< sframe, sframe > generate_for_ranking (size_t n_train_samples_per_user, size_t n_test_samples_per_user, size_t random_seed, double noise_sd) const
A simple class for generating fake linear model data for testing purposes. This uses the factorization machine model to generate the data.
The options going into this generator are as follows. These are not necessarily used by each function:
The defaults for these are given in data_generators.cpp.
Definition at line 43 of file data_generators.hpp.
sframe turi::recsys::lm_data_generator::generate (
    size_t n_observations,
    const std::string & target_column_name,
    size_t random_seed,
    double noise_sd
) const
Returns an sframe filled with the observations and responses of the linear model.
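The idea of drawing observations and then computing noisy linear responses can be sketched in standalone C++. This is only an illustration under stated assumptions, not the Turi Create implementation: it uses a plain linear model rather than the factorization machine model, plain vectors rather than an sframe, and the names linear_dataset, generate_linear_data, and weights are hypothetical. Only n_observations, random_seed, and noise_sd mirror the documented parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical stand-in for the generated data; the documented
// generate() returns an sframe instead.
struct linear_dataset {
  std::vector<std::vector<double>> features;  // one row per observation
  std::vector<double> target;                 // one response per row
};

// Assumed scheme: features ~ N(0, 1), target = w . x + noise_sd * N(0, 1).
linear_dataset generate_linear_data(std::size_t n_observations,
                                    const std::vector<double>& weights,
                                    std::size_t random_seed,
                                    double noise_sd) {
  assert(!weights.empty());
  assert(noise_sd >= 0.0);

  // Seeding the engine makes the generated data reproducible.
  std::mt19937_64 rng(random_seed);
  std::normal_distribution<double> unit_normal(0.0, 1.0);

  linear_dataset data;
  data.features.reserve(n_observations);
  data.target.reserve(n_observations);

  for (std::size_t i = 0; i < n_observations; ++i) {
    std::vector<double> x(weights.size());
    double response = 0.0;
    for (std::size_t j = 0; j < weights.size(); ++j) {
      x[j] = unit_normal(rng);
      response += weights[j] * x[j];
    }
    // Scale a unit normal draw so that noise_sd == 0 is also valid.
    response += noise_sd * unit_normal(rng);

    data.features.push_back(std::move(x));
    data.target.push_back(response);
  }
  return data;
}
```

Calling generate_linear_data(1000, {1.0, -2.0}, 42, 0.1) twice yields identical data, which is the property a random_seed parameter is meant to provide for tests.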
std::pair<sframe, sframe> turi::recsys::lm_data_generator::generate_for_ranking (
    size_t n_train_samples_per_user,
    size_t n_test_samples_per_user,
    size_t random_seed,
    double noise_sd
) const
Generates two datasets: one for training a ranking model and one for testing the resulting ranking. This works by building a linear model and assuming that the observations with the highest responses are the ones present in the data set. A portion of these is split off into the test set.
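The procedure described here (score candidates with a linear model, keep the highest responses per user, split off a test portion) can be sketched as follows. This is a minimal standalone illustration, not the library's code: the rating struct, the generate_ranking_data name, the n_users and n_items parameters, the per-user plus per-item weight model, and the random choice of which kept rows go to the test set are all assumptions made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Hypothetical (user, item, response) row; the documented function
// returns a pair of sframes instead.
struct rating {
  std::size_t user;
  std::size_t item;
  double response;
};

std::pair<std::vector<rating>, std::vector<rating>> generate_ranking_data(
    std::size_t n_users, std::size_t n_items,
    std::size_t n_train_samples_per_user,
    std::size_t n_test_samples_per_user,
    std::size_t random_seed, double noise_sd) {
  const std::size_t n_keep = n_train_samples_per_user + n_test_samples_per_user;
  assert(n_keep <= n_items);

  std::mt19937_64 rng(random_seed);
  std::normal_distribution<double> unit_normal(0.0, 1.0);

  // Assumed linear model: one weight per user and one per item.
  std::vector<double> user_w(n_users), item_w(n_items);
  for (double& w : user_w) w = unit_normal(rng);
  for (double& w : item_w) w = unit_normal(rng);

  const auto keep = static_cast<std::ptrdiff_t>(n_keep);
  const auto n_train = static_cast<std::ptrdiff_t>(n_train_samples_per_user);

  std::vector<rating> train, test;
  for (std::size_t u = 0; u < n_users; ++u) {
    // Score every item for this user.
    std::vector<rating> scored;
    scored.reserve(n_items);
    for (std::size_t i = 0; i < n_items; ++i) {
      double r = user_w[u] + item_w[i] + noise_sd * unit_normal(rng);
      scored.push_back({u, i, r});
    }

    // Keep only the n_keep highest responses: these are "observed".
    std::partial_sort(scored.begin(), scored.begin() + keep, scored.end(),
                      [](const rating& a, const rating& b) {
                        return a.response > b.response;
                      });
    scored.erase(scored.begin() + keep, scored.end());

    // Shuffle so the test rows are a random portion of the kept rows.
    std::shuffle(scored.begin(), scored.end(), rng);
    train.insert(train.end(), scored.begin(), scored.begin() + n_train);
    test.insert(test.end(), scored.begin() + n_train, scored.end());
  }
  return {std::move(train), std::move(test)};
}
```

Each user contributes exactly n_train_samples_per_user training rows and n_test_samples_per_user test rows, and no (user, item) pair appears in both sets, so a ranking model trained on the first set can be evaluated on held-out items from the second.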