turicreate.recommender.factorization_recommender.FactorizationRecommender.evaluate
FactorizationRecommender.evaluate(dataset, metric='auto', exclude_known_for_precision_recall=True, target=None, verbose=True, **kwargs)
Evaluate the model’s ability to make rating predictions or recommendations.
If the model is trained to predict a particular target, the default metric used for model comparison is root-mean-squared error (RMSE). Suppose \(y\) and \(\widehat{y}\) are vectors of length \(N\), where \(y\) contains the actual ratings and \(\widehat{y}\) the predicted ratings. Then the RMSE is defined as
\[RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^N (\widehat{y}_i - y_i)^2} .\]
If the model was not trained on a target column, the default metrics for model comparison are precision and recall. Let \(p_k\) be a vector of the \(k\) highest ranked recommendations for a particular user, and let \(a\) be the set of items for that user in the groundtruth dataset. The “precision at cutoff k” is defined as
\[P(k) = \frac{ | a \cap p_k | }{k}\]
while “recall at cutoff k” is defined as
\[R(k) = \frac{ | a \cap p_k | }{|a|}\]
Parameters:
- dataset : SFrame
An SFrame that is in the same format as provided for training.
- metric : str, {‘auto’, ‘rmse’, ‘precision_recall’}, optional
Metric to use for evaluation. The default automatically chooses ‘rmse’ for models trained with a target, and ‘precision_recall’ otherwise.
- exclude_known_for_precision_recall : bool, optional
A useful option for evaluating precision-recall. Recommender models have the option to exclude items seen in the training data from the final recommendation list. Set this option to True when evaluating on test data, and False when evaluating precision-recall on training data.
- target : str, optional
The name of the target column for evaluating RMSE. If the model was trained with a target column, the default is to use that same column. If the model was trained without a target column and metric is set to ‘rmse’, this option must be provided by the user.
- verbose : bool, optional
Enables verbose output. Default is True.
- **kwargs
When metric is set to ‘precision_recall’, these parameters are passed on to evaluate_precision_recall().
Returns:
- out : SFrame or dict
Results from the model evaluation procedure. If the model is trained on a target (i.e. RMSE is the evaluation criterion), a dictionary with three items is returned: items rmse_by_user and rmse_by_item are SFrames with per-user and per-item RMSE, while rmse_overall is the overall RMSE (a float). If the model is trained without a target (i.e. precision and recall are the evaluation criteria), an SFrame is returned with both of these metrics for each user at several cutoff values.
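To make the shape of the RMSE result concrete, here is a minimal pure-Python sketch that computes an overall, per-user, and per-item RMSE from a handful of made-up ratings. It is not Turi Create's implementation: the helper names `rmse` and `rmse_report` and the sample rows are hypothetical, and plain dictionaries stand in for the SFrames that the real method returns.

```python
import math
from collections import defaultdict


def rmse(pairs):
    """Root-mean-squared error over (actual, predicted) pairs."""
    pairs = list(pairs)
    return math.sqrt(sum((pred - act) ** 2 for act, pred in pairs) / len(pairs))


def rmse_report(rows):
    """Group (user, item, actual, predicted) rows and compute RMSE per group.

    Hypothetical helper: the keys mirror the dictionary described above,
    but the values are plain dicts and a float, not SFrames.
    """
    by_user = defaultdict(list)
    by_item = defaultdict(list)
    all_pairs = []
    for user, item, actual, predicted in rows:
        by_user[user].append((actual, predicted))
        by_item[item].append((actual, predicted))
        all_pairs.append((actual, predicted))
    return {
        "rmse_by_user": {u: rmse(p) for u, p in by_user.items()},
        "rmse_by_item": {i: rmse(p) for i, p in by_item.items()},
        "rmse_overall": rmse(all_pairs),
    }


# Made-up ratings: (user, item, actual rating, predicted rating).
rows = [
    ("u1", "a", 4.0, 3.0),
    ("u1", "b", 2.0, 2.0),
    ("u2", "a", 5.0, 3.0),
]
report = rmse_report(rows)
print(report["rmse_overall"])
print(report["rmse_by_user"])
print(report["rmse_by_item"])
```

Grouping before averaging is what distinguishes the per-user and per-item figures from the overall one: a user with a single badly predicted rating can have a high individual RMSE while barely moving the overall value.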
See also
evaluate_precision_recall, evaluate_rmse, precision_recall_by_user
Examples
>>> import turicreate as tc
>>> sf = tc.SFrame('https://static.turi.com/datasets/audioscrobbler')
>>> train, test = tc.recommender.util.random_split_by_user(sf)
>>> m = tc.recommender.create(train, target='target')
>>> eval = m.evaluate(test)
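The precision and recall definitions given earlier can also be checked by hand. The following is a small sketch in plain Python, independent of Turi Create; the function name `precision_recall_at_k`, the sample item lists, and the choice to report a recall of 0.0 when a user has no ground-truth items are all assumptions made for illustration.

```python
def precision_recall_at_k(ranked, actual, k):
    """Return (P(k), R(k)) for one user.

    ranked: recommendations ordered from best to worst.
    actual: the user's items in the ground-truth data.
    """
    top_k = ranked[:k]                       # p_k: the k highest ranked items
    hits = len(set(top_k) & set(actual))     # |a intersect p_k|
    precision = hits / k
    # Assumption: recall is reported as 0.0 when the user has no actual items.
    recall = hits / len(actual) if actual else 0.0
    return precision, recall


# Made-up data for a single user.
ranked = ["a", "b", "c", "d", "e"]
actual = {"b", "e", "z"}

for k in (1, 3, 5):
    p, r = precision_recall_at_k(ranked, actual, k)
    print(f"k={k}: precision={p:.3f}, recall={r:.3f}")
```

As the cutoff grows, recall can only stay the same or increase, while precision may rise or fall depending on where the relevant items sit in the ranking.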