turicreate.recommender.item_similarity_recommender.ItemSimilarityRecommender

class turicreate.recommender.item_similarity_recommender.ItemSimilarityRecommender(model_proxy)

A model that ranks an item according to its similarity to other items observed for the user in question.

Creating an ItemSimilarityRecommender

This model cannot be constructed directly. Instead, use turicreate.recommender.item_similarity_recommender.create() to create an instance of this model. Detailed parameter options and code samples are available in the documentation for the create function.

See also

create

Notes

Model Definition

This model first computes the similarity between items using the observations of users who have interacted with both items. Given a similarity between items \(i\) and \(j\), \(S(i,j)\), it scores an item \(j\) for user \(u\) using a weighted average of the user’s previous observations \(I_u\).

There are three choices of similarity metrics to use: ‘jaccard’, ‘cosine’ and ‘pearson’.

Jaccard similarity measures the similarity between two sets of elements. In the context of recommendation, the Jaccard similarity between two items is computed as

\[\mbox{JS}(i,j) = \frac{|U_i \cap U_j|}{|U_i \cup U_j|}\]

where \(U_{i}\) is the set of users who rated item \(i\). Jaccard is a good choice when one only has implicit feedback on items (e.g., whether people rated them or not), or when one does not care about how many stars items received.
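As an illustration of the Jaccard formula, the following is a minimal pure-Python sketch, not Turi Create's implementation. The function name `jaccard_similarity`, the toy data, and the convention of returning 0 when both user sets are empty are assumptions made for this example.

```python
def jaccard_similarity(users_i, users_j):
    """Jaccard similarity |U_i & U_j| / |U_i | U_j| between two user sets."""
    users_i, users_j = set(users_i), set(users_j)
    union = users_i | users_j
    if not union:
        # Assumption: treat the similarity as 0 when neither item has users.
        return 0.0
    return len(users_i & users_j) / len(union)


# Hypothetical toy data: the users who interacted with each item.
users_by_item = {
    "a": {"u1", "u2", "u3"},
    "b": {"u2", "u3", "u4"},
}

# Two shared users out of four distinct users.
js_ab = jaccard_similarity(users_by_item["a"], users_by_item["b"])
```

Only set membership matters here, which is why the rating values never enter the computation.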

If one needs to compare the ratings of items, Cosine and Pearson similarity are recommended.

The Cosine similarity between two items is computed as

\[\mbox{CS}(i,j) = \frac{\sum_{u\in U_{ij}} r_{ui}r_{uj}} {\sqrt{\sum_{u\in U_{i}} r_{ui}^2} \sqrt{\sum_{u\in U_{j}} r_{uj}^2}}\]

where \(U_{i}\) is the set of users who rated item \(i\), and \(U_{ij}\) is the set of users who rated both items \(i\) and \(j\). A drawback of Cosine similarity is that it does not account for differences in the mean and variance of the ratings given to items \(i\) and \(j\).
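The Cosine formula can be sketched in plain Python as follows. This is an illustrative example rather than the library's code: the function name `cosine_similarity`, the dictionary layout (user id mapped to rating), and the zero-norm fallback are assumptions.

```python
import math


def cosine_similarity(ratings_i, ratings_j):
    """Cosine similarity between two items.

    ratings_i and ratings_j map user id -> rating for items i and j.
    The numerator sums over users who rated both items (U_ij); each
    norm sums over all users who rated that item (U_i or U_j).
    """
    common = ratings_i.keys() & ratings_j.keys()
    numerator = sum(ratings_i[u] * ratings_j[u] for u in common)
    norm_i = math.sqrt(sum(r * r for r in ratings_i.values()))
    norm_j = math.sqrt(sum(r * r for r in ratings_j.values()))
    if norm_i == 0 or norm_j == 0:
        # Assumption: return 0 when an item has no non-zero ratings.
        return 0.0
    return numerator / (norm_i * norm_j)


# Hypothetical toy ratings; only user "u2" rated both items.
ratings_a = {"u1": 3.0, "u2": 4.0}
ratings_b = {"u2": 4.0, "u3": 3.0}
cs_ab = cosine_similarity(ratings_a, ratings_b)
```

In this toy case the numerator is 16 and each norm is 5, so the similarity is 16/25.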

Another popular measure that compares ratings after removing the effects of mean and variance is Pearson Correlation similarity, where \(\bar{r}_i\) denotes the mean rating of item \(i\):

\[\mbox{PS}(i,j) = \frac{\sum_{u\in U_{ij}} (r_{ui} - \bar{r}_i) (r_{uj} - \bar{r}_j)} {\sqrt{\sum_{u\in U_{ij}} (r_{ui} - \bar{r}_i)^2} \sqrt{\sum_{u\in U_{ij}} (r_{uj} - \bar{r}_j)^2}}\]
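A small Python sketch of the Pearson formula is given below. It is an illustration under stated assumptions, not the library's implementation: the item mean is taken over all users who rated the item, the sums run over users who rated both items, and the function returns 0 when a denominator term vanishes. The name `pearson_similarity` is hypothetical.

```python
import math


def pearson_similarity(ratings_i, ratings_j):
    """Pearson correlation similarity between two items.

    ratings_i and ratings_j map user id -> rating for items i and j.
    """
    if not ratings_i or not ratings_j:
        return 0.0
    # Assumption: each item mean is computed over all of its raters.
    mean_i = sum(ratings_i.values()) / len(ratings_i)
    mean_j = sum(ratings_j.values()) / len(ratings_j)
    common = ratings_i.keys() & ratings_j.keys()
    numerator = sum(
        (ratings_i[u] - mean_i) * (ratings_j[u] - mean_j) for u in common
    )
    dev_i = math.sqrt(sum((ratings_i[u] - mean_i) ** 2 for u in common))
    dev_j = math.sqrt(sum((ratings_j[u] - mean_j) ** 2 for u in common))
    if dev_i == 0 or dev_j == 0:
        # Assumption: similarity is 0 when there is no variation to compare.
        return 0.0
    return numerator / (dev_i * dev_j)


# Hypothetical toy ratings: item "b" is item "a" scaled by 2, and
# item "c" is item "a" shifted by 10.
ratings_a = {"u1": 1.0, "u2": 2.0, "u3": 3.0}
ratings_b = {"u1": 2.0, "u2": 4.0, "u3": 6.0}
ratings_c = {"u1": 11.0, "u2": 12.0, "u3": 13.0}
ps_ab = pearson_similarity(ratings_a, ratings_b)
ps_ac = pearson_similarity(ratings_a, ratings_c)
```

Both pairs come out perfectly correlated, which shows how Pearson similarity ignores differences in the scale and offset of the ratings.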

Item predictions depend on whether a target is specified. Writing \(\mbox{SIM}(i,j)\) for the chosen similarity, when the target is absent, a prediction for item \(j\) is made via

\[y_{uj} = \frac{\sum_{i \in I_u} \mbox{SIM}(i,j) }{|I_u|}\]
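The no-target score is simply the mean similarity between the candidate item and the items the user has already seen. A minimal sketch, assuming a precomputed similarity table (the names `SIMILARITY`, `sim`, and `score_without_target` and the toy values are hypothetical):

```python
# Hypothetical precomputed similarities, keyed by unordered item pair.
SIMILARITY = {
    frozenset(("a", "c")): 0.5,
    frozenset(("b", "c")): 0.25,
}


def sim(i, j):
    """Look up SIM(i, j); unknown pairs are assumed to have similarity 0."""
    return SIMILARITY.get(frozenset((i, j)), 0.0)


def score_without_target(user_items, candidate):
    """Average similarity of the candidate to the user's observed items I_u."""
    if not user_items:
        return 0.0  # Assumption: no history gives a score of 0.
    return sum(sim(i, candidate) for i in user_items) / len(user_items)


# The user has interacted with items "a" and "b"; score item "c".
y_uc = score_without_target(["a", "b"], "c")
```

Here the score is (0.5 + 0.25) / 2.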

Otherwise, predictions for jaccard and cosine similarities are made via

\[y_{uj} = \frac{\sum_{i \in I_u} \mbox{SIM}(i,j) r_{ui} }{\sum_{i \in I_u} \mbox{SIM}(i,j)}\]
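With a target, the prediction becomes a similarity-weighted average of the user's own ratings. The sketch below is illustrative only; the similarity table, the helper names, and the zero-denominator fallback are assumptions rather than documented library behavior.

```python
# Hypothetical precomputed similarities, keyed by unordered item pair.
SIMILARITY = {
    frozenset(("a", "c")): 0.5,
    frozenset(("b", "c")): 0.25,
}


def sim(i, j):
    """Look up SIM(i, j); unknown pairs are assumed to have similarity 0."""
    return SIMILARITY.get(frozenset((i, j)), 0.0)


def score_with_target(user_ratings, candidate):
    """Similarity-weighted average of the user's ratings r_ui over I_u.

    user_ratings maps item id -> the user's rating of that item.
    """
    numerator = sum(sim(i, candidate) * r for i, r in user_ratings.items())
    denominator = sum(sim(i, candidate) for i in user_ratings)
    if denominator == 0:
        return 0.0  # Assumption: no similar items gives a score of 0.
    return numerator / denominator


# The user rated item "a" 4 stars and item "b" 2 stars; predict item "c".
y_uc = score_with_target({"a": 4.0, "b": 2.0}, "c")
```

Because item "a" is twice as similar to "c" as item "b" is, the prediction (2.5 / 0.75, about 3.33) lands closer to the rating of "a".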

Predictions for pearson similarity are made via

\[y_{uj} = \bar{r}_j + \frac{\sum_{i \in I_u} \mbox{SIM}(i,j) (r_{ui} - \bar{r}_i) }{\sum_{i \in I_u} \mbox{SIM}(i,j)}\]
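The Pearson prediction applies the same weighting to mean-centered ratings and then adds back the candidate item's mean. A minimal sketch following the formula as written above; the item means, similarity values, and function names are hypothetical toy inputs, not library output.

```python
# Hypothetical precomputed similarities and per-item mean ratings.
SIMILARITY = {
    frozenset(("a", "c")): 0.5,
    frozenset(("b", "c")): 0.25,
}
ITEM_MEAN = {"a": 3.0, "b": 3.0, "c": 4.0}


def sim(i, j):
    """Look up SIM(i, j); unknown pairs are assumed to have similarity 0."""
    return SIMILARITY.get(frozenset((i, j)), 0.0)


def pearson_score(user_ratings, candidate):
    """Candidate mean plus the weighted average of mean-centered ratings."""
    numerator = sum(
        sim(i, candidate) * (r - ITEM_MEAN[i]) for i, r in user_ratings.items()
    )
    denominator = sum(sim(i, candidate) for i in user_ratings)
    if denominator == 0:
        # Assumption: fall back to the candidate's mean rating.
        return ITEM_MEAN[candidate]
    return ITEM_MEAN[candidate] + numerator / denominator


# The user rated "a" one star above its mean and "b" one star below.
y_uc = pearson_score({"a": 4.0, "b": 2.0}, "c")
```

The weighted deviation is (0.5 - 0.25) / 0.75 = 1/3, so the prediction sits one third of a star above the mean rating of item "c".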

For more details of item similarity methods, please see, e.g., Chapter 4 of [Ricci_et_al].

References

[Ricci_et_al] Francesco Ricci, Lior Rokach, and Bracha Shapira. Introduction to recommender systems handbook. Springer US, 2011.

Methods

ItemSimilarityRecommender.evaluate(dataset) Evaluate the model’s ability to make rating predictions or recommendations.
ItemSimilarityRecommender.evaluate_precision_recall(dataset) Compute a model’s precision and recall scores for a particular dataset.
ItemSimilarityRecommender.evaluate_rmse(…) Evaluate the prediction error for each user-item pair in the given data set.
ItemSimilarityRecommender.export_coreml(filename) Export the model in Core ML format.
ItemSimilarityRecommender.get_num_items_per_user() Get the number of items observed for each user.
ItemSimilarityRecommender.get_num_users_per_item() Get the number of users observed for each item.
ItemSimilarityRecommender.get_similar_items([…]) Get the k most similar items for each item in items.
ItemSimilarityRecommender.get_similar_users([…]) Get the k most similar users for each entry in users.
ItemSimilarityRecommender.predict(dataset[, …]) Return a score prediction for the user ids and item ids in the provided data set.
ItemSimilarityRecommender.recommend([users, …]) Recommend the k highest scored items for each user.
ItemSimilarityRecommender.recommend_from_interactions(…) Recommend the k highest scored items based on the
ItemSimilarityRecommender.save(location) Save the model.
ItemSimilarityRecommender.summary([output]) Print a summary of the model.