turicreate.recommender.item_content_recommender.create

turicreate.recommender.item_content_recommender.create(item_data, item_id, observation_data=None, user_id=None, target=None, weights='auto', similarity_metrics='auto', item_data_transform='auto', max_item_neighborhood_size=64, verbose=True)

Create a content-based recommender model in which the similarity between the items recommended is determined by the content of those items rather than learned from user interaction data.

The similarity score between two items is calculated by first computing the similarity between the item data for each column, then taking a weighted average of the per-column similarities to get the final similarity. The recommendations are generated according to the average similarity of a candidate item to all the items in a user’s set of rated items.
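The scoring described above can be sketched in plain Python. This is an illustrative sketch only, not the library's implementation: it assumes cosine similarity for every column (an assumption that happens to be consistent with the scores shown under Examples), and the helper names `cosine`, `item_similarity`, and `score_candidates` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def item_similarity(a, b, weights):
    """Weighted average of the per-column similarities of items a and b."""
    total = sum(weights.values())
    return sum(w * cosine(a[col], b[col]) for col, w in weights.items()) / total

def score_candidates(items, interacted, weights):
    """Average similarity of each unseen item to the interacted items, best first."""
    scores = {}
    for cand, row in items.items():
        if cand in interacted:
            continue  # items the user already has are not recommended
        sims = [item_similarity(row, items[i], weights) for i in interacted]
        scores[cand] = sum(sims) / len(sims)
    return sorted(scores.items(), key=lambda kv: -kv[1])

# Same toy data as in the Examples section, keyed by item ID.
items = {
    0: {"data_1": [1, 0], "data_2": [0, 1]},
    1: {"data_1": [1, 0], "data_2": [1, 0]},
    2: {"data_1": [0, 1], "data_2": [0, 1]},
    3: {"data_1": [0.5, 0.5], "data_2": [0.5, 0.5]},
}
equal_weights = {"data_1": 1.0, "data_2": 1.0}  # all columns weighted equally

print(score_candidates(items, [0], equal_weights))
print(score_candidates(items, [0, 1], equal_weights))
```

With equal weights, item 3 ranks first for both queries, and item 2 drops to 0.25 once item 1 joins the interaction set, because item 2 shares no content with item 1.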

Parameters:
item_data : SFrame

An SFrame giving the content of the items to use to learn the structure of similar items. The SFrame must have one column that matches the name of the item_id; this gives a unique identifier that can then be used to make recommendations. The rest of the columns are then used in the distance calculations below.

item_id : string

The name of the column in item_data (and observation_data, if given) that represents the item ID.

observation_data : SFrame (optional)

An SFrame giving user and item interaction data. This information is stored in the model, and the recommender will recommend the items with the most similar content to the items that were present and/or highly rated for that user.

user_id : string (optional)

If observation_data is given, then this specifies the column name corresponding to the user identifier.

target : string (optional)

If observation_data is given, then this specifies the column name corresponding to the target or rating.

weights : dict or ‘auto’ (optional)

If given, weights must be a dictionary mapping column names present in item_data to the weight assigned to each column. If ‘auto’ is given, then all columns are weighted equally.

max_item_neighborhood_size : int (optional, default 64)

For each item, the model keeps this many of its most similar items for use when generating predictions. Decreasing this value reduces the memory required by the model and the time required to generate recommendations, but it may also reduce recommendation accuracy.

verbose : True or False (optional)

If set to False, then less information is printed.
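The trade-off behind max_item_neighborhood_size can be pictured with a small sketch. It assumes the neighborhood is simply the k highest-similarity items for each item; the function `truncate_neighborhoods` and the similarity values are hypothetical and say nothing about how the library stores its neighbors internally.

```python
import heapq

def truncate_neighborhoods(similarity, k):
    """Keep only the k most similar neighbors for each item."""
    return {
        item: dict(heapq.nlargest(k, neighbors.items(), key=lambda kv: kv[1]))
        for item, neighbors in similarity.items()
    }

# Made-up pairwise similarities for four items.
similarity = {
    "a": {"b": 0.9, "c": 0.4, "d": 0.1},
    "b": {"a": 0.9, "c": 0.2, "d": 0.3},
    "c": {"a": 0.4, "b": 0.2, "d": 0.8},
    "d": {"a": 0.1, "b": 0.3, "c": 0.8},
}

top2 = truncate_neighborhoods(similarity, 2)
print(top2["a"])
```

A smaller k stores fewer pairs per item, which saves memory and lookup time, but weak similarities such as the one between "a" and "d" no longer contribute to any score.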

Examples

>>> import turicreate as tc
>>> item_data = tc.SFrame({"my_item_id" : range(4),
...                        "data_1" : [ [1, 0], [1, 0], [0, 1], [0.5, 0.5] ],
...                        "data_2" : [ [0, 1], [1, 0], [0, 1], [0.5, 0.5] ] })
>>> m = tc.recommender.item_content_recommender.create(item_data, "my_item_id")
>>> m.recommend_from_interactions([0])
Columns:
    my_item_id  int
    score       float
    rank        int

Rows: 3

Data:
+------------+----------------+------+
| my_item_id |     score      | rank |
+------------+----------------+------+
|     3      | 0.707106769085 |  1   |
|     1      |      0.5       |  2   |
|     2      |      0.5       |  3   |
+------------+----------------+------+
[3 rows x 3 columns]

>>> m.recommend_from_interactions([0, 1])
Columns:
    my_item_id  int
    score       float
    rank        int

Rows: 2

Data:
+------------+----------------+------+
| my_item_id |     score      | rank |
+------------+----------------+------+
|     3      | 0.707106769085 |  1   |
|     2      |      0.25      |  2   |
+------------+----------------+------+
[2 rows x 3 columns]