TopicModel.evaluate(train_data, test_data=None, metric='perplexity')

Estimate the model’s ability to predict new data. Imagine you have a corpus of books. A common approach to evaluating topic models is to train on the first half of each book and then measure how well the model predicts the words in the second half.
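For intuition only, such a first-half/second-half split might look like the sketch below. The helper name and list-of-words document format are illustrative, not part of this API; see random_split() for the toolkit's own (randomized) splitting helper.

```python
def split_halves(doc_words):
    """Split one document (a list of words) into two halves:
    the first half for training, the second for evaluation."""
    mid = len(doc_words) // 2
    return doc_words[:mid], doc_words[mid:]

books = [["the", "cat", "sat", "on", "the", "mat"],
         ["topic", "models", "learn", "word", "distributions"]]

# One (train, test) half per book, so both sets have the same
# number of documents -- matching what evaluate() expects.
train_halves, test_halves = zip(*(split_halves(b) for b in books))
print(train_halves[0])  # ['the', 'cat', 'sat']
```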

This method returns a metric called perplexity, which is a function of the likelihood of observing the held-out words under the given model; lower values indicate a better fit. See perplexity() for more details.
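Perplexity is a standard quantity rather than something specific to this toolkit. As a rough illustration (the function name and uniform-model example below are hypothetical, not drawn from this API), it is the exponential of the negative average per-word log-likelihood:

```python
import math

def word_perplexity(log_probs):
    """Perplexity from per-word log-likelihoods: exp(-mean(log p)).

    Lower perplexity means the model assigns higher probability
    to the held-out words.
    """
    return math.exp(-sum(log_probs) / len(log_probs))

# Sanity check: a model assigning uniform probability 1/V to each
# of V vocabulary words has perplexity exactly V.
V = 1000
uniform = [math.log(1.0 / V)] * 50
print(word_perplexity(uniform))  # ~1000.0
```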

The provided train_data and test_data must have the same length, i.e., both data sets must contain the same number of documents. The model uses train_data to estimate each document's topic proportions, and these estimates are then used to measure how well the model predicts the unseen words in test_data.

See predict() for details on how these predictions are made, and see random_split() for a helper function that can be used for making train/test splits.

train_data : SArray or SFrame

A set of documents to predict topics for.

test_data : SArray or SFrame, optional

A set of documents to evaluate performance on. By default this is set to be the same as train_data.

metric : str

The chosen metric to use for evaluating the topic model. Currently only ‘perplexity’ is supported.

out : dict

The set of estimated evaluation metrics.

See also

predict, turicreate.toolkits.text_analytics.random_split


>>> docs = turicreate.SArray('https://static.turi.com/datasets/nips-text')
>>> train_data, test_data = turicreate.text_analytics.random_split(docs)
>>> m = turicreate.topic_model.create(train_data)
>>> m.evaluate(train_data, test_data)
{'perplexity': 2467.530370396021}