turicreate.nearest_neighbor_classifier.NearestNeighborClassifier.evaluate

NearestNeighborClassifier.evaluate(dataset, metric='auto', max_neighbors=10, radius=None)

Evaluate the model’s predictive accuracy. This is done by predicting the target class for each instance in a new dataset and comparing the predictions to the known target values.

Parameters:
dataset : SFrame

Dataset of new observations. Must include columns with the same names as the target and features used for model training. Additional columns are ignored.

metric : str, optional

Name of the evaluation metric. Possible values are:

  • ‘auto’: Returns all available metrics.
  • ‘accuracy’: Classification accuracy.
  • ‘confusion_matrix’: An SFrame with counts of possible prediction/true label combinations.
  • ‘roc_curve’: An SFrame containing the information needed for an ROC curve (binary classification only).

max_neighbors : int, optional

Maximum number of neighbors to consider for each point.

radius : float, optional

Maximum distance from each point to a neighbor in the reference dataset.

Returns:
out : dict

Evaluation results. The dictionary keys are 'accuracy', 'confusion_matrix', and 'roc_curve' (binary classification only).

Notes

  • Because the model randomly breaks ties between predicted classes, the results of repeated calls to the evaluate method may differ.

Examples

>>> sf_train = turicreate.SFrame({'species': ['cat', 'dog', 'fossa', 'dog'],
...                             'height': [9, 25, 20, 23],
...                             'weight': [13, 28, 33, 22]})
>>> m = turicreate.nearest_neighbor_classifier.create(sf_train, target='species')
>>> ans = m.evaluate(sf_train, max_neighbors=2,
...                  metric='confusion_matrix')
>>> print(ans['confusion_matrix'])
+--------------+-----------------+-------+
| target_label | predicted_label | count |
+--------------+-----------------+-------+
|     cat      |       dog       |   1   |
|     dog      |       dog       |   2   |
|    fossa     |       dog       |   1   |
+--------------+-----------------+-------+
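
The sketch below continues the toy dataset above. It leaves metric at its default of 'auto', which returns all available metrics, and reads the individual results out of the returned dictionary using the keys described under Returns. The exact accuracy depends on how ties are broken, so no output is shown; the radius value in the last call is an arbitrary illustration.

>>> results = m.evaluate(sf_train, max_neighbors=2)   # metric='auto' returns all metrics
>>> acc = results['accuracy']                         # overall classification accuracy
>>> cm = results['confusion_matrix']                  # SFrame of target/predicted label counts
>>> capped = m.evaluate(sf_train, max_neighbors=2,    # only neighbors within distance 15
...                     radius=15.0)                  # of a query point are considered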