turicreate.nearest_neighbor_classifier.NearestNeighborClassifier.predict_topk
NearestNeighborClassifier.predict_topk(dataset, max_neighbors=10, radius=None, k=3, verbose=False)
Return top-k most likely predictions for each observation in dataset. Predictions are returned as an SFrame with three columns: row_id, class, and probability.
Parameters: - dataset : SFrame
Dataset of new observations. Must include the features used for model training, but does not require a target column. Additional columns are ignored.
- max_neighbors : int, optional
Maximum number of neighbors to consider for each point.
- radius : float, optional
Maximum distance from each point to a neighbor in the reference dataset.
- k : int, optional
Number of classes to return for each input example.
Returns: - out : SFrame
Notes
- If the ‘radius’ parameter is small, it is possible that a query point has no neighbors in the training dataset. In this case, the query is dropped from the SFrame output by this method. If all queries have no neighbors, then the result is an empty SFrame. If the target column in the training dataset has missing values, these predictions will be ambiguous.
- Ties between predicted classes are broken randomly.
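Because queries with no neighbors are silently dropped, it can help to check which inputs are missing from the result. The following is a minimal sketch in plain Python, not part of Turi Create. It assumes that row_id is the zero-based position of each query in dataset (consistent with the example on this page, but not stated explicitly), and the helper name dropped_row_ids is hypothetical.

```python
def dropped_row_ids(num_queries, predicted_row_ids):
    """Return query positions that have no rows in the prediction output.

    Assumes row_id is the zero-based position of the query in the input.
    """
    seen = set(predicted_row_ids)
    return [i for i in range(num_queries) if i not in seen]


# Hypothetical output for 4 queries: row_id repeats once per returned class,
# and queries 1 and 3 found no neighbors within the radius.
row_ids = [0, 0, 2]
print(dropped_row_ids(4, row_ids))
```

With a real model, the two arguments would presumably be len(dataset) and list(out['row_id']), where out is the SFrame returned by predict_topk.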
Examples
>>> import turicreate
>>> sf_train = turicreate.SFrame({'species': ['cat', 'dog', 'fossa', 'dog'],
...                               'height': [9, 25, 20, 23],
...                               'weight': [13, 28, 33, 22]})
>>> sf_new = turicreate.SFrame({'height': [26, 19],
...                             'weight': [25, 35]})
>>> m = turicreate.nearest_neighbor_classifier.create(sf_train, target='species')
>>> ystar = m.predict_topk(sf_new, max_neighbors=2)
>>> print(ystar)
+--------+-------+-------------+
| row_id | class | probability |
+--------+-------+-------------+
|   0    |  dog  |     1.0     |
|   1    | fossa |     0.5     |
|   1    |  dog  |     0.5     |
+--------+-------+-------------+
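The probabilities in the example above are consistent with a simple vote share: each class receives the fraction of the retained neighbors that carry its label. The sketch below reproduces that arithmetic in plain Python. It is an illustration under stated assumptions, not Turi Create's implementation: Euclidean distance is assumed, the function name predict_topk_sketch is hypothetical, and ties are kept in nearest-first order rather than broken randomly.

```python
import math


def predict_topk_sketch(train, labels, queries, max_neighbors=10, radius=None, k=3):
    """Illustrative top-k vote-share prediction (not the library's code)."""
    rows = []
    for row_id, query in enumerate(queries):
        # Distance from the query to every reference point, nearest first.
        # Euclidean distance is an assumption made for this sketch.
        dists = sorted(
            ((math.dist(query, x), y) for x, y in zip(train, labels)),
            key=lambda pair: pair[0],
        )
        if radius is not None:
            dists = [(d, y) for d, y in dists if d <= radius]
        neighbors = dists[:max_neighbors]
        if not neighbors:
            continue  # a query with no neighbors is dropped from the output

        # Count votes per class among the retained neighbors.
        counts = {}
        for _, label in neighbors:
            counts[label] = counts.get(label, 0) + 1

        # Stable sort: tied classes stay in nearest-first order here.
        ranked = sorted(counts.items(), key=lambda kv: -kv[1])
        for cls, votes in ranked[:k]:
            rows.append((row_id, cls, votes / len(neighbors)))
    return rows


# Same data as the example above, as (height, weight) pairs.
train = [(9, 13), (25, 28), (20, 33), (23, 22)]
labels = ['cat', 'dog', 'fossa', 'dog']
queries = [(26, 25), (19, 35)]

for row in predict_topk_sketch(train, labels, queries, max_neighbors=2):
    print(row)
```

Passing a small radius, such as radius=1.0, leaves both queries without neighbors, so the sketch returns an empty list, mirroring the empty SFrame described in the notes.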