turicreate.kmeans.KmeansModel.predict
KmeansModel.predict(dataset, output_type='cluster_id', verbose=True)
Return the predicted cluster label for each instance in the new ‘dataset’. K-means predictions are made by assigning each new instance to the closest cluster center.
Parameters:
- dataset : SFrame
Dataset of new observations. Must include the features used for model training; additional columns are ignored.
- output_type : {‘cluster_id’, ‘distance’}, optional
Form of the prediction. ‘cluster_id’ (the default) returns the cluster label assigned to each input instance, while ‘distance’ returns the Euclidean distance between the instance and its assigned cluster’s center.
- verbose : bool, optional
If True, print progress updates to the screen.
Returns:
- out : SArray
Model predictions. Depending on the specified output_type, either the assigned cluster label or the distance from each point to its closest cluster center. The predictions are in the same order as the input data rows.
Examples
>>> import turicreate
>>> sf = turicreate.SFrame({
...     'x1': [0.6777, -9.391, 7.0385, 2.2657, 7.7864, -10.16, -8.162,
...            8.8817, -9.525, -9.153, 2.0860, 7.6619, 6.5511, 2.7020],
...     'x2': [5.6110, 8.5139, 5.3913, 5.4743, 8.3606, 7.8843, 2.7305,
...            5.1679, 6.7231, 3.7051, 1.7682, 7.4608, 3.1270, 6.5624]})
>>> model = turicreate.kmeans.create(sf, num_clusters=3)
>>> sf_new = turicreate.SFrame({'x1': [-5.6584, -1.0167, -9.6181],
...                             'x2': [-6.3803, -3.7937, -1.1022]})
>>> clusters = model.predict(sf_new, output_type='cluster_id')
>>> print(clusters)
[1, 0, 1]
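The assignment rule described above (each instance goes to its closest cluster center, with ‘distance’ reporting the Euclidean distance to that center) can be illustrated without the library. The following is a minimal pure-Python sketch, not Turi Create's implementation: the function name `predict_sketch`, the plain-tuple inputs, and the two example centers are assumptions made for illustration only.

```python
import math


def predict_sketch(points, centers, output_type="cluster_id"):
    """Illustrative nearest-center assignment (hypothetical helper, not the library API)."""
    if output_type not in ("cluster_id", "distance"):
        raise ValueError("output_type must be 'cluster_id' or 'distance'")
    out = []
    for p in points:
        # Euclidean distance from this point to every center.
        dists = [
            math.sqrt(sum((a - b) ** 2 for a, b in zip(p, c)))
            for c in centers
        ]
        # Index of the closest center; ties go to the lowest index.
        best = min(range(len(centers)), key=dists.__getitem__)
        out.append(best if output_type == "cluster_id" else dists[best])
    return out


# Hypothetical cluster centers and new observations.
centers = [(0.0, 0.0), (10.0, 0.0)]
points = [(1.0, 0.0), (9.0, 0.0), (3.0, 4.0)]

labels = predict_sketch(points, centers)
distances = predict_sketch(points, centers, output_type="distance")
print(labels)     # [0, 1, 0]
print(distances)  # [1.0, 1.0, 5.0]
```

As in the method documented above, the results are returned in the same order as the input rows, so `labels[i]` and `distances[i]` both refer to `points[i]`.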