turicreate.image_similarity.ImageSimilarityModel.similarity_graph

ImageSimilarityModel.similarity_graph(k=5, radius=None, include_self_edges=False, output_type='SGraph', verbose=True)

Construct the similarity graph on the reference dataset, which is already stored in the model, to find the top k most similar images for each image in that dataset.

This is conceptually similar to running the query method with the reference set as the query data, but this method is optimized for that purpose, is syntactically simpler, and removes self-edges by default.

WARNING: This method can take a long time to run, particularly on large reference datasets.

Parameters:
k : int, optional

Maximum number of neighbors to return for each point in the dataset. Setting this to None deactivates the constraint, so that all neighbors within radius of a given point are returned.

radius : float, optional

For a given point, only neighbors within this distance are returned. The default is None, in which case the k nearest neighbors are returned for each query point, regardless of distance.

include_self_edges : bool, optional

For most distance functions, each point in the model’s reference dataset is its own nearest neighbor. If this parameter is False (the default), that self-match is dropped and the nearest neighbors are returned excluding the point itself.

output_type : {'SGraph', 'SFrame'}, optional

By default, the results are returned as an SGraph, where each point in the reference dataset is a vertex and an edge A -> B indicates that vertex B is a nearest neighbor of vertex A. If output_type is set to 'SFrame', the output has the same form as the results of the query method: an SFrame with columns for the query label, the reference label, the distance between the two points, and the rank of the neighbor. In this case the query data is the same as the reference data.

verbose : bool, optional

If True, print progress updates and model details.

Returns:
out : SFrame or SGraph

The type of the output object depends on the output_type parameter. See the parameter description for more detail.

Notes

  • If both k and radius are set to None, each data point is matched to the entire dataset. If the reference dataset has \(n\) rows, the output is an SFrame with \(n^2\) rows (or an SGraph with \(n^2\) edges).
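
The way k, radius, and include_self_edges combine can be illustrated with a small brute-force sketch in plain Python. This is not the library's implementation: the function name similarity_edges, the 2-D points standing in for image features, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math


def similarity_edges(points, k=5, radius=None, include_self_edges=False):
    """Brute-force illustration of the k / radius / include_self_edges options.

    `points` is a hypothetical list of coordinate tuples standing in for image
    feature vectors; Euclidean distance is an assumption of this sketch.
    Returns a list of (src, dst, distance, rank) tuples.
    """
    edges = []
    for src, p in enumerate(points):
        # Distances from p to every candidate, nearest first (ties by index).
        candidates = sorted(
            (math.dist(p, q), dst)
            for dst, q in enumerate(points)
            if include_self_edges or dst != src
        )
        if radius is not None:
            # Keep only neighbors within the given distance.
            candidates = [(d, dst) for d, dst in candidates if d <= radius]
        if k is not None:
            # Keep at most k neighbors per point.
            candidates = candidates[:k]
        for rank, (d, dst) in enumerate(candidates, start=1):
            edges.append((src, dst, d, rank))
    return edges


points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]

nearest = similarity_edges(points, k=1)
close = similarity_edges(points, k=None, radius=1.5)
everything = similarity_edges(points, k=None, radius=None, include_self_edges=True)
```

In this sketch, nearest holds one edge per point, close holds only the pairs within distance 1.5, and everything pairs each point with the whole dataset, self-matches included. The exact edge counts produced by the library may differ from the sketch.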

Examples

>>> graph = model.similarity_graph(k=1)  # an SGraph
>>>
>>> # Most similar image for each image in the input dataset
>>> graph.edges
+----------+----------+----------------+------+
| __src_id | __dst_id |    distance    | rank |
+----------+----------+----------------+------+
|    0     |    1     | 0.376430604494 |  1   |
|    2     |    1     | 0.55542776308  |  1   |
|    1     |    0     | 0.376430604494 |  1   |
+----------+----------+----------------+------+
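
A common next step is to turn the edge table into a lookup from each image to its most similar image. The following is a minimal sketch that assumes each edge row can be read as a dictionary keyed by column name; here a plain list of dicts copied from the table above stands in for graph.edges, and the helper name nearest_neighbors is hypothetical.

```python
# Stand-in for `graph.edges`: plain dicts copied from the table above.
edges = [
    {"__src_id": 0, "__dst_id": 1, "distance": 0.376430604494, "rank": 1},
    {"__src_id": 2, "__dst_id": 1, "distance": 0.55542776308, "rank": 1},
    {"__src_id": 1, "__dst_id": 0, "distance": 0.376430604494, "rank": 1},
]


def nearest_neighbors(edge_rows):
    """Map each source vertex to its rank-1 neighbor and the distance to it."""
    return {
        row["__src_id"]: (row["__dst_id"], row["distance"])
        for row in edge_rows
        if row["rank"] == 1
    }


nearest = nearest_neighbors(edges)
```

Filtering on rank keeps the helper usable when the graph was built with k greater than 1, since only the closest neighbor of each vertex is retained.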