ObjectDetector.evaluate(self, dataset, metric='auto', output_type='dict', confidence_threshold=0.001, iou_threshold=0.45)

Evaluate the model by making predictions and comparing these to ground truth bounding box annotations.

Parameters

dataset : SFrame

Dataset of new observations. Must include columns with the same names as the annotations and feature used for model training. Additional columns are ignored.

metric : str or list, optional

Name of the evaluation metric, or a list of several names. The primary metric is average precision, which is the area under the precision/recall curve and is reported as a value between 0 and 1 (1 being perfect). Possible values are:

  • 'auto' : Returns all primary metrics.

  • 'all' : Returns all available metrics.

  • 'average_precision_50' : Average precision per class with intersection-over-union threshold at 50% (PASCAL VOC metric).

  • 'average_precision' : Average precision per class calculated over multiple intersection-over-union thresholds (at 50%, 55%, …, 95%) and averaged.

  • 'mean_average_precision_50' : Mean over all classes (for 'average_precision_50'). This is the primary single-value metric.

  • 'mean_average_precision' : Mean over all classes (for 'average_precision').
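
As a rough illustration of the quantities named in this list, the sketch below computes intersection-over-union for two boxes and the area under a precision/recall curve for one class. It is not the library's implementation: the function names iou and average_precision, the corner-style (x_min, y_min, x_max, y_max) box layout, and the interpolation step are all assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes.

    Assumed layout (for this sketch only): (x_min, y_min, x_max, y_max).
    """
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(scored_hits, num_ground_truth):
    """Area under the precision/recall curve for a single class.

    scored_hits: (confidence, is_true_positive) pairs, one per prediction.
    num_ground_truth: number of annotated boxes for the class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank predictions from most to least confident.
    ranked = sorted(scored_hits, key=lambda pair: pair[0], reverse=True)
    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        true_positives += int(is_hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)
    # Assumed interpolation: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the rectangle areas wherever recall increases.
    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


# IoU thresholds 50%, 55%, ..., 95%, as listed for 'average_precision'.
thresholds = [t / 100 for t in range(50, 100, 5)]

# Two made-up boxes: how many thresholds would accept this prediction?
overlap = iou((10, 10, 50, 50), (15, 10, 55, 50))
passed = [t for t in thresholds if overlap >= t]

# Three made-up predictions for a class with two annotated boxes.
ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], num_ground_truth=2)
print('IoU: {:.3f}, thresholds passed: {}, AP: {:.3f}'.format(overlap, len(passed), ap))
```

In this sketch, whether a prediction counts as a true positive would be decided by comparing its IoU with one threshold. Using only the 0.5 threshold is in the spirit of the '_50' metrics, while repeating the calculation for all ten thresholds and averaging is in the spirit of 'average_precision'.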

Returns

out : dict / SFrame

The type of the returned object is determined by the output_type option, which defaults to 'dict'.

See also

create, predict

Examples


>>> results = model.evaluate(data, metric='mean_average_precision')
>>> print('mAP: {:.1%}'.format(results['mean_average_precision']))
mAP: 43.2%
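
The mean_* metrics are described as a mean over all classes of the corresponding per-class metric. A minimal sketch of that relation follows, using a hand-written dictionary with made-up class names and values. The layout of the real return value is not specified here; the nested metric-to-class mapping below is only an assumption for illustration.

```python
# Hypothetical per-class values; class names and numbers are invented.
per_class_results = {
    'average_precision_50': {'cat': 0.62, 'dog': 0.48, 'bird': 0.31},
}

# Mean over all classes of the per-class average precision.
per_class = per_class_results['average_precision_50']
mean_ap_50 = sum(per_class.values()) / len(per_class)
print('mAP@50: {:.1%}'.format(mean_ap_50))
```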