turicreate.evaluation.recall

turicreate.evaluation.recall(targets, predictions, average='macro')

Compute the recall score for classification tasks. The recall score quantifies the ability of a classifier to identify positive examples. Recall can be interpreted as the probability that a randomly selected positive example is correctly identified by the classifier. The score lies in the range [0, 1], where 0 is the worst and 1 is perfect.

The recall score is defined as the ratio: $\frac{tp}{tp + fn}$

where tp is the number of true positives and fn the number of false negatives.
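The ratio above can be sketched in plain Python, without turicreate. The helper name `binary_recall` and its `positive_label` argument are hypothetical and exist only for this illustration. Returning `None` when there are no positive targets is an assumption of this sketch, not documented library behavior.

```python
def binary_recall(targets, predictions, positive_label):
    """Recall for one positive label: tp / (tp + fn)."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")

    tp = 0  # positive targets that were predicted as positive
    fn = 0  # positive targets that were predicted as something else
    for target, prediction in zip(targets, predictions):
        if target != positive_label:
            continue  # negative targets do not enter the recall ratio
        if prediction == positive_label:
            tp += 1
        else:
            fn += 1

    if tp + fn == 0:
        return None  # assumption: recall is left undefined without positives
    return tp / (tp + fn)


# Four positive targets, three of them recovered: 3 / (3 + 1)
score = binary_recall([1, 0, 1, 1, 0, 1], [1, 0, 0, 1, 1, 1], positive_label=1)
print(score)  # 0.75
```

Note that the false positive at the fifth position does not change the score: recall only looks at examples whose target is the positive label.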

Parameters:

targets : SArray

Ground truth class labels. The SArray can be of any type.

predictions : SArray

The prediction that corresponds to each target value. This SArray must have the same length as targets and must be of the same type as the targets SArray.

average : string, [None, ‘macro’ (default), ‘micro’]

Metric averaging strategies for multiclass classification. Averaging strategies can be one of the following:

None: No averaging is performed and a single metric is returned for each class.

‘micro’: Calculate metrics globally by counting the total true positives, false negatives, and false positives.

‘macro’: Calculate metrics for each label and find their unweighted mean. This does not take label imbalance into account.
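The three averaging strategies can be illustrated with a minimal pure-Python sketch. The function names `recall_per_class`, `macro_recall`, and `micro_recall` are hypothetical and are not part of turicreate. The sketch also assumes that only labels appearing in the targets receive a score, since recall is undefined for a label with no true examples.

```python
from collections import Counter


def recall_per_class(targets, predictions):
    """Per-label recall, analogous to average=None."""
    support = Counter()  # tp + fn for each label (times it appears as target)
    tp = Counter()       # correct predictions for each label
    for target, prediction in zip(targets, predictions):
        support[target] += 1
        if target == prediction:
            tp[target] += 1
    return {label: tp[label] / support[label] for label in support}


def macro_recall(targets, predictions):
    """Unweighted mean of the per-label recalls."""
    scores = recall_per_class(targets, predictions)
    return sum(scores.values()) / len(scores)


def micro_recall(targets, predictions):
    """Recall from globally pooled true positives and false negatives."""
    total_tp = sum(1 for t, p in zip(targets, predictions) if t == p)
    return total_tp / len(targets)  # every example is a tp or fn for its label


# Imbalanced data: six examples of label 0, two of label 1.
targets = [0, 0, 0, 0, 0, 0, 1, 1]
predictions = [0, 0, 0, 0, 0, 0, 1, 0]

print(recall_per_class(targets, predictions))  # {0: 1.0, 1: 0.5}
print(macro_recall(targets, predictions))      # 0.75
print(micro_recall(targets, predictions))      # 0.875
```

The macro average weights both labels equally, so the weak recall on the rare label pulls it down to 0.75. The micro average pools all examples, so the frequent label dominates and the score is 0.875.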

Returns:

out : float or dict: For binary classification, the score for the positive class. For multiclass classification, the score averaged across classes according to the average parameter. If average=None, a dictionary is returned where each key is a class label and each value is the score for that class label.

See also

confusion_matrix, accuracy, precision, f1_score

Notes

For binary classification, when the target labels are of type “string”, the labels are sorted alphanumerically and the largest label is chosen as the “positive” label. For example, if the classifier labels are {“cat”, “dog”}, then “dog” is chosen as the positive label for the binary classification case.
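The selection rule in this note can be mimicked as follows. This is a sketch only: the helper `positive_label` is hypothetical, and it assumes that Python's default string ordering matches the alphanumeric sort the library applies.

```python
def positive_label(labels):
    """Pick the largest label after sorting the distinct labels."""
    distinct = sorted(set(labels))  # assumption: default string ordering
    return distinct[-1]


print(positive_label(["cat", "dog", "cat", "dog"]))  # dog
```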

Examples

# Targets and Predictions
>>> targets = turicreate.SArray([0, 1, 2, 3, 0, 1, 2, 3])
>>> predictions = turicreate.SArray([1, 0, 2, 1, 3, 1, 2, 1])

# Micro average of the recall scores for each class.
>>> turicreate.evaluation.recall(targets, predictions,
...                            average = 'micro')
0.375

# Macro average of the recall scores for each class.
>>> turicreate.evaluation.recall(targets, predictions,
...                            average = 'macro')
0.375

# Recall score for each class.
>>> turicreate.evaluation.recall(targets, predictions,
...                            average = None)
{0: 0.0, 1: 0.5, 2: 1.0, 3: 0.0}

This metric also works for string classes.

# Targets and Predictions
>>> targets = turicreate.SArray(
...      ["cat", "dog", "foosa", "snake", "cat", "dog", "foosa", "snake"])
>>> predictions = turicreate.SArray(
...      ["dog", "cat", "foosa", "dog", "snake", "dog", "cat", "dog"])

# Micro average of the recall scores for each class.
>>> turicreate.evaluation.recall(targets, predictions,
...                            average = 'micro')
0.25

# Macro average of the recall scores for each class.
>>> turicreate.evaluation.recall(targets, predictions,
...                            average = 'macro')
0.25

# Recall score for each class.
>>> turicreate.evaluation.recall(targets, predictions,
...                            average = None)
{'cat': 0.0, 'dog': 0.5, 'foosa': 0.5, 'snake': 0.0}