turicreate.evaluation.auc

turicreate.evaluation.auc(targets, predictions, average='macro', index_map=None)

Compute the area under the ROC curve for the given targets and predictions.

Parameters:
targets : SArray

An SArray containing the observed values. For binary classification, the alpha-numerically first category is considered the reference category.

predictions : SArray

Prediction probabilities that correspond to each target value. This must be of the same length as targets.

average : string, [None, ‘macro’ (default)]

Metric averaging strategies for multiclass classification. Averaging strategies can be one of the following:

  • None: No averaging is performed and a single metric is returned for each class.
  • ‘macro’: Calculate metrics for each label, and find their unweighted mean. This does not take label imbalance into account.
index_map : dict[int], [None (default)]

For binary classification, a dictionary mapping the two target labels to either 0 (negative) or 1 (positive). For multi-class classification, a dictionary mapping potential target labels to the associated index into the vectors in predictions (see the final example below).

Returns:
out : float (for binary classification) or dict[float]

Score for the positive class (for binary classification) or an average score for each class (for multi-class classification). If average=None, a dictionary is returned where the key is the class label and the value is the score for the corresponding class label.

Examples

>>> targets = turicreate.SArray([0, 1, 1, 0])
>>> predictions = turicreate.SArray([0.1, 0.35, 0.7, 0.99])

# Calculate the AUC score
>>> turicreate.evaluation.auc(targets, predictions)
0.5

This metric also works when the targets are strings (here “cat” is alpha-numerically first and is therefore the reference category).

>>> targets = turicreate.SArray(["cat", "dog", "dog", "cat"])
>>> predictions = turicreate.SArray([0.1, 0.35, 0.7, 0.99])

# Calculate the AUC score
>>> turicreate.evaluation.auc(targets, predictions)
0.5

In the multi-class setting, the AUC score can be averaged across classes.

# Targets and Predictions
>>> targets     = turicreate.SArray([ 1, 0, 2, 1])
>>> predictions = turicreate.SArray([[.1, .8, 0.1],
...                                [.9, .1, 0.0],
...                                [.8, .1, 0.1],
...                                [.3, .6, 0.1]])

# Macro average of the scores for each class.
>>> turicreate.evaluation.auc(targets, predictions, average = 'macro')
0.8888888888888888

# Scores for each class.
>>> turicreate.evaluation.auc(targets, predictions, average = None)
{0: 1.0, 1: 1.0, 2: 0.6666666666666666}
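Each per-class score above is effectively a one-vs-rest binary AUC. As a minimal sketch of that relationship (the binarized target and the class-0 probability column below are constructed by hand and are not part of this API), the class-0 score can be reproduced with the binary form of the metric:

# Binarize the targets for class 0 and extract the class-0 probabilities.
>>> class0_targets = (targets == 0)
>>> class0_scores = predictions.apply(lambda p: p[0])

# This should match the class-0 entry of the dictionary above.
>>> turicreate.evaluation.auc(class0_targets, class0_scores)
1.0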

This metric also works for string targets in the multi-class setting.

# Targets and Predictions
>>> targets     = turicreate.SArray([ "dog", "cat", "foosa", "dog"])
>>> predictions = turicreate.SArray([[.1, .8, 0.1],
...                                [.9, .1, 0.0],
...                                [.8, .1, 0.1],
...                                [.3, .6, 0.1]])

# Macro average (the default).
>>> turicreate.evaluation.auc(targets, predictions)
0.8888888888888888

# Score for each class.
>>> turicreate.evaluation.auc(targets, predictions, average=None)
{'cat': 1.0, 'dog': 1.0, 'foosa': 0.6666666666666666}
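
The index_map argument is not used in the examples above. As a minimal sketch, the mapping below simply makes the default alphabetical assignment of labels to prediction-vector indices explicit, so the per-class scores should be the same as in the previous example:

# Explicitly map each label to its column in the prediction vectors.
>>> index_map = {"cat": 0, "dog": 1, "foosa": 2}
>>> turicreate.evaluation.auc(targets, predictions, average=None,
...                           index_map=index_map)
{'cat': 1.0, 'dog': 1.0, 'foosa': 0.6666666666666666}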