turicreate.evaluation.log_loss

turicreate.evaluation.log_loss(targets, predictions, index_map=None)

Compute the logloss for the given targets and predicted probabilities. The logloss is the negative sum of the log probabilities assigned to the observed labels, divided by the number of observations:

\[\textrm{logloss} = - \frac{1}{N} \sum_{i \in 1,\ldots,N} (y_i \log(p_i) + (1-y_i)\log(1-p_i)) ,\]

where \(y_i\) is the \(i\)-th target value (0 or 1), \(p_i\) is the \(i\)-th predicted probability, and \(N\) is the number of observations.
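As a rough illustration of the binary formula above, the following pure-Python sketch evaluates it directly. The helper name binary_log_loss is hypothetical and is not part of turicreate. The sketch does no probability clipping, so it assumes every probability lies strictly between 0 and 1.

```python
import math

def binary_log_loss(targets, predictions):
    """Binary logloss for 0/1 targets and predicted P(target == 1).

    Hypothetical helper for illustration only; no clipping is applied.
    """
    total = 0.0
    for y, p in zip(targets, predictions):
        # Only one of the two terms is non-zero, because y is 0 or 1.
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(targets)

# Same numbers as the first example in the Examples section.
loss = binary_log_loss([0, 1, 1, 0], [0.1, 0.35, 0.7, 0.99])
```

The last observation dominates the result: a confident prediction of 0.99 for a target of 0 contributes -log(0.01) to the sum, far more than the other three terms.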

For multiclass situations, the definition is a slight generalization of the above:

\[\textrm{logloss} = - \frac{1}{N} \sum_{i \in 1,\ldots,N} \sum_{j \in 1, \ldots, L} (y_{ij} \log(p_{ij})) ,\]

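where \(L\) is the number of classes, \(y_{ij}\) is 1 if observation \(i\) has class label \(j\) and 0 otherwise, and \(p_{ij}\) is the predicted probability that observation \(i\) belongs to class \(j\).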
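Because \(y_{ij}\) is non-zero for exactly one class per observation, the inner sum reduces to the log of the probability assigned to the true class. A minimal pure-Python sketch of that reduction follows; multiclass_log_loss is a hypothetical helper, not a turicreate function, and it assumes targets are already integer class indices and that no selected probability is 0.

```python
import math

def multiclass_log_loss(class_indices, probability_rows):
    """Multi-class logloss for integer class indices.

    Hypothetical helper for illustration only; no clipping is applied.
    """
    total = 0.0
    for j, row in zip(class_indices, probability_rows):
        # y_ij is 1 only for the true class j, so one term survives per row.
        total += math.log(row[j])
    return -total / len(class_indices)

rows = [[0.1, 0.8, 0.1],
        [0.9, 0.1, 0.0],
        [0.8, 0.1, 0.1],
        [0.3, 0.6, 0.1]]
loss = multiclass_log_loss([1, 0, 2, 1], rows)
```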

Parameters:
targets : SArray

Ground truth class labels. This can contain either integers or strings.

predictions : SArray

The predicted probability that corresponds to each target value. For binary classification, each value is the predicted probability of the “positive” label. For multi-class classification, each element is expected to be an array containing one predicted probability per class.

index_map : dict[int], [None (default)]

For binary classification, a dictionary mapping the two target labels to either 0 (negative) or 1 (positive). For multi-class classification, a dictionary mapping potential target labels to the associated index into the vectors in predictions.

Returns:
out : float

The log_loss.

See also

accuracy

Notes

  • For binary classification, when the target label is of type “string”, then the labels are sorted alphanumerically and the largest label is chosen as the “positive” label. For example, if the classifier labels are {“cat”, “dog”}, then “dog” is chosen as the positive label for the binary classification case. This behavior can be overridden by providing an explicit index_map.
  • For multi-class classification, when the target label is of type “string”, then the probability vector is assumed to be a vector of probabilities of classes as sorted alphanumerically. Hence, for the probability vector [0.1, 0.2, 0.7] and a dataset with classes “cat”, “dog”, and “rat”, the 0.1 corresponds to “cat”, the 0.2 to “dog”, and the 0.7 to “rat”. This behavior can be overridden by providing an explicit index_map.
  • Logloss is undefined when a probability value p is exactly 0 or 1, because log(0) is not finite. Hence, probabilities are clipped to max(EPSILON, min(1 - EPSILON, p)) where EPSILON = 1e-15.
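The default label ordering and the clipping rule described in these notes can be sketched in plain Python. Both helper names below are hypothetical and only mirror the documented behavior; the sketch assumes that Python's sorted() on a single label type matches the alphanumeric ordering referred to above, which is not verified against the library.

```python
import math

EPSILON = 1e-15  # value quoted in the notes above

def clip_probability(p, eps=EPSILON):
    """Clip p into [eps, 1 - eps] so that log(p) and log(1 - p) are finite."""
    return max(eps, min(1 - eps, p))

def default_index_map(labels):
    """Map each distinct label to its position in sorted order (assumed default)."""
    return {label: i for i, label in enumerate(sorted(set(labels)))}

# "dog" sorts after "cat", so it gets index 1 (the "positive" label).
index_map = default_index_map(["cat", "dog", "dog", "cat"])

# Probabilities of exactly 0 or 1 are moved just inside the open interval.
clipped = [clip_probability(p) for p in [0.0, 0.5, 1.0]]
```

With clipping, a prediction of 0.0 for the true class contributes -log(1e-15) to the sum, a large but finite penalty rather than infinity.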

References

https://www.kaggle.com/wiki/LogLoss

Examples

import turicreate as tc
targets = tc.SArray([0, 1, 1, 0])
predictions = tc.SArray([0.1, 0.35, 0.7, 0.99])
log_loss = tc.evaluation.log_loss(targets, predictions)

For binary classification, when the target label is of type “string”, then the labels are sorted alphanumerically and the largest label is chosen as the “positive” label.

import turicreate as tc
targets = tc.SArray(["cat", "dog", "dog", "cat"])
predictions = tc.SArray([0.1, 0.35, 0.7, 0.99])
log_loss = tc.evaluation.log_loss(targets, predictions)

In the multi-class setting, log-loss requires, for each observation, a vector of probabilities (that sum to 1) with one entry per class label in the input dataset. In this example, there are three classes [0, 1, 2], and each vector of probabilities gives the predicted probability of each of the three classes.

import turicreate as tc
targets = tc.SArray([1, 0, 2, 1])
predictions = tc.SArray([[.1, .8, 0.1],
                        [.9, .1, 0.0],
                        [.8, .1, 0.1],
                        [.3, .6, 0.1]])
log_loss = tc.evaluation.log_loss(targets, predictions)

For multi-class classification, when the target label is of type “string”, then the probability vector is assumed to be a vector of probabilities of classes as sorted alphanumerically. Hence, for the probability vector [0.1, 0.2, 0.7] and a dataset with classes “cat”, “dog”, and “rat”, the 0.1 corresponds to “cat”, the 0.2 to “dog”, and the 0.7 to “rat”.

import turicreate as tc
targets = tc.SArray(["dog", "cat", "foosa", "dog"])
predictions = tc.SArray([[.1, .8, 0.1],
                        [.9, .1, 0.0],
                        [.8, .1, 0.1],
                        [.3, .6, 0.1]])
log_loss = tc.evaluation.log_loss(targets, predictions)

If the probability vectors contain predictions for labels not present among the targets, an explicit index map must be provided.

import turicreate as tc
targets = tc.SArray(["dog", "cat", "cat", "dog"])
predictions = tc.SArray([[.1, .8, 0.1],
                        [.9, .1, 0.0],
                        [.8, .1, 0.1],
                        [.3, .6, 0.1]])
index_map = {"cat": 0, "dog": 1, "foosa": 2}
log_loss = tc.evaluation.log_loss(targets, predictions, index_map=index_map)
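To see what the explicit index map does in this last example, the lookup can be reproduced by hand in plain Python. This is only a sketch of the multi-class definition, not the library's implementation: manual_loss is a hypothetical name, and the value is what the definition gives, assuming turicreate follows it up to floating-point error. Clipping is omitted because none of the selected probabilities is 0 or 1.

```python
import math

index_map = {"cat": 0, "dog": 1, "foosa": 2}
targets = ["dog", "cat", "cat", "dog"]
predictions = [[0.1, 0.8, 0.1],
               [0.9, 0.1, 0.0],
               [0.8, 0.1, 0.1],
               [0.3, 0.6, 0.1]]

# Use index_map to pick the probability assigned to each true label.
picked = [row[index_map[label]] for label, row in zip(targets, predictions)]

# Average negative log probability of the true labels.
manual_loss = -sum(math.log(p) for p in picked) / len(picked)
```

The third column, mapped to “foosa”, is never selected because no target carries that label; the map is still needed so that “cat” and “dog” point at the right columns of a three-element vector.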