Categorical Cross-Entropy Loss
The Categorical Cross-Entropy Loss Function, also known as Softmax Loss, is used to train multiclass classification models. It applies the Softmax Activation Function to the model's outputs (logits) and then applies the Negative Log-Likelihood function.
A lower loss means the predicted distribution is closer to the ground truth.
In math, it is expressed as

$$\mathrm{CE}(Y, \hat{Y}) = -\sum_{i=1}^{C} Y_i \log(\hat{Y}_i)$$

where $C$ is the number of classes, $Y$ is the one-hot encoded ground truth label vector, and $\hat{Y}$ is the vector of model outputs after Softmax. Since $Y$ is one-hot encoded, the terms that don't correspond to the ground truth class are multiplied by 0, so we effectively take the log of only the predicted probability for the true label.
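For example, in a hypothetical 3-class case with ground truth $Y = (1, 0, 0)$ and predicted probabilities $\hat{Y} = (0.7, 0.2, 0.1)$, the sum collapses to a single term:

$$\mathrm{CE} = -\big(1 \cdot \log 0.7 + 0 \cdot \log 0.2 + 0 \cdot \log 0.1\big) = -\log 0.7 \approx 0.357$$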
In the PyTorch implementation, the index of the ground truth label is passed instead of a one-hot encoded $Y$ vector.
import torch
from torch import nn, tensor
torch.set_printoptions(sci_mode=False)

dog_class_index = 0
label = tensor([dog_class_index])       # ground truth as a class index, not one-hot
logits = tensor([[3.5, -3.45, 0.23]])   # raw model outputs for a single sample
nn.CrossEntropyLoss()(logits, label)
tensor(0.0382)
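Since PyTorch 1.10, nn.CrossEntropyLoss also accepts class probabilities as the target, so passing a one-hot encoded float target should produce the same value (a sketch, reusing the logits from above):

one_hot_label = tensor([[1.0, 0.0, 0.0]])      # one-hot probabilities instead of a class index
nn.CrossEntropyLoss()(logits, one_hot_label)   # expected: tensor(0.0382)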
We can also achieve the same result by manually applying Softmax followed by the negative log-likelihood.
softmax_probs = nn.Softmax(dim=1)(logits)
softmax_probs
tensor([[ 0.9625, 0.0009, 0.0366]])
-torch.log(softmax_probs[:, label])   # negative log of the true class's probability
tensor([[0.0382]])
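PyTorch's documentation describes nn.CrossEntropyLoss as equivalent to nn.LogSoftmax followed by nn.NLLLoss, which avoids taking the log of an already-computed Softmax and is numerically more stable; a minimal sketch with the same logits and label:

log_probs = nn.LogSoftmax(dim=1)(logits)   # log-probabilities in one step
nn.NLLLoss()(log_probs, label)             # expected: tensor(0.0382)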
The loss is based on Cross-Entropy from Information Theory, which is a measure of the difference between two probability distributions.
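For two distributions $p$ (the true distribution) and $q$ (the predicted distribution) over the same classes, cross-entropy is defined as

$$H(p, q) = -\sum_{x} p(x) \log q(x)$$

which matches the loss above with $p$ set to the one-hot label distribution and $q$ to the Softmax probabilities.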
Howard et al. (2020), pp. 222-231
References
Jeremy Howard, Sylvain Gugger, and Soumith Chintala. Deep Learning for Coders with Fastai and PyTorch: AI Applications without a PhD. O'Reilly Media, Inc., Sebastopol, California, 2020. ISBN 978-1-4920-4552-6.