Negative Log-Likelihood

Negative log-likelihood, or NLL, is a loss function used in multi-class classification. It measures how closely our model predictions align with the ground truth labels.

It is calculated as log(ŷ), where ŷ is the probability the model assigns to the true class label after the model outputs are converted into probabilities by applying the Softmax Function to them. The loss for a mini-batch is computed by calculating the NLL for each item and then taking the mean or sum over all items in the batch.
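
A minimal sketch of that calculation in PyTorch (the logits and labels below are made-up values):

```python
import torch

# Made-up raw model outputs (logits) for a mini-batch of 3 items
# over 4 classes, and the true class label for each item.
logits = torch.tensor([[ 2.0, 0.5, -1.0,  0.1],
                       [ 0.3, 1.8,  0.2, -0.5],
                       [-0.2, 0.4,  0.1,  2.2]])
targets = torch.tensor([0, 1, 3])

# Convert the outputs into probabilities with softmax.
probs = torch.softmax(logits, dim=1)

# Pick out the probability assigned to each item's true class.
true_class_probs = probs[torch.arange(len(targets)), targets]

# Negative log of those probabilities, averaged over the batch.
loss = -torch.log(true_class_probs).mean()
print(loss)
```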

Since the log of a number greater than 0 and less than 1 is negative, we add a negative sign to turn it into a positive number, hence negative log-likelihood. As the predicted probability approaches 0 the loss tends to infinity (−log(0) = ∞), and at 1 it returns 0 (−log(1) = 0), so confidently wrong answers are heavily penalised.
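
A quick illustration of how the penalty grows as the predicted probability of the true class falls (printed values are approximate):

```python
import math

# -log(p) for a few predicted probabilities of the true class.
for p in (0.99, 0.9, 0.5, 0.1, 0.01):
    print(f"{p:>5}: {-math.log(p):.2f}")  # ~0.01, 0.11, 0.69, 2.30, 4.61
```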

Because the Softmax Function tends to push a single output far above the others, the loss function only needs to be concerned with the probability assigned to the correct label.

In PyTorch, the function is torch.nn.functional.nll_loss. It doesn't take the log itself, as it expects log-probabilities, i.e. the outputs of a LogSoftmax activation layer.

In binary classification problems, this is referred to as Log Loss.

Code example:
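
A minimal sketch of the usage described above, assuming made-up logits and labels:

```python
import torch
import torch.nn.functional as F

# Made-up logits for a mini-batch of 3 items over 4 classes, plus true labels.
logits = torch.tensor([[ 2.0, 0.5, -1.0,  0.1],
                       [ 0.3, 1.8,  0.2, -0.5],
                       [-0.2, 0.4,  0.1,  2.2]])
targets = torch.tensor([0, 1, 3])

# nll_loss does not apply the log itself; it expects log-probabilities,
# so run the raw outputs through log_softmax first.
log_probs = F.log_softmax(logits, dim=1)
loss = F.nll_loss(log_probs, targets)  # mean over the batch by default
print(loss)
```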

Negative Log-Likelihood is the second part of the Categorical Cross-Entropy Loss; the first part is the LogSoftmax.
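
As a sketch of that relationship (made-up inputs): applying LogSoftmax followed by nll_loss gives the same value as PyTorch's cross_entropy, which fuses the two steps.

```python
import torch
import torch.nn.functional as F

logits = torch.randn(8, 5)           # made-up outputs: batch of 8, 5 classes
targets = torch.randint(0, 5, (8,))  # made-up true class labels

nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)
ce = F.cross_entropy(logits, targets)
assert torch.isclose(nll, ce)  # the two calculations match
```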


Recommended Reading

Deep Learning for Coders with Fastai and PyTorch: AI Applications Without a PhD

This book is my favourite practical overview of Deep Learning. Learn more about negative log-likelihood in Chapter 6, pg. 231-232.

