Cross-entropy measures the average number of bits required to identify an event when you use a coding scheme optimised for one probability distribution $q$, while the true probability distribution is actually $p$.

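It's the same calculation as Information Entropy, except that it measures what happens when you identify messages using a code built for a different probability distribution from the one that actually generates them. When $q = p$, cross-entropy equals the entropy of $p$; for any other $q$ it is larger.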

Expressed as:

$$H(p, q) = -\sum\limits_{i=1}^{n} p_{i} \times \log_2(q_{i})$$
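As a rough illustration of the formula, here is a minimal Python sketch. The function name `cross_entropy` and the example distributions `p` and `q` are made up for this note; they are assumptions, not part of any library. It treats terms with $p_i = 0$ as contributing nothing, and returns infinity if $q$ assigns zero probability to an event that $p$ says can happen.

```python
import math


def cross_entropy(p, q):
    """Average bits per event when events follow p but the code is optimised for q.

    p and q are sequences of probabilities over the same outcomes
    (hypothetical helper for illustration only).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")

    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0:
            # Events that never occur add nothing to the average.
            continue
        if q_i == 0:
            # The code has no codeword for an event that does occur.
            return math.inf
        total += p_i * math.log2(q_i)
    return -total


# Illustrative distributions (assumed values, chosen so the logs are exact).
p = [0.5, 0.25, 0.25]   # true distribution
q = [0.25, 0.25, 0.5]   # distribution the code was optimised for

print(cross_entropy(p, p))  # 1.5  (the entropy of p)
print(cross_entropy(p, q))  # 1.75 (coding with the wrong distribution costs more)
```

In this example, using a code optimised for `q` costs 0.25 bits per event more than using one optimised for the true distribution `p`.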

