In mathematical statistics, the Kullback–Leibler (KL) divergence (also called relative entropy and I-divergence [1]), denoted $D_{\text{KL}}(P \parallel Q)$, is a type of statistical distance: a measure of how much a model probability distribution Q is different from a true probability distribution P.
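As a concrete illustration of the definition, here is a minimal sketch (assuming NumPy; the example distributions and the helper name kl_divergence are illustrative, not from the source) that computes $D_{\text{KL}}(P \parallel Q) = \sum_i p_i \log(p_i / q_i)$ for two discrete distributions on the same finite support.

import numpy as np

def kl_divergence(p, q):
    """Return sum_i p_i * log(p_i / q_i) in nats; assumes q_i > 0 wherever p_i > 0."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0                      # terms with p_i = 0 contribute 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p = [0.5, 0.3, 0.2]   # "true" distribution P (illustrative values)
q = [0.4, 0.4, 0.2]   # model distribution Q
print(kl_divergence(p, q))   # > 0, and 0 only when P == Q
print(kl_divergence(p, p))   # 0.0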
The Fisher information metric can also be understood as the infinitesimal form of the relative entropy (i.e., the Kullback–Leibler divergence); specifically, it is the Hessian of the divergence. Alternatively, it can be understood as the metric induced by the flat-space Euclidean metric, after appropriate changes of variable.
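The Hessian statement can be checked numerically in a simple one-parameter family. The sketch below is an assumption-laden illustration (not from the source): for a Bernoulli($\theta$) family, the second derivative of $\theta \mapsto D_{\text{KL}}(\mathrm{Ber}(\theta_0) \parallel \mathrm{Ber}(\theta))$ at $\theta = \theta_0$ recovers the Fisher information $1/(\theta_0(1-\theta_0))$.

import numpy as np

def kl_bernoulli(t0, t):
    # D_KL(Ber(t0) || Ber(t)) for two Bernoulli distributions
    return t0 * np.log(t0 / t) + (1 - t0) * np.log((1 - t0) / (1 - t))

theta0, h = 0.3, 1e-4
# central second difference of the KL divergence at its minimum theta = theta0
hessian = (kl_bernoulli(theta0, theta0 + h)
           - 2 * kl_bernoulli(theta0, theta0)
           + kl_bernoulli(theta0, theta0 - h)) / h**2
print(hessian)                       # ~ 4.76
print(1 / (theta0 * (1 - theta0)))   # Fisher information: 4.7619...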
Mutual information can be written as $I(X;Y) = D_{\text{KL}}\!\left(P_{(X,Y)} \parallel P_X \otimes P_Y\right)$, where $D_{\text{KL}}$ is the Kullback–Leibler divergence, and $P_X \otimes P_Y$ is the outer product distribution which assigns probability $P_X(x)\,P_Y(y)$ to each $(x, y)$. Notice, as per the properties of the Kullback–Leibler divergence, that $I(X;Y)$ is equal to zero precisely when the joint distribution coincides with the product of the marginals, i.e. when $X$ and $Y$ are independent (and hence observing $Y$ tells you nothing about $X$).
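A minimal sketch of this identity, assuming NumPy and an illustrative 2x2 joint table: the mutual information is computed as the KL divergence from the joint distribution to the outer product of its marginals.

import numpy as np

joint = np.array([[0.30, 0.10],      # P(X=x, Y=y); rows index x, columns index y
                  [0.20, 0.40]])
px = joint.sum(axis=1)               # marginal P_X
py = joint.sum(axis=0)               # marginal P_Y
product = np.outer(px, py)           # outer-product distribution P_X (x) P_Y

mask = joint > 0
mi = np.sum(joint[mask] * np.log(joint[mask] / product[mask]))
print(mi)   # > 0 here; it is 0 exactly when joint == np.outer(px, py)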
The total variation distance is related to the Kullback–Leibler divergence by Pinsker's inequality: $\delta(P,Q) \le \sqrt{\tfrac{1}{2} D_{\text{KL}}(P \parallel Q)}$. One also has the following inequality, due to Bretagnolle and Huber [2] (see also [3]), which has the advantage of providing a non-vacuous bound even when $D_{\text{KL}}(P \parallel Q) > 2$: $\delta(P,Q) \le \sqrt{1 - e^{-D_{\text{KL}}(P \parallel Q)}}$.
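The two bounds can be compared on an example where the KL divergence exceeds 2, so Pinsker's bound is vacuous while the Bretagnolle–Huber bound is not (a hedged sketch assuming NumPy; the distributions are illustrative).

import numpy as np

p = np.array([0.98, 0.01, 0.01])
q = np.array([0.02, 0.49, 0.49])

tv = 0.5 * np.sum(np.abs(p - q))     # total variation distance, always <= 1
kl = np.sum(p * np.log(p / q))       # D_KL(P || Q), here > 2

print(tv)                            # ~ 0.96
print(np.sqrt(kl / 2))               # Pinsker bound; vacuous here (> 1)
print(np.sqrt(1 - np.exp(-kl)))      # Bretagnolle-Huber bound; still < 1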
In probability theory and statistics, the multivariate normal distribution, multivariate Gaussian distribution, or joint normal distribution is a generalization of the one-dimensional (univariate) normal distribution to higher dimensions; the Kullback–Leibler divergence between two multivariate normal distributions has a closed form (see § Kullback–Leibler divergence).
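For reference, a sketch of that closed form (assuming NumPy; the helper name kl_mvn and the example parameters are illustrative): $D_{\text{KL}}\!\left(\mathcal{N}(\mu_0,\Sigma_0) \parallel \mathcal{N}(\mu_1,\Sigma_1)\right) = \tfrac{1}{2}\!\left(\operatorname{tr}(\Sigma_1^{-1}\Sigma_0) + (\mu_1-\mu_0)^\top \Sigma_1^{-1}(\mu_1-\mu_0) - k + \ln\tfrac{\det\Sigma_1}{\det\Sigma_0}\right)$.

import numpy as np

def kl_mvn(mu0, S0, mu1, S1):
    # closed-form KL divergence between N(mu0, S0) and N(mu1, S1) in k dimensions
    k = mu0.shape[0]
    S1_inv = np.linalg.inv(S1)
    diff = mu1 - mu0
    return 0.5 * (np.trace(S1_inv @ S0)
                  + diff @ S1_inv @ diff
                  - k
                  + np.log(np.linalg.det(S1) / np.linalg.det(S0)))

mu0, S0 = np.zeros(2), np.eye(2)
mu1, S1 = np.array([1.0, 0.0]), np.array([[2.0, 0.3], [0.3, 1.0]])
print(kl_mvn(mu0, S0, mu1, S1))   # positive for these two distinct Gaussians
print(kl_mvn(mu0, S0, mu0, S0))   # 0.0 when the two Gaussians coincide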
Thus $I(X;Y \mid Z)$ is the expected (with respect to $Z$) Kullback–Leibler divergence from the conditional joint distribution $P_{(X,Y)\mid Z}$ to the product of the conditional marginals $P_{X\mid Z}$ and $P_{Y\mid Z}$. Compare with the definition of mutual information.
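A minimal sketch of this expectation (assuming NumPy; the 2x2x2 joint table is illustrative): average, over the values of $Z$, the KL divergence from the conditional joint to the product of the conditional marginals.

import numpy as np

# joint[z, x, y] = P(Z=z, X=x, Y=y) over small finite alphabets (illustrative values)
joint = np.array([[[0.10, 0.05],
                   [0.05, 0.20]],
                  [[0.15, 0.10],
                   [0.10, 0.25]]])

cmi = 0.0
for z in range(joint.shape[0]):
    pz = joint[z].sum()              # P(Z=z)
    pxy_z = joint[z] / pz            # conditional joint P_(X,Y)|Z=z
    px_z = pxy_z.sum(axis=1)         # conditional marginal P_X|Z=z
    py_z = pxy_z.sum(axis=0)         # conditional marginal P_Y|Z=z
    prod = np.outer(px_z, py_z)
    mask = pxy_z > 0
    cmi += pz * np.sum(pxy_z[mask] * np.log(pxy_z[mask] / prod[mask]))

print(cmi)   # >= 0; equals 0 iff X and Y are conditionally independent given Z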
As well as the absolute Rényi entropies, Rényi also defined a spectrum of divergence measures generalising the Kullback–Leibler divergence. [13] The Rényi divergence of order $\alpha$, or alpha-divergence, of a distribution P from a distribution Q is defined to be $D_\alpha(P \parallel Q) = \frac{1}{\alpha - 1} \log \left( \sum_{i=1}^{n} \frac{p_i^\alpha}{q_i^{\alpha - 1}} \right)$ for $\alpha \neq 1$.
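A minimal sketch of the definition (assuming NumPy; the distributions are illustrative), including a numerical check that the order-$\alpha$ divergence approaches the Kullback–Leibler divergence as $\alpha \to 1$.

import numpy as np

def renyi_divergence(p, q, alpha):
    # D_alpha(P || Q) = log( sum_i p_i^alpha * q_i^(1 - alpha) ) / (alpha - 1), alpha != 1
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.log(np.sum(p**alpha * q**(1 - alpha))) / (alpha - 1)

p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]
print(renyi_divergence(p, q, 0.5))
print(renyi_divergence(p, q, 2.0))
print(renyi_divergence(p, q, 1.000001))                    # ~ D_KL(P || Q)
print(np.sum(np.array(p) * np.log(np.array(p) / np.array(q))))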
By Stirling's formula, at the limit of $n, x_1, \ldots, x_k \to \infty$, we have $\ln\binom{n}{x_1, \cdots, x_k} + \sum_{i=1}^{k} x_i \ln p_i = -n\,D_{\text{KL}}(\hat{p} \parallel p) - \frac{k-1}{2}\ln(2\pi n) - \frac{1}{2}\sum_{i=1}^{k} \ln(\hat{p}_i) + o(1)$, where relative frequencies $\hat{p}_i = x_i / n$ in the data can be interpreted as probabilities from the empirical distribution $\hat{p}$, and $D_{\text{KL}}$ is the Kullback–Leibler divergence. This formula can be interpreted as follows.
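The approximation can be checked numerically for large counts. The sketch below assumes NumPy and SciPy (scipy.special.gammaln for the log multinomial coefficient); the counts and probabilities are illustrative.

import numpy as np
from scipy.special import gammaln

p = np.array([0.5, 0.3, 0.2])          # category probabilities
x = np.array([5200, 2900, 1900])       # observed counts; n and all x_i are large
n, k = x.sum(), len(x)
p_hat = x / n                          # empirical distribution

# left-hand side: log multinomial coefficient plus sum_i x_i * ln(p_i)
lhs = gammaln(n + 1) - gammaln(x + 1).sum() + np.sum(x * np.log(p))
# right-hand side: the Stirling approximation in terms of the KL divergence
kl = np.sum(p_hat * np.log(p_hat / p))
rhs = -n * kl - (k - 1) / 2 * np.log(2 * np.pi * n) - 0.5 * np.sum(np.log(p_hat))

print(lhs, rhs)   # the two values agree up to the o(1) error term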