Search results
Results from the WOW.Com Content Network
The two most important classes of divergences are the f-divergences and Bregman divergences; however, other types of divergence functions are also encountered in the literature. The only divergence for probabilities over a finite alphabet that is both an f-divergence and a Bregman divergence is the Kullback–Leibler divergence. [8]
In probability theory, an -divergence is a certain type of function (‖) that measures the difference between two probability distributions and . Many common divergences, such as KL-divergence , Hellinger distance , and total variation distance , are special cases of f {\displaystyle f} -divergence.
In probability and statistics, the Hellinger distance (closely related to, although different from, the Bhattacharyya distance) is used to quantify the similarity between two probability distributions. It is a type of f-divergence. The Hellinger distance is defined in terms of the Hellinger integral, which was introduced by Ernst Hellinger in 1909.
The only divergence on that is both a Bregman divergence and an f-divergence is the Kullback–Leibler divergence. [ 6 ] If n ≥ 3 {\displaystyle n\geq 3} , then any Bregman divergence on Γ n {\displaystyle \Gamma _{n}} that satisfies the data processing inequality must be the Kullback–Leibler divergence.
This function is symmetric and nonnegative, and had already been defined and used by Harold Jeffreys in 1948; [7] it is accordingly called the Jeffreys divergence. This quantity has sometimes been used for feature selection in classification problems, where P and Q are the conditional pdfs of a feature under two different classes.
By Chentsov’s theorem, the Fisher information metric on statistical models is the only Riemannian metric (up to rescaling) that is invariant under sufficient statistics. [3] [4] It can also be understood to be the infinitesimal form of the relative entropy (i.e., the Kullback–Leibler divergence); specifically, it is the Hessian of
Quantum Jensen–Shannon divergence for = (,) and two density matrices is a symmetric function, everywhere defined, bounded and equal to zero only if two density matrices are the same. It is a square of a metric for pure states , [ 13 ] and it was recently shown that this metric property holds for mixed states as well.
In the theory of probability, the Glivenko–Cantelli theorem (sometimes referred to as the Fundamental Theorem of Statistics), named after Valery Ivanovich Glivenko and Francesco Paolo Cantelli, describes the asymptotic behaviour of the empirical distribution function as the number of independent and identically distributed observations grows. [1]