[011] Exchangeability


In many applications, the order of observations does not matter. For example, training instances can often be permuted without changing the joint probability distribution, and in the bag-of-words model the words of a document can be permuted freely.


Definition Random variables $(y_1, \dots, y_N)$ are exchangeable if for every permutation $\tau$ of $\{1, \dots, N\}$, $$\prob{y_1, \dots, y_N} = \prob{y_{\tau(1)}, \dots, y_{\tau(N)}}$$
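As a concrete sanity check (a minimal sketch; the urn model here is an illustration, not taken from the notes above), drawing balls without replacement from an urn yields variables that are exchangeable but not independent: the probability of a sequence depends only on how many of each color appear, not on their order.

```python
from itertools import permutations
from fractions import Fraction

def seq_prob(seq, red=2, blue=3):
    """Exact probability of drawing `seq` (1 = red, 0 = blue) without
    replacement from an urn with `red` red and `blue` blue balls."""
    p = Fraction(1)
    r, b = red, blue
    for y in seq:
        total = r + b
        if y == 1:
            p *= Fraction(r, total)
            r -= 1
        else:
            p *= Fraction(b, total)
            b -= 1
    return p

# Every reordering of the same multiset of draws has equal probability.
probs = {perm: seq_prob(perm) for perm in set(permutations((1, 0, 0)))}
```

Although successive draws are clearly dependent (each draw changes the urn), all permutations of `(1, 0, 0)` come out to the same probability, $2/5 \cdot 3/4 \cdot 2/3 = 1/5$.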

Definition A sequence $\{y_i\}_{i=1}^\infty$ is infinitely exchangeable if every finite subsequence is exchangeable.

De Finetti's Theorem

If $y_1, y_2, \dots$ are infinitely exchangeable, then the probability distribution $\prob{y_1, y_2, \dots}$ can be described succinctly due to the following theorem.

Theorem (De Finetti) If $\{y_i\}_{i=1}^\infty$ is infinitely exchangeable with $y_i \in Y$, then there exist a parameter space $\Phi$ and a density $\prob{\phi}$ such that for any $N$ observations, $$\prob{y_1, \dots, y_N} = \int_\Phi \prob{\phi}\prod_{i=1}^N\prob{y_i\mid\phi}\,d\phi$$

In other words, we can always draw a graphical model in which the $y_i$ are conditionally i.i.d. given a latent parameter $\phi$:

[Figure: plate diagram with $\phi$ as the parent of each $y_i$]

If $Y$ is finite, then $\Phi$ can be taken finite-dimensional, and we can parameterize $\prob{y\mid\phi}$ explicitly. (For example, if the $y_i$ are Bernoulli, i.e. $Y = \{0, 1\}$, then a single parameter $\phi \in [0,1]$ suffices.)
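In the Bernoulli case the de Finetti integral can be checked directly. A minimal sketch, assuming a uniform prior $\prob{\phi} = 1$ on $[0,1]$ (my choice for illustration): the marginal $\int_0^1 \phi^k(1-\phi)^{N-k}\,d\phi = \frac{k!\,(N-k)!}{(N+1)!}$ depends only on the number of ones $k$, not their positions, so the resulting sequence is exchangeable.

```python
import math

def definetti_prob(seq, n_grid=100_000):
    """Approximate p(y_1..y_N) = integral over [0,1] of
    prod_i Bernoulli(y_i | phi) under a uniform prior on phi,
    via the midpoint rule."""
    total = 0.0
    for j in range(n_grid):
        phi = (j + 0.5) / n_grid
        lik = 1.0
        for y in seq:
            lik *= phi if y == 1 else 1.0 - phi
        total += lik
    return total / n_grid

# Closed form for comparison: k ones out of N gives k!(N-k)!/(N+1)!
def closed_form(k, N):
    return math.factorial(k) * math.factorial(N - k) / math.factorial(N + 1)
```

For instance, every ordering of two ones and one zero has marginal probability $2!\,1!/4! = 1/12$ under this prior.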

However, if $Y$ is infinite, then $\Phi$ is infinite-dimensional and cannot be described by any finite number of parameters.

This is the motivation for nonparametric methods, which do not fix the number of parameters in advance.
