[022] DP: Stick-Breaking Process Viewpoint

Stick-Breaking Process

Stick-Breaking Process is a method for sampling a distribution $G$ from $\Mr{DP}(\alpha, H)$.

Definition Given $H$ and $\alpha > 0$, sample $\pi = (\pi_1, \pi_2, \dots)$ as follows

  • $\beta_k \sim \Mr{Beta}(1,\alpha)$ (Beta distribution = Dirichlet with 2 variables)
  • $\pi_k = (1-\beta_1)\dots(1-\beta_{k-1})\beta_k = \nail{1 - \pi_1 - \dots - \pi_{k-1}}\beta_k$

We write $\pi \sim \Mr{GEM}(\alpha)$ where GEM stands for Griffiths, Engen and McCloskey. Graphically,

$\beta_1$
:
$1 - \beta_1$
$\beta_2$
:
$1 - \beta_2$
$\beta_3$
:
$1 - \beta_3$
$\vdots$
$\pi_1$
$\pi_2$
$\pi_3$
$\pi_4$
$\cdots$

Algorithm To sample $G$ from $\Mr{DP}(\alpha, H)$,

  • Sample $\pi \sim \Mr{GEM}(\alpha)$
  • Sample $\theta_j \sim H(\lambda)$ for all $j = 1, 2, \dots$
  • Let $$G = \sum_{j=1}^\infty \pi_j\delta_{\theta_j}$$

Theorem The $G$ above satisfies $G\sim \Mr{DP}(\alpha, H)$. Conversely, samples from $\Mr{DP}(\alpha, H)$ will have a representation as described above.

Predictive Distribution

Since the distribution $G \sim \Mr{DP}(\alpha, H)$ looks like

$\pi_1$
$\pi_2$
$\pi_3$
$\pi_4$
$\cdots$

where $\pi_j = \prob{\text{drawing }\theta_j\text{ from }G}$, there is a decent chance to draw the same $\theta$ repeatedly. Indeed, suppose

  • $\bar{\theta}_1, \dots, \bar{\theta}_N$ are independent samples from $G$
  • $\theta_1, \dots, \theta_K$ are the distinct observations among those $N$ samples ($K \leq N$)

If we consider the posterior $$\ex{G(A)\midd \bar{\theta}_1, \dots, \bar{\theta}_N} = \frac{1}{\alpha + N}\nail{\alpha H(A) + \sum_{i=1}^N\delta_{\bar\theta_i}(A)}$$ and shrink $A$ to a single point $\theta$, we get $$\begin{align*} \prob{\bar{\theta}_{N+1} = \theta\midd \bar{\theta}_1, \dots, \bar{\theta}_N} &= \frac{1}{\alpha + N}\nail{\alpha h(\theta) + \sum_{i=1}^N\delta(\bar\theta_i, \theta)} \\ &= \frac{1}{\alpha + N}\nail{\alpha h(\theta) + \sum_{k}N_k\delta(\theta_k, \theta)} \end{align*}$$ where $N_k$ is the number of times we saw $\theta_k$. Therefore, $$\begin{align*} \prob{\bar{\theta}_{N+1} = \text{seen }\theta_k\midd \text{previous draws}} = \frac{N_k}{\alpha + N} \\ \prob{\bar{\theta}_{N+1} = \text{unseen value}\midd \text{previous draws}} = \frac{\alpha}{\alpha + N} \\ \end{align*}$$

Exported: 2016-07-13T01:46:47.780391