I learned a little about Gibbs sampling from the textbook. However, I was confused when I learned Latent Dirichlet Allocation (LDA) and was told that LDA can be trained by a technique known as "collapsed" Gibbs sampling. Fortunately,
this post in the well-known Natural Language Processing Blog answered my question concisely:
The standard setup for Gibbs sampling over a space of variables a,b,c (I'll assume there are no exploitable independences) is:
- Draw a conditioned on b,c
- Draw b conditioned on a,c
- Draw c conditioned on a,b
This is quite a simple story that can, in some cases, be "improved." For instance, it is often possible to jointly draw a and b, yielding:
- Draw a,b conditioned on c
- Draw c conditioned on a,b
This is the "blocked Gibbs sampler." Another variant, that is commonly used in our community, is when one of the variables (say, b) can be analytically integrated out, yielding:
- Draw a conditioned on c
- Draw c conditioned on a
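To make the quoted setup concrete, here is a minimal sketch of the alternating-conditional loop on a toy model of my own choosing (two coordinates of a bivariate normal with correlation rho, not an example from the quoted post); each step draws one variable from its conditional given the other:

```python
import numpy as np

# Toy Gibbs sampler: alternately draw each variable from its conditional.
# For a standard bivariate normal with correlation rho:
#   a | c ~ N(rho * c, 1 - rho^2)   and   c | a ~ N(rho * a, 1 - rho^2)
rho = 0.8
a, c = 0.0, 0.0
rng = np.random.default_rng(0)
samples = []
for _ in range(5000):
    # Draw a conditioned on c
    a = rng.normal(rho * c, np.sqrt(1 - rho ** 2))
    # Draw c conditioned on a
    c = rng.normal(rho * a, np.sqrt(1 - rho ** 2))
    samples.append((a, c))
```

Blocking or collapsing changes which of these conditional draws appear in the loop, but the overall sweep-over-variables structure stays the same.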
For LDA, by placing a Dirichlet prior, beta, on the model parameters (i.e., the word distributions of the topics), we can integrate the model parameters out analytically. The Gibbs sampling algorithm then samples only the latent variables (i.e., the topic assignments of the words in the training corpus). To my understanding, this is how Gibbs sampling in LDA is "collapsed".
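Below is a minimal sketch of what such a collapsed Gibbs sampler for LDA can look like, assuming symmetric Dirichlet hyperparameters alpha (document-topic) and beta (topic-word); the function name and default values are mine, not from the quoted post. Because theta and phi are integrated out, the sampler only keeps count matrices and resamples each word's topic assignment from its full conditional:

```python
import numpy as np

def collapsed_gibbs_lda(docs, V, K, alpha=0.1, beta=0.01, n_iters=200, seed=0):
    """Collapsed Gibbs sampler sketch for LDA.

    docs: list of documents, each a list of word ids in [0, V).
    V: vocabulary size; K: number of topics.
    Only the topic assignments z are sampled; theta and phi are integrated out.
    """
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), K))  # topic counts per document
    n_kw = np.zeros((K, V))          # word counts per topic
    n_k = np.zeros(K)                # total word count per topic
    z = []                           # topic assignment of every word token

    # Initialize topic assignments at random and fill in the counts.
    for d, doc in enumerate(docs):
        z_d = rng.integers(K, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1
            n_kw[k, w] += 1
            n_k[k] += 1

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the current assignment from the counts.
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # Full conditional with theta and phi integrated out:
                # p(z = k | rest) ∝ (n_dk + alpha) * (n_kw + beta) / (n_k + V * beta)
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
                k = rng.choice(K, p=p / p.sum())
                # Record the new assignment and restore the counts.
                z[d][i] = k
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    return z, n_dk, n_kw
```

After sampling, estimates of the integrated-out parameters can be recovered from the counts, e.g. a topic's word distribution is roughly (n_kw + beta) normalized over the vocabulary.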