## Dec 27, 2008

### Stochastic Sampling

The basic goal of stochastic sampling is to draw random samples from a given distribution p(x). Sometimes p(x) is not known exactly; for example, we may know only f(x)=p(x)/K, where K is an unknown normalizing constant.

Given the ability to draw such samples, we can do the following things: (i) approximate p(x), (ii) integrate p(x), and (iii) optimize p(x).
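As a minimal sketch of point (ii), assuming the unit Gaussian as the target (so we can sample it directly with `random.gauss`), a plain Monte Carlo estimate of an expectation looks like this:

```python
import random

random.seed(0)

# Estimate E[x^2] under the unit Gaussian p(x) = N(0, 1) by direct sampling.
# The true value is the variance of N(0, 1), which is 1.
samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]
estimate = sum(x * x for x in samples) / len(samples)
```

With 100,000 samples the estimate is typically within a few percent of 1; the error shrinks as 1/sqrt(n).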

A difficulty for stochastic sampling is that the form of p(x) or f(x) is usually complex, while we can only sample directly from a few simply-formed distributions, such as discrete distributions, the uniform distribution, or the unit Gaussian.

So, usually we sample from an easy-to-sample proposal distribution q(x) instead of from p(x) or f(x), and then either assign each sample point a "weight", or an "acceptance probability" that decides whether we accept the point as a sample from p(x) or f(x).

Some basic sampling methods include:
• Monte Carlo: Sample from q(x). No weight or acceptance probability is computed; we simply pretend that the samples from q(x) come from p(x), and use them to compute the integral of p(x).
• Weighted Sampling: Sample from q(x). Assign each sample x a weight w=p(x)/q(x). Use the weighted samples to compute the integral of p(x).
• Rejection Sampling: Sample from q(x), where c·q(x) ≥ p(x) over the whole domain of p(x) for some constant c (two normalized densities cannot satisfy q(x) > p(x) everywhere, so the constant is needed). Accept each sample x with acceptance probability a=p(x)/(c·q(x)).
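The last two methods can be sketched on a toy problem. Assume an unnormalized target f(x) = x on [0, 1] (so the normalized density is p(x) = 2x, with mean 2/3) and a uniform proposal q(x) = 1; the names `rejection_sample` and `importance_mean` are illustrative, not standard API:

```python
import random

random.seed(0)

def f(x):
    # Unnormalized target on [0, 1]; the normalized density is p(x) = 2x.
    return x

def rejection_sample(n, c=1.0):
    # Proposal q(x) = Uniform(0, 1); the envelope c*q(x) >= f(x) holds with c = 1.
    out = []
    while len(out) < n:
        x = random.random()
        if random.random() < f(x) / c:   # accept with probability f(x) / (c q(x))
            out.append(x)
    return out

def importance_mean(n):
    # Self-normalized weighted sampling: w = f(x) / q(x), here just f(x).
    xs = [random.random() for _ in range(n)]
    ws = [f(x) for x in xs]
    return sum(w * x for w, x in zip(ws, xs)) / sum(ws)

rej_mean = sum(rejection_sample(20_000)) / 20_000   # both should approach 2/3
imp_mean = importance_mean(20_000)
```

Both estimates converge to E[x] = 2/3 under p(x) = 2x; rejection sampling discards roughly half the proposals here (the average acceptance probability is 1/2), while weighted sampling keeps every draw.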

In the above methods, we have to choose a q(x) that is both easy to sample and a good approximation of p(x). These two requirements are often contradictory. To avoid the trouble of choosing q(x), we may resort to Markov Chain Monte Carlo (MCMC) methods, which approach p(x) by constructing a Markov chain rather than a global proposal q(x).
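A minimal MCMC sketch, using the random-walk Metropolis algorithm on an unnormalized unit Gaussian target; note that the acceptance ratio f(y)/f(x) never needs the normalizing constant K, which is exactly why knowing only f(x) suffices:

```python
import math
import random

random.seed(0)

def f(x):
    # Unnormalized target: f(x) = exp(-x^2 / 2), a unit Gaussian up to a constant.
    return math.exp(-0.5 * x * x)

def metropolis(n_steps, step=1.0):
    # Random-walk Metropolis: propose a local move, accept with prob min(1, f(y)/f(x)).
    x, chain = 0.0, []
    for _ in range(n_steps):
        y = x + random.gauss(0.0, step)
        if random.random() < f(y) / f(x):
            x = y
        chain.append(x)
    return chain

chain = metropolis(50_000)[5_000:]   # discard burn-in
mean = sum(chain) / len(chain)                        # should be near 0
second_moment = sum(x * x for x in chain) / len(chain)  # should be near 1
```

Because consecutive states are correlated, the chain needs more steps than independent sampling would for the same accuracy, but no envelope constant or global proposal has to be designed.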

A few more words on simulated annealing: to use it as a sampling method, we decrease the temperature T down to 1; to make it an optimization method, we decrease T down to 0.
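The optimization case (T → 0) can be sketched as follows, assuming a toy objective with its minimum at x = 2 and a geometric cooling schedule; both choices are illustrative:

```python
import math
import random

random.seed(0)

def energy(x):
    # Toy objective to minimize; the minimum is at x = 2.
    return (x - 2.0) ** 2

x, best = 0.0, 0.0
T = 2.0
while T > 1e-3:                 # drive T toward 0 for optimization
    for _ in range(200):
        y = x + random.gauss(0.0, 0.5)
        # Metropolis acceptance at temperature T: always accept downhill moves,
        # accept uphill moves with probability exp(-(E(y) - E(x)) / T).
        if random.random() < math.exp(min(0.0, (energy(x) - energy(y)) / T)):
            x = y
            if energy(x) < energy(best):
                best = x
    T *= 0.9                    # geometric cooling schedule
```

At high T the walk explores freely (it is nearly a sampler); as T → 0 uphill moves become ever less likely and the chain settles into a minimum, here near x = 2.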

References:
1. B. Walsh. Markov Chain Monte Carlo and Gibbs Sampling. Lecture notes. 2004.
2. David J.C. MacKay. Information Theory, Inference, and Learning Algorithms. Cambridge University Press, 2003. Chapters 29 and 30.