An alternative to MCMC?

vlk — Sun, 19 Aug 2007 04:31:09 +0000

I think of Markov-Chain Monte Carlo (MCMC) as a kind of directed staggering about, a random walk with a goal. (Sort of like driving in Boston.) It is conceptually simple to grasp as a way to explore the posterior probability distribution of the parameters of interest by sampling only where it is worth sampling from. That makes it a major saving over brute-force Monte Carlo, and far more robust than downhill fitting programs. It also gives you the error bars on the parameters for free. What could be better?
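To make the "directed staggering" concrete, here is a minimal random-walk Metropolis sketch. The target (a standard normal log-posterior), the starting point, the step size, and the burn-in length are all assumptions chosen for illustration, not anything taken from the paper discussed below.

```python
import math
import random

def log_posterior(theta):
    # Hypothetical target: a standard normal, known only up to a constant.
    return -0.5 * theta * theta

def metropolis(log_post, theta0, n_steps, step=2.0, seed=0):
    """Random-walk Metropolis with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    theta = theta0
    lp = log_post(theta)
    chain = []
    for _ in range(n_steps):
        proposal = theta + rng.gauss(0.0, step)
        lp_new = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, lp_new - lp)):
            theta, lp = proposal, lp_new
        chain.append(theta)  # a rejected move repeats the current point
    return chain

chain = metropolis(log_posterior, theta0=5.0, n_steps=20000)
kept = chain[2000:]  # discard an assumed burn-in of 2000 steps
mean = sum(kept) / len(kept)
sd = math.sqrt(sum((t - mean) ** 2 for t in kept) / (len(kept) - 1))
print(f"posterior mean ~ {mean:.2f}, error bar ~ {sd:.2f}")
```

The "free" error bar is just the scatter of the retained chain. The discarded first stretch is the burn-in that the next paragraph is about.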

Feroz & Hobson (2007, arXiv:0704.3704) describe a technique called Nested Sampling (Skilling 2004), one that could give MCMC a run for its money. It takes the one inefficient part of MCMC, the burn-in phase, and turns that into a virtue. The way it seems to work is to keep track of how the parameter space is traversed as the model parameters {theta} climb toward the mode of the posterior, take the sequence of likelihoods L(theta) obtained along the way, and turn it around to get theta(L). Neat.
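A toy version of the idea, as I understand it, looks like the sketch below. It is not the Feroz & Hobson implementation: the one-dimensional Gaussian likelihood, the uniform prior on [-5, 5], the number of live points, and the naive rejection step used to draw a replacement point above the current likelihood threshold are all assumptions made to keep the example short. The prior volume enclosed after i iterations is approximated by exp(-i/N), the usual deterministic estimate.

```python
import math
import random

SIGMA = 0.5          # hypothetical likelihood width
LO, HI = -5.0, 5.0   # hypothetical uniform prior range

def likelihood(theta):
    norm = SIGMA * math.sqrt(2.0 * math.pi)
    return math.exp(-0.5 * (theta / SIGMA) ** 2) / norm

def nested_sampling(n_live=200, n_iter=1400, seed=1):
    rng = random.Random(seed)
    live = [rng.uniform(LO, HI) for _ in range(n_live)]
    live_like = [likelihood(t) for t in live]
    evidence = 0.0
    x_prev = 1.0   # prior volume still enclosed by the live points
    dead = []      # (theta, posterior weight before normalization)
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_like.__getitem__)
        l_star = live_like[worst]
        x_i = math.exp(-i / n_live)      # expected shrinkage of the volume
        weight = l_star * (x_prev - x_i)
        evidence += weight
        dead.append((live[worst], weight))
        x_prev = x_i
        # Replace the worst point by a prior draw with L > l_star.
        # Plain rejection sampling: simple, but it gets slow as x_i shrinks.
        while True:
            theta = rng.uniform(LO, HI)
            like = likelihood(theta)
            if like > l_star:
                break
        live[worst], live_like[worst] = theta, like
    # Spread the remaining volume evenly over the surviving live points.
    for theta, like in zip(live, live_like):
        weight = like * x_prev / n_live
        evidence += weight
        dead.append((theta, weight))
    return evidence, dead

evidence, samples = nested_sampling()
post_mean = sum(t * w for t, w in samples) / evidence
print(f"log Z ~ {math.log(evidence):.2f} (analytic value is close to log 0.1)")
print(f"posterior mean ~ {post_mean:.2f}")
```

The discarded points are the climb toward the mode, and each one carries a weight, so the same run yields both the evidence and a weighted posterior sample. The rejection loop is where the cost hides: in a realistic problem, drawing from the prior inside a likelihood contour is the hard part.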

Two big (computational) problems that I see are (1) the calculation of theta(L), and (2) the sampling to discard the tail of L(theta). The former, it seems to me, becomes intractable exactly when the likelihood surface gets complicated. As for the latter, it seems you still have to run through just as many iterations as in MCMC to get a decent sample size. Of course, if you have a good theta(L), it does seem to be an improvement over MCMC in that you won’t need to run the chains multiple times to make sure you catch all the modes.

I think the main advantage of MCMC is that it produces and keeps track of marginalized posteriors for each parameter, whereas in this case, you have to essentially keep a full list of samples from the joint posterior and then marginalize over it yourself. The larger the sample size, the harder this gets, and in fact it is a bit difficult to tell whether the nested sampling method is competing with MCMC or Monte Carlo integration.
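For what it is worth, the "marginalize over it yourself" step is mechanically simple once you hold weighted samples from the joint posterior: ignore the other coordinates and accumulate the weights. The sketch below assumes a hypothetical list of two-parameter samples with made-up weights; the bookkeeping cost grows with the number of samples kept.

```python
import math
from collections import defaultdict

def marginal_histogram(samples, weights, index, bin_width):
    """Weighted histogram of one coordinate of joint-posterior samples."""
    total = sum(weights)
    hist = defaultdict(float)
    for point, w in zip(samples, weights):
        left_edge = math.floor(point[index] / bin_width) * bin_width
        hist[left_edge] += w / total  # summing weights marginalizes the rest
    return dict(sorted(hist.items()))

# Hypothetical joint samples (theta1, theta2) with illustrative weights.
joint = [(0.1, 2.0), (0.2, -1.0), (1.3, 0.5), (1.7, 0.0)]
w = [0.1, 0.3, 0.4, 0.2]
marg = marginal_histogram(joint, w, index=0, bin_width=1.0)
print(marg)
```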

Is there any reason why this should not be combined with MCMC? That is, can we use nested sampling during the burn-in phase to figure out the proposal distributions for Metropolis or Metropolis-Hastings in situ, and get the best of both worlds?
