#### [ArXiv] 2nd week, Mar. 2008

Warning! The list is long this week but diverse. Some are of CHASC’s obvious interest. Continue reading ‘[ArXiv] 2nd week, Mar. 2008’ »

Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

Archive for the ‘MC’ Category.

Irrelevant to astrostatistics but interesting for baseball lovers.

[stat.AP:0802.4317] Jensen, Shirley, & Wyner

**Bayesball: A Bayesian Hierarchical Model for Evaluating Fielding in Major League Baseball**

With the 5th-year WMAP data release, many WMAP-related papers appeared; most of the statistical ones are listed here. Continue reading ‘[ArXiv] 1st week, Mar. 2008’ »

This is quite a long paper that I separated from [ArXiv] 4th week, Feb. 2008:

[astro-ph:0802.3916] P. Carvalho, G. Rocha, & M.P. Hobson

**A fast Bayesian approach to discrete object detection in astronomical datasets – PowellSnakes I**

As the title suggests, it describes Bayesian source detection and provides me a chance to learn the foundation of source detection in astronomy. Continue reading ‘[ArXiv] A fast Bayesian object detection’ »

Astronomers have developed their own ways of processing signals, almost independently of engineers though sometimes in collaboration with them, even though the fundamental goal of signal processing is the same: extracting information. Doubtless these two parallel roads, the astronomers’ and the engineers’, have pointed in opposite directions: one toward the sky and the other toward the earth. Nevertheless, without much argument, we can say that statistics has served as the common medium of signal processing for both scientists and engineers. This particular issue of IEEE Signal Processing Magazine may shed light for astronomers interested in signal processing and statistics outside the astronomical community.

IEEE Signal Processing Magazine Jul. 2007 Vol 24 Issue 4: Bootstrap methods in signal processing

This link will show the table of contents and provide links to articles; however, access to the papers requires an IEEE Xplore subscription (via libraries or individual IEEE memberships). Here, I’d like to introduce some of the articles and tutorials.

Continue reading ‘Signal Processing and Bootstrap’ »

One of the big problems that has come up in recent years is how to represent the uncertainty in certain estimates. Astronomers usually present errors as *±stddev* on the quantities of interest, but that presupposes that the errors are uncorrelated. But suppose you are estimating a multi-dimensional set of parameters that may have large correlations amongst themselves? One such case is that of Differential Emission Measures (DEM), where the “quantity of emission” from a plasma (loosely, how much stuff there is available to emit; it is the product of the volume and the densities of electrons and H) is estimated for different temperatures. See the plots at the PoA DEM tutorial for examples of how we are currently trying to visualize the error bars. Another example is the correlated systematic uncertainties in effective areas (Drake et al., 2005, Chandra Cal Workshop). This is not dissimilar to the problem of determining the significance of a “feature” in an image (Connors, A. & van Dyk, D.A., 2007, SCMA IV). Continue reading ‘Dance of the Errors’ »
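To make the point about correlated errors concrete, here is a minimal sketch with two hypothetical parameters (not a DEM fit). It assumes Gaussian scatter with a chosen correlation `rho`, and compares the error on the sum of the two parameters computed from the quoted ±stddev values alone with the error actually seen in the draws. All names and numbers are illustrative assumptions.

```python
import math
import random
import statistics

def correlated_draws(n, rho, seed=0):
    """Draw n pairs (x, y) of unit-variance Gaussians with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
    return xs, ys

def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical case: two strongly anticorrelated parameters.
xs, ys = correlated_draws(20000, rho=-0.9, seed=1)
sd_x, sd_y = statistics.stdev(xs), statistics.stdev(ys)
corr_xy = sample_covariance(xs, ys) / (sd_x * sd_y)

# Error on x + y if only the two +-stddev values are reported
# (adding in quadrature silently assumes zero covariance).
naive_sd_sum = math.hypot(sd_x, sd_y)
# Error on x + y measured directly from the draws.
true_sd_sum = statistics.stdev([x + y for x, y in zip(xs, ys)])

print(f"correlation = {corr_xy:.2f}")
print(f"quadrature error on x+y = {naive_sd_sum:.2f}, actual = {true_sd_sum:.2f}")
```

In this toy case the quadrature sum overstates the error on `x + y` severalfold, which is the information that a pair of separate error bars cannot carry.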

In the past month, I’ve noticed that papers whose titles include **Bayesian** or **Markov Chain Monte Carlo (MCMC)** have appeared relatively frequently on arxiv/astro-ph. Those papers are:

- [astro-ph:0709.1058v1] **Joint Bayesian Component Separation and CMB Power Spectrum Estimation** by H.K. Eriksen et al.
- [astro-ph:0709.1104v1] **Monolithic or hierarchical star formation? A new statistical analysis** by M. Kampakoglou, R. Trotta, and J. Silk
- [astro-ph:0411573v2] **A Bayesian analysis of the primordial power spectrum** by M. Bridges, A.N. Lasenby, M.P. Hobson
- [astro-ph:0709.0596v1] **Bayesian inversion of Stokes profiles** by A. A. Ramos, M.J.M. Gonzales, and J.A. Rubino-Martin
- [astro-ph:0709.0711v1] **Bayesian posterior classification of planetary nebulae according to the Peimbert types** by C. Quireza, H.J. Rocha-Pinto, and W.J. Maciel
- [astro-ph:0708.2340v1] **Bayesian Galaxy Shape Measurement for Weak Lensing Surveys -I. Methodology and a Fast Fitting Algorithm** by L. Miller et al.
- [astro-ph:0708.1871v1] **Dark energy and cosmic curvature: Monte-Carlo Markov Chain approach** by Y. Gong et al.

Continue reading ‘[ArXiv] Recent bayesian studies from astro-ph’ »

Once again, from the middle of a recent (Aug 30-31, 2007) argument within CHASC, on why physicists and astronomers view “3 sigma” results with suspicion and expect (roughly) > 5 sigma, while statisticians and biologists typically assume 95% is OK:

David van Dyk (representing statistics culture):

Can’t you look at it again? Collect more data?

Vinay Kashyap (representing astronomy and physics culture):

…I can confidently answer this question: no, alas, we usually cannot look at it again!!

Ah. Hmm. To rephrase [the question]: if you have a “7.5 sigma” feature, with a day-long [imaging Markov Chain Monte Carlo] run you can only show that it is “>3sigma”, but is it possible, even with that day-long run, to tell that the feature is really at 7.5sigma; is that the question? Well that would be nice, but I don’t understand how observing again will help?

No one believes any realistic test is properly calibrated that far into the tail. Using 5-sigma is really just a high bar, but the precise calibration will never be done. (This is a reason not to sweat the computation TOO much.)

Most other scientific areas set the bar lower (2 or 3 sigma) BUT don’t really believe the results unless they are replicated.

My assertion is that I find replicated results more convincing than extreme p-values. And the controversial part: Astronomers should aim for replication rather than worry about 5-sigma.
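For readers who want the numbers behind the “sigma” thresholds in this exchange, the sketch below converts a significance in sigma to a tail probability. It assumes an exactly Gaussian null distribution, which is precisely the calibration that, per the quote above, nobody believes holds that far into the tail; the function names are illustrative.

```python
import math

def one_sided_p(n_sigma):
    """Upper-tail probability of a standard normal beyond n_sigma."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def two_sided_p(n_sigma):
    """Probability of a standard normal deviating by more than n_sigma in either direction."""
    return math.erfc(n_sigma / math.sqrt(2.0))

# Thresholds mentioned in the discussion, plus 1.96 sigma for the 95% convention.
for n_sigma in (1.96, 3.0, 5.0, 7.5):
    print(f"{n_sigma:4.2f} sigma: one-sided p = {one_sided_p(n_sigma):.3g}, "
          f"two-sided p = {two_sided_p(n_sigma):.3g}")
```

Under this Gaussian assumption, the 95% convention corresponds to about 1.96 sigma two-sided, 3 sigma to a two-sided p of about 0.0027, and 5 sigma to about 5.7e-7.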

I think of Markov-Chain Monte Carlo (MCMC) as a kind of directed staggering about, a random walk with a goal. (Sort of like driving in Boston.) It is conceptually simple to grasp as a way to explore the posterior probability distribution of the parameters of interest by sampling only where it is worth sampling from. Thus, a major savings from brute force Monte Carlo, and far more robust than downhill fitting programs. It also gives you the error bar on the parameter for free. What could be better? Continue reading ‘An alternative to MCMC?’ »

From arxiv/astro-ph:0705.3856

**Gamma-ray albedo of the moon** by Moskalenko and Porter

The title sounds very interesting, although as a statistician I cannot appreciate the significance of albedo spectra. The study, aimed at making use of GLAST and PAMELA, was carried out via Monte Carlo simulations (the MC toolkit was GEANT4 8.2) together with EGRET data.

This is the second in a series of quotes by Xiao Li Meng, from an introduction to Markov Chain Monte Carlo (MCMC), given to a room full of astronomers, as part of the April 25, 2006 joint meeting of Harvard’s “Stat 310” and the California-Harvard Astrostatistics Collaboration. This one has a long summary as the lead-in, but hang in there!

Summary first (from earlier in Xiao Li Meng’s presentation):

Let us tackle a harder problem, with the Metropolis Hastings Algorithm.

An example: a tougher distribution, not Normal in [at least one of the dimensions], and multi-modal… FIRST I propose a draw, from an approximate distribution. THEN I compare it to the true distribution, using the ratio of proposal to target distribution. The next draw tells whether to accept the new draw or stay with the old draw.

Our intuition:

1/ For the original Metropolis algorithm, it looks “geometric”. (In the example, we are sampling “x,z”; if the point falls under our xz curve, accept it.)

2/ The speed of the algorithm depends on how close you are with the approximation. There is a trade-off with “stickiness”.
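The propose, compare, accept-or-stay loop summarized above can be sketched as an independence Metropolis-Hastings sampler. This is only an illustration under stated assumptions: the bimodal target, the wide Gaussian proposal, and every function name are hypothetical choices, not the example from the talk.

```python
import math
import random

def target_density(x):
    """Unnormalized bimodal target (hypothetical): bumps near -2 and +3."""
    return (math.exp(-0.5 * (x + 2.0) ** 2)
            + 0.5 * math.exp(-0.5 * ((x - 3.0) / 0.7) ** 2))

def proposal_density(x, scale):
    """Unnormalized wide Gaussian centered at 0.5, the 'approximate distribution'."""
    return math.exp(-0.5 * ((x - 0.5) / scale) ** 2)

def independence_mh(n_draws, seed=0, scale=4.0):
    """Return (chain, acceptance rate) for an independence Metropolis-Hastings run."""
    rng = random.Random(seed)
    x = rng.gauss(0.5, scale)
    # Ratio of target to proposal at the current point.
    weight_x = target_density(x) / proposal_density(x, scale)
    chain, accepted = [], 0
    for _ in range(n_draws):
        # FIRST: propose a draw from the approximate distribution.
        y = rng.gauss(0.5, scale)
        # THEN: compare it to the target via the target/proposal ratio.
        weight_y = target_density(y) / proposal_density(y, scale)
        # A further uniform draw decides: accept the new point or stay put.
        if rng.random() < min(1.0, weight_y / weight_x):
            x, weight_x = y, weight_y
            accepted += 1
        chain.append(x)
    return chain, accepted / n_draws

chain, rate = independence_mh(20000, seed=1)
print(f"acceptance rate = {rate:.2f}, chain mean = {sum(chain) / len(chain):.2f}")
```

The trade-off with “stickiness” shows up through `scale`: a proposal that matches the target poorly gives a low acceptance rate, so the chain repeats the same value for long stretches.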

Practical questions:

How large should, say, N be? This is NOT AN EASY PROBLEM! The KEY difficulty: multiple modes in unknown areas. We want to know all (major) modes first, as well as estimates of the surrounding areas… [To handle this,] don’t run a single chain; run multiple chains.

Look at the between-chain variance and the within-chain variance. BUT there is no “foolproof” here… The starting points should be as broad as possible. Go somewhere crazy. Then combine, either simply, treating these as independent, or [in a more complicated way as in Meng and Gelman].
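One common way to compare between-chain and within-chain variance is the potential scale reduction factor usually attributed to Gelman and Rubin; whether this is exactly the diagnostic meant in the talk is an assumption. The sketch below uses independent Gaussian draws as stand-ins for real MCMC chains, and all names are illustrative.

```python
import math
import random
import statistics

def between_within_variance(chains):
    """Return (B, W): between-chain and within-chain variance for equal-length chains."""
    n = len(chains[0])
    chain_means = [statistics.fmean(c) for c in chains]
    between = n * statistics.variance(chain_means)
    within = statistics.fmean([statistics.variance(c) for c in chains])
    return between, within

def potential_scale_reduction(chains):
    """R-hat: values near 1 suggest the chains agree; larger values suggest they do not."""
    n = len(chains[0])
    between, within = between_within_variance(chains)
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

rng = random.Random(42)
# Stand-in chains that all explore the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(4)]
# Stand-in chains stuck in two different modes (hypothetical centers -2 and +3).
stuck = [[rng.gauss(center, 1.0) for _ in range(2000)]
         for center in (-2.0, -2.0, 3.0, 3.0)]

print(f"R-hat, well-mixed chains: {potential_scale_reduction(mixed):.3f}")
print(f"R-hat, stuck chains:      {potential_scale_reduction(stuck):.3f}")
```

As the quote warns, this is not foolproof: if every chain starts near the same mode and misses the others, the factor can sit close to 1 anyway, which is why broadly dispersed starting points matter.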

And here’s the Actual Quote of the Week:

[Astrophysicist] Aneta Siemiginowska: How do you make these proposals?

[Statistician] Xiao Li Meng: Call a professional statistician like me.

But seriously – it can be hard. But really you don’t need something perfect. You just need something decent.