The year 2009 is the Darwin bicentennial and the sesquicentennial of the publication of the Origin of Species, but, um, even more importantly, it is the International Year of Astronomy, celebrating 400 orbits since Galileo started to look through a telescope.


Borel Cantelli Lemma for the Gaussian World

Almost two years of scrutinizing publications by astronomers has given me the impression that astronomers live in the Gaussian world. You are likely to object to this statement by saying that astronomers know and use Poisson, binomial, Pareto (power laws), Weibull, exponential, Laplace (Cauchy), Gamma, and some other distributions.[1] This is true. I can attest that these distributions are referred to in many publications; however, when it comes to obtaining “BEST FIT estimates for the parameters of interest” and “their ERROR (BARS)”, suddenly everything goes back to the Gaussian world.[2]

Borel Cantelli Lemma (from Planet Math): because of the mathematical symbols, I link to it rather than reproduce it here, but any probability textbook has the lemma with proofs and descriptions.

  1. It is a bit disappointing that not many mention the t distribution, even when fewer than 30 observations are available.
  2. To stay out of this Gaussian world, some astronomers rely on Bayesian statistics and explicitly say that it is the only escape, which is sometimes true and sometimes not. Personally, I lean toward the view that Bayesian methods are not always more robust than frequentist methods, contrary to what astronomers’ discussions of robust methods suggest.
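
To make footnote 1 concrete, here is a minimal simulation sketch of what goes wrong when the Gaussian multiplier 1.96 is used on a small sample. Everything in it (the function name, the sample sizes 5 and 100, the number of trials, the seed) is an assumption chosen for illustration, not something taken from the publications discussed above. It draws Gaussian samples with a known true mean of zero and counts how often the nominal 95% interval, mean ± 1.96 s/√n, actually contains zero.

```python
import math
import random
import statistics


def gaussian_interval_coverage(n, trials=10000, seed=1):
    """Fraction of trials in which mean +/- z * s / sqrt(n) covers the true mean (0)."""
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        mean = sum(sample) / n
        # Sample variance with the usual n - 1 denominator.
        variance = sum((x - mean) ** 2 for x in sample) / (n - 1)
        half_width = z * math.sqrt(variance / n)
        if abs(mean) <= half_width:
            hits += 1
    return hits / trials


coverage_small = gaussian_interval_coverage(5)    # few observations
coverage_large = gaussian_interval_coverage(100)  # many observations
print(f"n=5:   {coverage_small:.3f}")
print(f"n=100: {coverage_large:.3f}")
```

With only a handful of observations the nominal 95% interval covers the truth noticeably less often than advertised, which is the gap the t distribution is meant to close; with many observations the two multipliers nearly coincide.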

[SPS] Testing Completeness

It bothers me.

  1. Note that the current sherpa is a beta under ciao 4.0, not under ciao 3.4, and a description of “bayes” from the most recent sherpa is not yet available, which means this post will need updating once a new release is available.

after “Thanks to Henrietta Leavitt”

“Thanks to Henrietta Leavitt”


The CfA is celebrating the 100th anniversary of the discovery of the Cepheid period-luminosity relation on Nov 6, 2008. See for details.

[Update 10/03] For a nice introduction to the story of Henrietta Swan Leavitt, listen to this Perimeter Institute talk by George Johnson:

[Update 11/06] The full program is now available. The symposium begins at Noon today.

Astroart Survey

Astronomy is known for its pretty pictures, but as Joe the Astronomer would say, those pretty pictures don’t make themselves. A lot of thought goes into maximizing scientific content while conveying just the right information, all discernible at a single glance. So the hardworking folks at Chandra want your help in figuring out what works and how well, and they have set up a survey. Take the survey; it is both interesting and challenging!


RMF. It is a wørd to strike terror even into the hearts of the intrepid. It refers to the spread in the measured energy of an incoming photon, and even astronomers often stumble over what it is and what it contains. It essentially sets down the measurement error for registering the energy of a photon in the given instrument.
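
As a rough illustration of that idea, here is a toy sketch in which the RMF is a matrix whose row i gives the probability that a photon with true energy in bin i is registered in each detector channel. The Gaussian spread, the 32 bins, the width of 1.5 channels, and the names `toy_rmf` and `fold` are all assumptions made for this example; a real RMF is a calibration product of the instrument and is typically not a simple Gaussian.

```python
import math


def toy_rmf(n_bins, sigma_channels):
    """Return rmf[i][j]: probability that a photon whose true energy falls in
    bin i is registered in channel j (toy Gaussian spread, assumed here)."""
    rmf = []
    for i in range(n_bins):
        row = [math.exp(-0.5 * ((j - i) / sigma_channels) ** 2) for j in range(n_bins)]
        total = sum(row)
        # Normalize so that every photon ends up in some channel.
        rmf.append([value / total for value in row])
    return rmf


def fold(model, rmf):
    """Spread the photons in each true-energy bin over the measured channels."""
    n_channels = len(rmf[0])
    return [
        sum(model[i] * rmf[i][j] for i in range(len(model)))
        for j in range(n_channels)
    ]


rmf = toy_rmf(n_bins=32, sigma_channels=1.5)

# A perfectly sharp emission line: 1000 photons, all with true energy in bin 10.
line = [0.0] * 32
line[10] = 1000.0

observed = fold(line, rmf)
peak_channel = max(range(32), key=lambda j: observed[j])
print(peak_channel, round(observed[peak_channel], 1), round(sum(observed), 1))
```

The sharp line comes out smeared over neighbouring channels while the total number of photons is preserved, which is the sense in which the RMF records the measurement error on photon energy.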

missing data

“planetariums and other foolishness”

Killer App

The iPhone is an amazing device. I have heard that some people use it as a phone, too, but it really is an extraordinary portable computer. It is faster and more powerful than the Sparcstations I used as a grad student, and will fit into your pocket. And most importantly, you can fit an entire planetarium on it.

The Big Picture

Our hometown rag (the Boston Globe) runs an occasional series of photo collections that highlight news stories called The Big Picture. This week, they take a look at the Sun:

The pictures come from space and ground observatories, from SoHO, TRACE, Hinode, STEREO, etc. Goes without saying, the images are stunning, and some are even animated. The real kicker is that images such as these are being acquired by the hundreds, every hour upon the hour, 24/7/365.25. It is like sipping from a firehose. Nobody can sit there and look at them all, so who knows what we are missing out on. Can statistics help? Can we automate a statistically robust “interestingness” criterion to filter the data stream that humans can then follow up on?

A Quote on Model

In order to understand a learning procedure statistically it is necessary to identify two important aspects: its structural model and its error model. The former is most important since it determines the function space of the approximator, thereby characterizing the class of functions or hypothesis that can be accurately approximated with it. The error model specifies the distribution of random departures of sampled data from the structural model.
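
The distinction in the quote can be seen in a small simulation. In this sketch the structural model is assumed to be a straight line and the error model is assumed to be additive Gaussian noise; both choices, along with the parameter values and variable names, are illustrative assumptions and not part of the quote. The fit recovers the structural part (intercept and slope) and the residuals estimate the error part (the noise scale).

```python
import math
import random

rng = random.Random(0)

# Assumed "truth" for the simulation.
TRUE_A, TRUE_B, TRUE_SIGMA = 1.0, 2.0, 0.5

# Structural model: y = a + b * x.  Error model: Gaussian departures from it.
xs = [rng.uniform(0.0, 10.0) for _ in range(200)]
ys = [TRUE_A + TRUE_B * x + rng.gauss(0.0, TRUE_SIGMA) for x in xs]

# Ordinary least squares in closed form estimates the structural model.
n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
sxx = sum((x - x_mean) ** 2 for x in xs)
sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
b_hat = sxy / sxx
a_hat = y_mean - b_hat * x_mean

# The residuals estimate the scale of the error model.
residuals = [y - (a_hat + b_hat * x) for x, y in zip(xs, ys)]
sigma_hat = math.sqrt(sum(r * r for r in residuals) / (n - 2))

print(f"a_hat={a_hat:.2f}  b_hat={b_hat:.2f}  sigma_hat={sigma_hat:.2f}")
```

If the structural model is wrong (say the truth is curved), no choice of error model rescues the fit, which is one way to read the claim that the structural model matters most.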

