Quote of the Week, Aug 31, 2007

Once again, from the middle of a recent (Aug 30-31, 2007) argument within CHASC, on why physicists and astronomers view “3 sigma” results with suspicion and expect (roughly) more than 5 sigma, while statisticians and biologists typically assume 95% is OK:

David van Dyk (representing statistics culture):

Can’t you look at it again? Collect more data?

Vinay Kashyap (representing astronomy and physics culture):

…I can confidently answer this question: no, alas, we usually cannot look at it again!!

Ah. Hmm. To rephrase [the question]: if you have a “7.5 sigma” feature, with a day-long [imaging Markov Chain Monte Carlo] run you can only show that it is “>3 sigma”, but is it possible, even with that day-long run, to tell that the feature is really at 7.5 sigma? Is that the question? Well, that would be nice, but I don’t understand how observing again would help.

David van Dyk :

No one believes any realistic test is properly calibrated that far into the tail. Using 5-sigma is really just a high bar, but the precise calibration will never be done. (This is a reason not to sweat the computation TOO much.)

Most other scientific areas set the bar lower (2 or 3 sigma) BUT don’t really believe the results unless they are replicated.

My assertion is that I find replicated results more convincing than extreme p-values. And the controversial part: Astronomers should aim for replication rather than worry about 5-sigma.

  1. vlk:

    Well I wouldn’t go so far as to say that astronomers don’t believe in 3-sigma results — that is usually the acceptable threshold for source detection after all! But yes, for anything out of the ordinary (i.e., That-Which-Cannot-Be-Repeated), the bar is set higher.

    But the funny thing is, while gatecrashing the JSM last month, I kept noticing how statisticians seem to be so gung ho on repeatability and replicability — of the results, of the methods, even of the numbers. I almost wrote up a post about this apparent cultural difference, and then had second thoughts, thinking, nah, surely that must be simply anecdotal bias.

    09-02-2007, 1:33 pm
  2. Aneta Siemiginowska:

    Well, in astronomy, unknown calibration or measurement uncertainties make a 3 sigma result not so convincing. There have been many discussions about some 2-3 sigma deviations in high-resolution X-ray spectra, as to whether these are true emission/absorption lines or just unknown calibration features. With our improved knowledge of instruments and better calibrations, some of these 2-3 sigma features have disappeared and become noise…

    09-03-2007, 10:10 pm
  3. vlk:

    I think that part of the reason 3 sigma is not reliable in spectral analysis is the number of bins. In an HETG spectrum there are 8192 bins, so a 3 sigma threshold can leave you with about 10 false positives (well, fewer because of the finite width of the line spread function, but you get the idea). And it only gets worse with 16384-bin LETG spectra.

    09-04-2007, 3:56 pm
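
The bin-counting arithmetic in the last comment can be checked with a rough sketch. It assumes independent bins, Gaussian noise, and a one-sided tail (only upward fluctuations count), none of which is stated in the comment; the function names are hypothetical. As the comment notes, the finite width of the line spread function means the effective number of independent trials is smaller, so these figures are upper-end estimates.

```python
import math

def one_sided_tail(n_sigma):
    """Probability that a standard Gaussian exceeds n_sigma (upper tail only)."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

def expected_false_positives(n_bins, n_sigma):
    """Expected number of bins above threshold by chance alone,
    assuming every bin is an independent Gaussian trial (an assumption)."""
    return n_bins * one_sided_tail(n_sigma)

# Bin counts quoted in the comment: 8192 (HETG) and 16384 (LETG).
for n_bins in (8192, 16384):
    for n_sigma in (3.0, 5.0):
        n_false = expected_false_positives(n_bins, n_sigma)
        print(f"{n_bins} bins at {n_sigma} sigma: {n_false:.4f} expected false positives")
```

Under these assumptions a 3 sigma cut gives roughly 11 chance detections in 8192 bins and roughly 22 in 16384 bins, while a 5 sigma cut gives far fewer than one in either case. A two-sided test would double each count.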