Archive for the ‘Bad AstroStat’ Category.

you are biased, I have an informative prior”

Hyunsook drew attention to this paper (arXiv:0709.4531v1) by Brad Schaefer on the underdispersed measurements of the distances to LMC. He makes a compelling case that since 2002 published numbers in the literature have been hewing to an “acceptable number”, possibly in an unconscious effort to pass muster with their referees. Essentially, the distribution of the best-fit distances are much more closely clustered than you would expect from the quoted sizes of the error bars. Continue reading ‘“you are biased, I have an informative prior”’ »

All your bias are belong to us

Leccardi & Molendi (2007) have a paper in A&A (astro-ph/0705.4199) discussing the biases in parameter estimation when spectral fitting is confronted with low counts data. Not surprisingly, they find that the bias is higher for lower counts, for standard chisq compared to C-stat, for grouped data compared to ungrouped. Peter Freeman talked about something like this at the 2003 X-ray Astronomy School at Wallops Island (pdf1, pdf2), and no doubt part of the problem also has to do with the (un)reliability of the fitting process when the chisq surface gets complicated.

Anyway, they propose an empirical method to reduce the bias by computing the probability distribution functions (pdfs) for various simulations, and then averaging the pdfs in groups of 3. Seems to work, for reasons that escape me completely.

[Update: links to Peter's slides corrected]

On the unreliability of fitting

Despite some recent significant advances in Statistics and its applications to Astronomy (Cash 1976, Cash 1979, Gehrels 1984, Schmitt 1985, Isobe et al. 1986, van Dyk et al. 2001, Protassov et al. 2002, etc.), there still exist numerous problems and limitations in the standard statistical methodologies that are routinely applied to astrophysical data. For instance, the basic algorithms used in non-linear curve-fitting in spectra and images have remained unchanged since the 1960′s: the downhill simplex method of Nelder & Mead (1965) modified by Powell, and methods of steepest descent exemplified by Levenberg-Marquardt (Marquardt 1963). All non-linear curve-fitting programs currently in general use (Sherpa, XSPEC, MPFIT, PINTofALE, etc.) with the exception of Monte Carlo and MCMC methods are implementations based on these algorithms and thus share their limitations.
Continue reading ‘On the unreliability of fitting’ »