“you are biased, I have an informative prior”

Hyunsook drew attention to this paper (arXiv:0709.4531v1) by Brad Schaefer on the underdispersed measurements of the distance to the LMC. He makes a compelling case that since 2002 published numbers in the literature have been hewing to an “acceptable number”, possibly in an unconscious effort to pass muster with their referees. Essentially, the distribution of the best-fit distances is much more tightly clustered than you would expect from the quoted sizes of the error bars.
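One rough way to see what "underdispersed" means is to compare the scatter of the best-fit values about their weighted mean with the quoted error bars, via a reduced chi-square. The sketch below is only an illustration under simple assumptions (independent Gaussian errors); the function names and the numbers are made up for the example and are not taken from Schaefer's paper or his method.

```python
def weighted_mean(values, sigmas):
    """Inverse-variance weighted mean of the measurements."""
    weights = [1.0 / s**2 for s in sigmas]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def reduced_chi_square(values, sigmas):
    """Chi-square about the weighted mean, divided by (n - 1) degrees of freedom.

    Values near 1 mean the scatter matches the quoted errors;
    values well below 1 indicate underdispersion.
    """
    mu = weighted_mean(values, sigmas)
    chi2 = sum(((v - mu) / s) ** 2 for v, s in zip(values, sigmas))
    return chi2 / (len(values) - 1)


# Hypothetical distance moduli (mag) and quoted 1-sigma errors.
# These are invented for illustration, NOT published measurements.
moduli = [18.48, 18.50, 18.52, 18.49, 18.51]
errors = [0.10, 0.10, 0.10, 0.10, 0.10]

print(reduced_chi_square(moduli, errors))  # far below 1 for these toy numbers
```

If the quoted error bars were honest and the measurements independent, you would expect this statistic to hover around 1; a value far below 1 is the signature being discussed here.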

To be sure, there are other possible reasons for this underdispersion, such as correlations in how the data are gathered and analyzed, or overly conservative estimates of the error bars. In fact, the most benign explanation probably lies in how people carry out “sanity checks” and tend to discard, explain away, or correct data that give odd results.

While this is indeed worrisome, I am inclined to think that this is not wrong per se, but rather a case where a fully Bayesian analysis would give the “right” coverage. After all, people are bringing a strong prior into the analysis, but are not including it when calculating the widths of the posterior probability distributions. Including such a highly informative prior will of course shrink the sizes of the error bars and make everything consistent. That is, I think the assumption needs to be made explicit, that is all. Is that bias? Bandwagon? Or prior belief?
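To make the shrinkage concrete, here is a minimal sketch assuming both the prior and the measurement are Gaussian, in which case the precisions (inverse variances) simply add and the posterior mean is the precision-weighted average. The function name and the numbers are hypothetical, chosen only to illustrate the point; nobody's actual analysis is being reproduced here.

```python
import math


def gaussian_posterior(prior_mean, prior_sigma, meas, meas_sigma):
    """Combine a Gaussian prior with a Gaussian measurement.

    Returns the posterior mean and standard deviation.
    Precisions add, so the posterior is always narrower than either input.
    """
    prior_prec = 1.0 / prior_sigma**2
    meas_prec = 1.0 / meas_sigma**2
    post_prec = prior_prec + meas_prec
    post_mean = (prior_mean * prior_prec + meas * meas_prec) / post_prec
    return post_mean, 1.0 / math.sqrt(post_prec)


# Hypothetical numbers: a strong prior of 18.50 +/- 0.05 mag
# and a new measurement of 18.40 +/- 0.10 mag.
post_mean, post_sigma = gaussian_posterior(18.50, 0.05, 18.40, 0.10)
print(post_mean, post_sigma)
```

With these toy inputs the posterior is pulled most of the way toward the prior and its width is smaller than either the prior width or the measurement error, which is the kind of tight clustering an unstated informative prior would produce.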
