Comments on: Did they, or didn’t they?
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/
Weaving together Astronomy + Statistics + Computer Science + Engineering + Instrumentation, far beyond the growing borders

By: vlk (Thu, 22 May 2008 23:24:16 +0000)
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/comment-page-1/#comment-233

Thanks, Tom. I’ve edited the main post a bit. The language of probability theory is very intricate, and one “summarizes” at one’s peril! I’m glad you are keeping us honest.

I have to check the paper carefully, but I think that the 1% number in Roelofs et al. does not refer to offsets of “this magnitude or larger”, but rather to a small range of offsets defined by the error on the measured offset (see their Fig. 3).
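
To make that distinction concrete, here is a minimal Monte Carlo sketch of why the two readings give different numbers. The offset model (2-D Gaussian positional scatter, hence a Rayleigh-distributed offset) and all numerical values other than the 1.3 arcsec offset are illustrative assumptions, not values from Roelofs et al.:

```python
# Sketch of the distinction above: under a toy model of the offset
# distribution, the tail probability P(r >= d) ("this magnitude or larger")
# is not the same number as the probability of landing in a narrow window
# around the measured offset d +/- sigma_d.
# All numbers here are illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(42)

sigma_pos = 0.4   # assumed 1-D astrometric error per axis (arcsec)
d_meas    = 1.3   # measured offset quoted in the post (arcsec)
sigma_d   = 0.2   # assumed error on the measured offset (arcsec)

# Offsets under genuine association: 2-D Gaussian scatter => Rayleigh radius.
dx, dy = rng.normal(0, sigma_pos, size=(2, 1_000_000))
r = np.hypot(dx, dy)

p_tail   = np.mean(r >= d_meas)                    # "this magnitude or larger"
p_window = np.mean(np.abs(r - d_meas) <= sigma_d)  # small range around d

print(f"P(r >= {d_meas}):             {p_tail:.4f}")
print(f"P(|r - {d_meas}| <= {sigma_d}):  {p_window:.4f}")
```

For these assumed errors the two probabilities differ by a factor of a few, which is exactly why it matters which one the paper computed.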

By: TomLoredo (Thu, 22 May 2008 17:07:09 +0000)
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/comment-page-1/#comment-232

Vinay, thanks for the exact quote. I think what they meant to say is, “the probability of finding an offset of this magnitude <em>or larger</em>…” It’s a shame that got by the referees (this kind of detail often does). It does look like a power calculation, in the sense of being based on the <em>alternative</em> of genuine association. Though I know <em>you</em> know it, for visitors I think it’s worth emphasizing that this is not at all the same thing as saying, “the probability that the two locations are the same is ~1%.” The latter is a Bayesian statement (i.e., it reports a probability for a hypothesis about the true locations, instead of reporting the fraction of time with which a procedure would reject a hypothesis). In general there is no simple relationship between such probabilities, though in special cases they may be related.
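
To see how different the two numbers can be for the same data, here is a minimal toy sketch. The source density, positional error, search radius, and the equal prior odds are all assumptions made for illustration (none come from the paper); the point is only that a ~1% chance-alignment probability can coexist with a high posterior probability that the locations coincide:

```python
# Toy illustration of the distinction above, under assumed numbers: a tail
# probability under "chance alignment" vs a Bayesian posterior probability
# that the two locations coincide.  They answer different questions and need
# not be numerically close.
import numpy as np
from scipy import stats

delta = 1.3    # observed offset (arcsec), from the quote
sigma = 0.5    # assumed combined positional error (arcsec)
rho   = 0.002  # assumed surface density of unrelated X-ray sources
               # (sources per square arcsec)

# Frequentist tail probability: chance that an unrelated source falls within
# the observed offset, P(at least one within delta) for a Poisson field.
p_chance = 1.0 - np.exp(-rho * np.pi * delta**2)

# Bayesian posterior for "same location" vs "chance interloper", with equal
# prior odds assumed.  Under association, the 2-D offset r is Rayleigh(sigma);
# under a uniform interloper within a search radius R, r has density 2r/R**2.
R = 10.0                                          # assumed search radius (arcsec)
like_same = stats.rayleigh.pdf(delta, scale=sigma)
like_diff = 2.0 * delta / R**2
post_same = like_same / (like_same + like_diff)   # equal prior odds assumed

print(f"chance-alignment probability: {p_chance:.4f}")   # ~0.01 here
print(f"posterior P(same location):   {post_same:.4f}")  # ~0.87 here
```

With these assumed inputs the chance-alignment probability comes out near 1% while the posterior probability of coincidence is near 90% — two very different answers to two very different questions.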

By: vlk (Thu, 22 May 2008 05:49:36 +0000)
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/comment-page-1/#comment-231

Tom, yes, the interesting part is indeed that the two probabilities are different things — different numbers, differently arrived at, answering different questions. But they are the products of commonly used techniques, and they use essentially the same data. It seems to me that the first one is a significance test (the probability that an unrelated X-ray source can show up nearby by chance), and the second one is a power calculation (the probability that a true association will be flagged as false for the observed separation). I am unsure how to interpret the combination, though. As you say, perhaps a full Bayesian calculation is necessary to make sense of it.

The quote above was a paraphrase, btw. The exact quote is “Extensive simulations of the Chandra data show that the probability of finding an offset of this magnitude is ~1%, equal to the (trial-corrected) probability of a chance alignment with any X-ray source in the field.”
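
For the “power calculation” reading, the relevant number has a closed form under a simple error model. A minimal sketch, assuming 2-D Gaussian positional errors (so the offset under genuine association is Rayleigh); the assumed astrometric error is illustrative, not a value from Roelofs et al.:

```python
# Sketch of the "power"-style reading above: under genuine association, how
# often would positional scatter alone produce an offset as large as the one
# observed?  The error value is an assumption for illustration.
import numpy as np

d_obs     = 1.3    # observed offset (arcsec), from the quote
sigma_pos = 0.43   # assumed 1-D astrometric error per axis (arcsec)

# For 2-D Gaussian errors the offset is Rayleigh, with a closed-form tail:
p_offset_if_associated = np.exp(-d_obs**2 / (2.0 * sigma_pos**2))
print(f"P(offset >= {d_obs} | genuine association) = "
      f"{p_offset_if_associated:.3f}")   # ~0.01 for this assumed error
```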

By: TomLoredo (Thu, 22 May 2008 03:36:31 +0000)
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/comment-page-1/#comment-228

Hyunsook is right that the two probabilities <em>as you quoted them</em> are of different things, and thus the answers to different questions. But I wonder if the quote is accurate (i.e., is it the authors’ inaccurate description of what they calculated, or perhaps a misquote here)? The last one—“the probability that the two locations are the same is ~1%”—is a Bayesian statement. Did they really do a Bayesian calculation? Or did they calculate a significance level of some kind and just incorrectly describe it with Bayesian language? Well, I’ll have to go look at the papers.

I couldn’t resist commenting on it, however, because I came up with a Bayesian approach for assessing directional (and more generally, spatio-temporal) coincidences quite a few years ago (inspired by a GRB problem), and I’ll be using it as a pedagogical example at the CASt summer school in just a few weeks. The exercise compares the behavior of these two quantities (a p-value for a hypothesis test, and the posterior odds for coincidence vs. no coincidence). I’m also waiting (on pins and needles—news should be imminent) to see if an NSF proposal that, in part, seeks to develop MCMC-flavored algorithms for implementing Bayesian coincidence assessment with large data sets will get funded. We’ll see….

Anyway, one of the lessons of the toy computation for CASt is that a p-value can reject the null hypothesis of no true association (i.e., conclude there <em>is</em> an association) where the Bayesian calculation favors the null. The reason is that some data may be rather improbable under the null (thus leading to rejection in a significance test), yet similarly improbable under the alternative (here there is a definite alternative: association); a Bayes factor can thus say that data with a small p-value nevertheless do not significantly favor the alternative. An explicit (and often messy) power calculation might spare the significance-test fans embarrassment, but no one does them. The Bayes factor nicely puts all you need into a single quantity, with the usual “Occam factor” machinery coming into play to help things out.
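
A minimal sketch of that disagreement, under an assumed 1-D toy model (positions measured with Gaussian errors, flat priors over a region of size L); this is not the CASt exercise itself, and every number is an assumption for illustration:

```python
# Sketch of the disagreement described above: data whose p-value rejects the
# "no association" null even though the Bayes factor favors it.  The model
# (1-D positions, flat priors) and all numbers are illustrative assumptions.
import numpy as np
from scipy import stats

sigma = 1.0    # assumed positional error of each measurement
L     = 250.0  # assumed size of the region either source could occupy
delta = 5.0    # observed separation between the two measured positions

# Frequentist: test H0 "unassociated locations" by asking how often two
# independent uniform positions on [0, L] land within delta of each other.
p_value = 2.0 * delta / L - (delta / L) ** 2

# Bayes factor, association vs no association:
#   p(D | same) = Normal(delta | 0, sqrt(2)*sigma) / L  (common mu, flat prior)
#   p(D | diff) = 1 / L**2                              (independent uniforms)
like_same = stats.norm.pdf(delta, loc=0.0, scale=np.sqrt(2.0) * sigma) / L
like_diff = 1.0 / L**2
bayes_factor = like_same / like_diff   # >1 would favor association

print(f"p-value for 'no association':     {p_value:.3f}")       # ~0.04: reject
print(f"Bayes factor (assoc : no assoc):  {bayes_factor:.2f}")  # ~0.14: favor null
```

Here a 5-sigma separation is improbable under the chance-alignment null (p ≈ 0.04), but even more improbable under genuine association, so the Bayes factor comes out around 7:1 <em>for</em> the null — the Occam factor L/σ doing its work.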

By: hlee (Tue, 20 May 2008 21:25:12 +0000)
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/comment-page-1/#comment-227

Not a bona fide and knowledgeable statistician, but one thing I noticed is that there are two different hypotheses to test the same astrophysical discovery. One is <u>the probability of finding one within 1.3 arcsec is tiny, and in fact is around 0.3%</u> and the other is <u>the probability that the two locations are the same is ~1%</u>. The first one focuses on a radius around the given location, and the latter focuses on the coincidence of two events. A brief glance at the post reminds me of Buffon’s needle (a classic problem from stochastic geometry), although I’m not sure if there’s an analogy between the discovery of the supernova and the coverage probability.
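
The first number does have a Buffon-flavored geometric-probability reading: the chance that an unrelated source happens to cover a circle of the matching radius. A minimal sketch, where the source count and field size are assumptions for illustration (not the inputs behind the 0.3% figure in the post):

```python
# Geometric-probability sketch of the first number quoted above: the chance
# that an unrelated source falls within r = 1.3 arcsec of a given position.
# The source count and field area are illustrative assumptions.
import math

r_match   = 1.3          # matching radius (arcsec)
n_sources = 10           # assumed number of unrelated X-ray sources in field
field     = 4.0 * 60**2  # assumed field area: 4 square arcmin, in arcsec^2

# Poisson chance of at least one source inside the matching circle:
expected = n_sources * math.pi * r_match**2 / field
p_chance = 1.0 - math.exp(-expected)
print(f"P(chance match within {r_match} arcsec) = {p_chance:.4f}")  # ~0.004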
