The AstroStat Slog » Banff Challenge http://hea-www.harvard.edu/AstroStat/slog Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders Fri, 09 Sep 2011 17:05:33 +0000

The Banff Challenge [Eqn] http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-banff-challenge/ Wed, 23 Jul 2008 17:00:48 +0000 vlk

With the LHC coming on line anon, it is appropriate to highlight the Banff Challenge, which was designed as a way to figure out how to place bounds on the mass of the Higgs boson. The equations that were to be solved are quite general, and are in fact the first attempt that I know of where calibration data are directly and explicitly included in the analysis.

The observables are counts N, Y, and Z, with

N ~ Pois(ε λS + λB),
Y ~ Pois(ρ λB),
Z ~ Pois(ε υ),

where λS is the parameter of interest (in this case, the mass of the Higgs boson, but it could equally be the intensity of a source), λB is the parameter that describes the background, ε is the efficiency, or the effective area, of the detector, υ is the known intensity of a calibrator source, and ρ is the factor that scales the background λB in the auxiliary measurement Y.
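The three-count model above can be written down directly as a joint log-likelihood. The following is a minimal sketch, not code from the Challenge itself: the function names, the example counts, and the values of ρ and υ are all hypothetical, and ρ is assumed to be a known constant like υ.

```python
import math


def poisson_logpmf(k, mu):
    """Log probability of k counts from a Poisson distribution with mean mu > 0."""
    return k * math.log(mu) - mu - math.lgamma(k + 1)


def banff_loglike(lam_s, lam_b, eps, n, y, z, rho, upsilon):
    """Joint log-likelihood of the three independent counts (N, Y, Z)."""
    return (
        poisson_logpmf(n, eps * lam_s + lam_b)   # main measurement: signal + background
        + poisson_logpmf(y, rho * lam_b)         # auxiliary background measurement
        + poisson_logpmf(z, eps * upsilon)       # calibration measurement
    )


# Hypothetical counts and constants, chosen only for illustration.
n, y, z = 12, 15, 40
rho, upsilon = 3.0, 50.0

with_signal = banff_loglike(8.75, 5.0, 0.8, n, y, z, rho, upsilon)
no_signal = banff_loglike(0.0, 5.0, 0.8, n, y, z, rho, upsilon)
print(f"log L with signal: {with_signal:.3f}, without: {no_signal:.3f}")
```

In this made-up example the parameter values (8.75, 5, 0.8) reproduce the three counts exactly as Poisson means, so no other parameter values can give a higher likelihood for these data.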

The challenge was (is) to infer the maximum likelihood estimate of, and the bounds on, λS given the observed data {N, Y, Z}. In other words, to compute

p(λS|N,Y,Z) .

It may look like an easy problem, but it isn’t!
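To make the task concrete, here is a rough sketch of one naive way to attack it: a closed-form maximum likelihood estimate of λS, and a brute-force grid approximation to p(λS|N,Y,Z). Everything beyond the model itself is an assumption made for illustration: flat priors on bounded ranges, ρ and υ treated as known, hypothetical counts, and hypothetical function names. It is not the method used by any Challenge entrant, and the choice of priors and ranges is exactly where the real difficulty lies.

```python
import math


def mle_lam_s(n, y, z, rho, upsilon):
    """Maximum likelihood estimate of lam_s, assuming z > 0.

    Three counts and three parameters: the fit is exact when
    lam_b = y/rho and eps = z/upsilon, unless that pushes lam_s below zero,
    in which case the estimate sits on the boundary lam_s = 0.
    """
    return max(0.0, (n - y / rho) * upsilon / z)


def log_like(lam_s, lam_b, eps, n, y, z, rho, upsilon):
    """Joint Poisson log-likelihood, without the parameter-free factorial terms."""
    mu_n, mu_y, mu_z = eps * lam_s + lam_b, rho * lam_b, eps * upsilon
    return (n * math.log(mu_n) - mu_n
            + y * math.log(mu_y) - mu_y
            + z * math.log(mu_z) - mu_z)


def midpoints(lo, hi, m):
    """Centres of m equal-width cells covering (lo, hi)."""
    step = (hi - lo) / m
    return [lo + (i + 0.5) * step for i in range(m)]


def marginal_posterior(n, y, z, rho, upsilon, s_max, b_max, e_max, m_s=60, m_nuis=40):
    """Grid approximation to p(lam_s | N, Y, Z) under flat, bounded priors."""
    s_grid = midpoints(0.0, s_max, m_s)
    b_grid = midpoints(0.0, b_max, m_nuis)
    e_grid = midpoints(0.0, e_max, m_nuis)
    rows = [[log_like(s, b, e, n, y, z, rho, upsilon) for b in b_grid for e in e_grid]
            for s in s_grid]
    peak = max(max(row) for row in rows)  # subtract before exp to avoid overflow
    weights = [sum(math.exp(v - peak) for v in row) for row in rows]  # sum out lam_b, eps
    ds = s_max / m_s
    norm = sum(weights) * ds
    return s_grid, [w / norm for w in weights]


def upper_bound(s_grid, density, level=0.9):
    """Smallest cell edge below which the posterior mass reaches `level`."""
    ds = s_grid[1] - s_grid[0]
    cumulative = 0.0
    for s, d in zip(s_grid, density):
        cumulative += d * ds
        if cumulative >= level:
            return s + 0.5 * ds
    return s_grid[-1] + 0.5 * ds


# Hypothetical data, constants, and prior ranges.
s_grid, density = marginal_posterior(12, 15, 40, rho=3.0, upsilon=50.0,
                                     s_max=40.0, b_max=15.0, e_max=2.0)
bound = upper_bound(s_grid, density)
print("MLE of lam_s:", mle_lam_s(12, 15, 40, 3.0, 50.0))
print("90% upper bound on lam_s:", round(bound, 2))
```

The grid is coarse on purpose so that the pure-Python loops stay fast; the resulting bound depends on the assumed prior ranges, so it should be read as an illustration rather than an answer.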

[ArXiv] 3rd week, Dec. 2007 http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-3rd-week-dec-2007/ Fri, 21 Dec 2007 18:40:09 +0000 hlee

The paper about the Banff challenge [0712.2708] and the statistics tutorial for cosmologists [0712.3028] are my personal recommendations from this week’s [arXiv] list. In particular, I’d like to quote from Licia Verde’s [astro-ph:0712.3028]:

In general, Cosmologists are Bayesians and High Energy Physicists are Frequentists.

I thought it was the opposite. By the way, if you crave more papers, here is the rest of the list:

  • [astro-ph:0712.2544]
    RHESSI Microflare Statistics II. X-ray Imaging, Spectroscopy & Energy Distributions I. G. Hannah et al.

  • [stat.AP:0712.2708]
    The Banff Challenge: Statistical Detection of a Noisy Signal A. C. Davison & N. Sartori

  • [astro-ph:0712.2898]
    A study of supervised classification of Hipparcos variable stars using PCA and Support Vector Machines P.G. Willemsen & L. Eyer

  • [astro-ph:0712.2961]
    The frequency distribution of the height above the Galactic plane for the novae M. Burlak

  • [astro-ph:0712.3028]
    A practical guide to Basic Statistical Techniques for Data Analysis in Cosmology L. Verde

  • [astro-ph:0712.3049]
    ZOBOV: a parameter-free void-finding algorithm M. C. Neyrinck

  • [stat.CO:0712.3056]
    Gibbs Sampling for a Bayesian Hierarchical Version of the General Linear Mixed Model A. A. Johnson & G. L. Jones

When you observed zero counts, you didn’t not observe any counts http://hea-www.harvard.edu/AstroStat/slog/2007/zero-counts/ Mon, 24 Sep 2007 00:28:15 +0000 vlk

Dong-Woo, who has been playing with BEHR, noticed that the confidence bounds quoted on the source intensities seem to be unchanged when the source counts are zero, regardless of what the background counts are set to. That is, p(s|NS,NB) is invariant when NS=0, for any value of NB. This seems a bit odd, because [naively] one expects that as NB increases, it ought to become more and more likely that s is close to 0.

Suppose you compute the posterior probability distribution of the intensity of a source, s, when the data include counts in a source region (NS) and counts in a background region (NB). When NS=0, i.e., no counts are observed in the source region,

p(s|NS=0, NB) = (1+b)^a / Γ(a) * s^(a-1) * e^(-s(1+b)),

where a and b are the shape and rate parameters of the gamma prior on s.
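For readers who want to see where this expression comes from, here is a short derivation under assumptions that are not spelled out in the post: the source region collects N_S ~ Pois(s + θ), the background region collects N_B ~ Pois(r θ) with a known scaling r, and the background intensity θ has some prior π(θ) that is independent of s. The symbols θ, r, and π are introduced here only for illustration.

```latex
\begin{align*}
p(s,\theta \mid N_S, N_B)
  &\propto (s+\theta)^{N_S} e^{-(s+\theta)} \, (r\theta)^{N_B} e^{-r\theta}
     \; s^{a-1} e^{-bs} \, \pi(\theta) \\
p(s,\theta \mid N_S = 0, N_B)
  &\propto \left[ s^{a-1} e^{-(1+b)s} \right]
     \left[ \theta^{N_B} e^{-(1+r)\theta} \, \pi(\theta) \right] \\
p(s \mid N_S = 0, N_B)
  &= \frac{s^{a-1} e^{-(1+b)s}}{\int_0^\infty t^{a-1} e^{-(1+b)t} \, dt}
   = \frac{(1+b)^a}{\Gamma(a)} \, s^{a-1} e^{-(1+b)s}
\end{align*}
```

With N_S = 0 the joint posterior splits into a factor in s alone and a factor in θ alone, so integrating over θ only contributes a constant, and normalizing over s gives a gamma density with shape a and rate 1+b.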

Why does NB have no effect? Because when you have zero counts, the entire effect of the background goes into evaluating how good the chosen model is (so it becomes a model comparison problem, not a parameter estimation one), that is, into the normalization factor of the probability distribution, p(NS,NB), and not into estimating the parameter of interest, the source intensity. The parts that depend on NB cancel out when the expression for p(s|NS,NB) is written out, because the shape is independent of NB and the pdf must integrate to 1.
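The cancellation can also be checked numerically. The sketch below is not BEHR; it assumes a simplified model in which the source region collects NS ~ Pois(s + θ), the background region collects NB ~ Pois(r θ), θ has a flat prior on a bounded grid, and s has a gamma prior with shape a and rate b. The names θ and r, and all numerical values, are hypothetical.

```python
import math


def midpoints(lo, hi, m):
    """Centres of m equal-width cells covering (lo, hi)."""
    step = (hi - lo) / m
    return [lo + (i + 0.5) * step for i in range(m)]


def posterior_s(ns, nb, a, b, r, s_grid, theta_grid):
    """Marginal posterior density of s on s_grid, summing theta out on a grid."""
    weights = []
    for s in s_grid:
        total = 0.0
        for th in theta_grid:
            log_term = (ns * math.log(s + th) - (s + th)   # source-region counts
                        + nb * math.log(r * th) - r * th   # background-region counts
                        + (a - 1) * math.log(s) - b * s)   # gamma prior on s
            total += math.exp(log_term)
        weights.append(total)
    ds = s_grid[1] - s_grid[0]
    norm = sum(weights) * ds
    return [w / norm for w in weights]


def gamma_pdf(s, shape, rate):
    """Gamma density with the given shape and rate, evaluated at s > 0."""
    return math.exp(shape * math.log(rate) - math.lgamma(shape)
                    + (shape - 1) * math.log(s) - rate * s)


a, b, r = 2.0, 1.0, 5.0                  # hypothetical prior parameters and scaling
s_grid = midpoints(0.0, 10.0, 200)
theta_grid = midpoints(0.0, 30.0, 200)

zero_nb0 = posterior_s(0, 0, a, b, r, s_grid, theta_grid)
zero_nb50 = posterior_s(0, 50, a, b, r, s_grid, theta_grid)
closed_form = [gamma_pdf(s, a, 1.0 + b) for s in s_grid]
three_nb0 = posterior_s(3, 0, a, b, r, s_grid, theta_grid)
three_nb50 = posterior_s(3, 50, a, b, r, s_grid, theta_grid)


def max_diff(p, q):
    return max(abs(x - y) for x, y in zip(p, q))


print("NS=0, NB=0 vs NB=50:", max_diff(zero_nb0, zero_nb50))
print("NS=0 vs closed form: ", max_diff(zero_nb0, closed_form))
print("NS=3, NB=0 vs NB=50:", max_diff(three_nb0, three_nb50))
```

The first two differences should come out at the level of rounding and grid error, while the third should be clearly nonzero: once even a few counts land in the source region, the background measurement starts to matter for s.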

No doubt this is obvious, but I hadn’t noticed it before.

PS: This also shows why upper limits should not be identified with upper confidence bounds.
