Comments on: An example of chi2 bias in fitting the X-ray spectra. http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/ Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders Fri, 01 Jun 2012 18:47:52 +0000

By: fillh, Thu, 03 Sep 2009 01:52:41 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-907
It’s important to make clear that this is a problem with using approximations to the true chi-square distribution when fitting Poisson-distributed data!
Jading & Riisager (1996, Nucl. Instrum. Methods Phys. Res. A 372, 289) give a very nice analytical calculation of the magnitude of this effect for a particular, very simple problem (the count rate of a lightcurve), using both Pearson’s and Neyman’s approximations to chi-square, and find that asymptotically you get a bias of around 1 count per bin.

A more general analytical discussion of the causes and magnitude of the bias for an arbitrary model is given in Humphrey, Liu & Buote (2009, ApJ 693, 822). Basically, the bias divided by the statistical error should be of order the number of bins divided by the square root of the number of photons. Fits using Pearson’s approximation to chi-square yield a bias of approximately -0.5 times the bias seen with Neyman’s approximation, while the bias in fits using the C-statistic is much smaller than the statistical error (roughly what Aneta’s plot showed).

In summary: don’t use approximations to chi-square; use the C-statistic if you’re fitting Poisson-distributed data!
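
A minimal simulation sketch of the simple case mentioned above (a constant count rate fitted to a binned lightcurve), using only the Python standard library. For a constant model each statistic has a closed-form best fit, so no optimizer is needed: the C-statistic gives the arithmetic mean, Neyman's chi-square (data variance) gives an inverse-count-weighted mean, and Pearson's chi-square (model variance) gives the root mean square of the counts. The function names, the default rate of 20 counts per bin, and the convention of replacing a zero-count variance by 1 are assumptions made for this illustration, not taken from the papers cited.

```python
import math
import random


def poisson_sample(mu, rng):
    """Draw one Poisson(mu) variate with Knuth's multiplication method.

    Adequate for the modest mu used here; exp(-mu) underflows for very large mu.
    """
    limit = math.exp(-mu)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def constant_rate_fits(counts):
    """Best-fit constant rate per bin under three fit statistics."""
    n = len(counts)
    # C-statistic (Poisson maximum likelihood): the arithmetic mean.
    cstat = sum(counts) / n
    # Neyman chi-square: weights 1/y_i (zero-count bins get variance 1,
    # a common convention assumed here).
    weights = [1.0 / max(y, 1) for y in counts]
    neyman = sum(w * y for w, y in zip(weights, counts)) / sum(weights)
    # Pearson chi-square: d/dm of sum((y_i - m)^2 / m) = 0 gives m^2 = mean(y_i^2).
    pearson = math.sqrt(sum(y * y for y in counts) / n)
    return cstat, neyman, pearson


def mean_bias(mu=20.0, n_bins=200, n_trials=200, seed=1):
    """Average (estimate - true rate) over many simulated lightcurves.

    Returns the biases in the order (C-statistic, Neyman, Pearson).
    """
    rng = random.Random(seed)
    totals = [0.0, 0.0, 0.0]
    for _ in range(n_trials):
        counts = [poisson_sample(mu, rng) for _ in range(n_bins)]
        for j, estimate in enumerate(constant_rate_fits(counts)):
            totals[j] += estimate - mu
    return tuple(t / n_trials for t in totals)
```

Calling `mean_bias()` should give a C-statistic bias near zero, a Neyman bias near -1 count per bin, and a Pearson bias near +0.5, in line with the numbers quoted in the comment. In this toy case the statistical error on the rate is sqrt(mu / n_bins), so a bias of about one count divided by that error is n_bins / sqrt(n_bins * mu), that is, the number of bins over the square root of the number of photons.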

By: hlee, Tue, 14 Oct 2008 21:15:10 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-798
Details about this bias can be found in "Parameter Estimation in Astronomy with Poisson-Distribution Data I. The χ_r^2 Statistics" by K.J. Mighell (1999ApJ...518..380; http://adsabs.harvard.edu/abs/1999ApJ...518..380M). I overlooked the references in "Analysis of Energy Spectra with Low Photon Counts via Bayesian Posterior Simulation" by van Dyk et al. (2001ApJ...548..224; http://adsabs.harvard.edu/abs/2001ApJ...548..224V). I wish reference lists in astronomical publications included titles, so that one could infer the significance of each citation.

By: jk, Wed, 21 Nov 2007 08:42:30 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-131
If you are willing to assume a likelihood, the optimal estimator is the one that maximizes the likelihood function, denoted the MLE. It is optimal in terms of efficiency (the asymptotic variance of your estimator). It is not, however, always going to be unbiased. Under mild regularity conditions, though, the MLE is consistent, which is essentially the same as being asymptotically unbiased. If the bias is determined to be an issue, it can sometimes be corrected easily; take, for example, using s^2 to estimate sigma^2 for i.i.d. normal data.
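
The s^2 example can be checked numerically with a short sketch like the one below. For i.i.d. normal data the maximum likelihood estimate of sigma^2 divides the sum of squared deviations by n and is biased low by the factor (n - 1)/n; dividing by n - 1 instead (the usual s^2) removes that bias. The function name and the default sample size, sigma, and seed are arbitrary choices for illustration.

```python
import random


def average_variance_estimates(n=5, sigma=2.0, n_trials=20000, seed=0):
    """Average the MLE (divide by n) and s^2 (divide by n - 1) over many samples."""
    rng = random.Random(seed)
    mle_total = 0.0
    s2_total = 0.0
    for _ in range(n_trials):
        x = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(x) / n
        sum_sq = sum((xi - mean) ** 2 for xi in x)
        mle_total += sum_sq / n        # maximum likelihood estimate, biased
        s2_total += sum_sq / (n - 1)   # bias-corrected estimate
    return mle_total / n_trials, s2_total / n_trials
```

With the defaults (true variance 4, n = 5), the averaged MLE should sit near 4 * (n - 1)/n = 3.2, while the averaged s^2 should sit near 4.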

By: aneta, Thu, 08 Nov 2007 01:10:30 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-129
This is also described in Lupton's book on statistics for astronomers, in the section on ML estimators.

By: hlee, Wed, 07 Nov 2007 16:17:13 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-128
1. The optimal estimator: can I take that as BLUE (best linear unbiased estimator)?
2. Sad and glad that it’s already done. The plot and Loredo’s work should be made more public.

By: pef, Wed, 07 Nov 2007 13:59:46 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-127
Ah.

Tom L: if you’re reading this, you really should’ve published those notes :)

The short version: the optimal estimator for the photon index is the maximum
likelihood estimator. If the data are binned, and the total number of counts over
all bins is not fixed (i.e., is a random variable), then the likelihood function is

\prod_{i=1}^k \frac{ (np_i)^{y_i} }{ y_i! } e^{-np_i}

where y_i is the number of counts in bin i, n = \sum_i y_i, and p_i is the probability
that a count is recorded in bin i (which depends on the distribution
parameter, in this case the photon index).

You can derive the \chi^2 function from this using Stirling’s approximation
(np_i \gtrsim 5 in each bin) and then a Taylor series expansion. Depending on
how you do that expansion, you can derive either \chi^2 with model variance
or \chi^2 with data variance. It’s the combination of Stirling’s approximation
with how you cut off the Taylor expansion that creates the bias term.

Tom Loredo wrote this all up years and years ago, so I take no credit for the
explanation.
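
As a hedged sketch of the expansion outlined in this comment (a reconstruction from the likelihood given above, not Tom Loredo's notes), write m_i = n p_i for the model counts and \delta_i = y_i - m_i for the residual in bin i; both symbols are introduced here only for illustration.

```latex
% Poisson log-likelihood, with m_i = n p_i (notation assumed for this sketch)
\[
-2 \ln L = 2 \sum_{i=1}^{k} \left[ m_i - y_i \ln m_i + \ln y_i! \right]
\]

% Stirling: \ln y_i! \approx y_i \ln y_i - y_i
% (the \tfrac{1}{2}\ln 2\pi y_i term is dropped; it does not depend on the model)
\[
-2 \ln L \approx 2 \sum_{i=1}^{k} \left[ m_i - y_i + y_i \ln \frac{y_i}{m_i} \right]
\]

% Taylor expansion in \delta_i = y_i - m_i about the model value m_i:
\[
2 \left[ m_i - y_i + y_i \ln \frac{y_i}{m_i} \right]
  = \frac{\delta_i^2}{m_i} - \frac{\delta_i^3}{3 m_i^2} + O(\delta_i^4)
\]

% The same quantity expanded about the data value y_i:
\[
2 \left[ m_i - y_i + y_i \ln \frac{y_i}{m_i} \right]
  = \frac{\delta_i^2}{y_i} + \frac{2\,\delta_i^3}{3 y_i^2} + O(\delta_i^4)
\]
```

Truncating the first expansion at second order gives \chi^2 with model variance (Pearson), and truncating the second gives \chi^2 with data variance (Neyman). The discarded cubic terms have opposite signs and coefficients in the ratio -1/2, which is at least consistent with the roughly -0.5 relation between the Pearson and Neyman biases quoted elsewhere in this thread.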

By: hlee, Wed, 07 Nov 2007 01:17:15 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-126
Some sort of expansion, such as a Taylor series, could characterize the bias term. The Cash statistic (the maximum likelihood estimator) is asymptotically unbiased under mild regularity conditions, but I do not think the best fit from the chi-square function is. I guess there are ways to introduce penalized likelihoods, designed for astronomers, that reduce or remove the bias and give unbiased best fits. It will take time, though, to build a connection between physical intuition and mathematical formalism.

By: vlk, Tue, 06 Nov 2007 22:06:55 +0000 http://hea-www.harvard.edu/AstroStat/slog/2007/an-example-of-chi2-bias-in-fitting-the-x-ray-spectra/comment-page-1/#comment-125
Any idea what the primary cause of this bias is? How would one understand it from a physical viewpoint, i.e., how does one build intuition about it?
