The AstroStat Slog » gaussian
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

Poisson vs Gaussian, Part 2
http://hea-www.harvard.edu/AstroStat/slog/2009/poigauss-pdfs/
Fri, 10 Apr 2009, by vlk

Probability density functions are another way of summarizing the consequences of assuming a Gaussian error distribution when the true distribution is Poisson. We can compute the posterior probability of the intensity of a source when some number of counts are observed in a source region and the background is estimated using counts observed in a different region, and then compare it to the equivalent Gaussian.

The figure below (AAS 472.09) compares the pdfs for the Poisson intensity (red curves) and the Gaussian equivalent (black curves) for two cases, where the number of counts in the source region is 50 (top) and 8 (bottom). In both cases a background of 200 counts, collected in an area 40x the source area, is used. The hatched region represents the 68% equal-tailed interval for the Poisson case, and the solid horizontal line is the ±1σ width of the equivalent Gaussian.

Clearly, for small counts, the support of the Poisson distribution is bounded below at zero, but that of the Gaussian is not. This introduces a visibly large bias in the interval coverage as well as in the normalization properties. Even at high counts, the Poisson is skewed such that larger values are slightly more likely to occur by chance than in the Gaussian case. This skew can be quite critical for marginal results.

Figure: Poisson and Gaussian probability densities

No simple IDL code this time, but for reference, the Poisson posterior probability density curves were generated with the PINTofALE routine ppd_src().
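For readers without PINTofALE at hand, here is a minimal IDL sketch of the same kind of calculation. It is not ppd_src(); the grid sizes, the flat priors, and the particular "equivalent" Gaussian (background-subtracted counts with errors propagated in quadrature) are illustrative choices. It computes the posterior density of the source intensity by numerically marginalizing over the background intensity, then overlays the Gaussian approximation:

; posterior p(lamS | nS, nB) for the source intensity lamS, given nS counts in the
; source region and nB counts in a background region covering r times the source area,
; with flat priors and the background intensity marginalized on a grid
nS=8 & nB=200 & r=40.                         ; the low-count case of the figure
lamS=findgen(400)*0.1 & dS=0.1                ; grid of source intensities
lamB=findgen(400)*0.05                        ; grid of background intensities (per source area)
post=dblarr(n_elements(lamS))
lnbkg=nB*alog(r*lamB>1d-10)-r*lamB-lngamma(nB+1d)   ; log Poisson term, background region
for i=0,n_elements(lamS)-1 do begin
  mu=lamS[i]+lamB                             ; expected counts in the source region
  lnsrc=nS*alog(mu>1d-10)-mu-lngamma(nS+1d)   ; log Poisson term, source region
  post[i]=total(exp(lnsrc+lnbkg))             ; sum over lamB == marginalize the background
endfor
post=post/total(post)/dS                      ; normalize to a probability density
shat=nS-nB/r & sig=sqrt(nS+nB/r^2)            ; background-subtracted estimate and its error
gaus=exp(-0.5*((lamS-shat)/sig)^2)/(sig*sqrt(2*!pi))
plot,lamS,post,xtitle='source intensity' & oplot,lamS,gaus,linestyle=2

For the high-count case of the figure, set nS=50 and extend the lamS grid accordingly.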

Poisson vs Gaussian
http://hea-www.harvard.edu/AstroStat/slog/2009/poigauss/
Thu, 09 Apr 2009, by vlk

We astronomers are rather fond of approximating our counting statistics with Gaussian error distributions, and a lot of ink has been spilled justifying and/or denigrating this habit. But just how bad is the approximation anyway?

I ran a simple Monte Carlo based test to compute the expected bias between a Poisson sample and the “equivalent” Gaussian sample. The result is shown in the plot below.

The jagged red line is the fractional expected bias relative to the true intensity. The typical recommendation in high-energy astronomy is to bin up events until there are about 25 or so counts per bin. This leads to an average bias of about 2% in the estimate of the true intensity. The bias drops below 1% for counts >50. The smooth blue line is the reciprocal of the square-root of the intensity, reflecting the width of the Poisson distribution relative to the true intensity, and is given here only for illustrative purposes.

Figure: Poisson-Gaussian bias

Exemplar IDL code that can be used to generate this kind of plot is appended below:

nlam=100L & nsim=20000L                                                     ; intensity grid size and number of simulations per intensity
lam=indgen(nlam)+1 & sct=intarr(nlam,nsim) & scg=sct & dct=fltarr(nlam)     ; true intensities 1..100; note scg inherits the integer type
for i=0L,nlam-1L do sct[i,*]=randomu(seed,nsim,poisson=lam[i])              ; Poisson samples at each intensity
for i=0L,nlam-1L do scg[i,*]=randomn(seed,nsim)*sqrt(lam[i])+lam[i]         ; "equivalent" Gaussian samples, truncated to integer counts on storage
for i=0L,nlam-1L do dct[i]=mean(sct[i,*]-scg[i,*])/lam[i]                   ; fractional expected bias relative to the true intensity
plot,lam,dct,/ylog,yticklen=1,ygridstyle=1                                  ; jagged (red) curve of the figure
oplot,lam,1./sqrt(lam)                                                      ; smooth (blue) curve: Poisson width relative to intensity

Borel Cantelli Lemma for the Gaussian World
http://hea-www.harvard.edu/AstroStat/slog/2008/bcl/
Wed, 03 Dec 2008, by hlee

Almost two years of scrutinizing publications by astronomers have given me the strong impression that astronomers live in the Gaussian world. You are likely to object to this statement by saying that astronomers know and use the Poisson, binomial, Pareto (power law), Weibull, exponential, Laplace (Cauchy), Gamma, and some other distributions.[1] This is true. I see these distributions referred to in many publications; however, when it comes to obtaining "BEST FIT estimates for the parameters of interest" and "their ERROR (BARS)", suddenly everything goes back to the Gaussian world.[2]

Borel–Cantelli Lemma (from PlanetMath): because of the mathematical symbols, only a link was made, but any probability textbook presents the lemma with proofs and discussion.
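For reference (the linked PlanetMath page aside), the lemma in its usual textbook form, stated in the notation used below, is:

$$\textrm{(BCL 1)}\quad \sum_{n=1}^\infty P(A_n) < \infty \;\Longrightarrow\; P(A_n \textrm{ i.o.}) = 0,$$

$$\textrm{(BCL 2)}\quad A_n \textrm{ independent and } \sum_{n=1}^\infty P(A_n) = \infty \;\Longrightarrow\; P(A_n \textrm{ i.o.}) = 1,$$

where P(A_n i.o.) = P(limsup_n A_n) is the probability that infinitely many of the events A_n occur. It is the second, independence-requiring direction that the "grand challenge" below appeals to.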

I believe that I live in the RANDOM world. It is not necessarily always Gaussian, but with large probability it looks Gaussian, thanks to large sample theory. Here is the question: do astronomers believe the Borel–Cantelli Lemma (BCL) holds for their Gaussian world, and is their bottom line in adopting the Gaussian on almost all occasions/experiments/data analyses an attempt to prove this lemma for the Gaussian world? Otherwise, one would want to be more cautious and reason more before chi-square goodness-of-fit methods are adopted. At the least, I think one should not claim that these chi-square methods are statistically rigorous or statistically sophisticated; to me, astronomically rigorous and sophisticated seems adequate, but no one says so. Probably, saying "statistically rigorous" is an effort to avoid self-praise and a helpless attribution to statistics. Truly, the data processing strategies are very elaborate and difficult to understand, and I do not see why, under the name of statistics, astronomers praise their beautiful and complex data sets and their analysis results. Often I stop for a breath to ask why a simple chi-square goodness-of-fit method is claimed to be statistically rigorous, when all I see is the complexity of the data handling that takes place before the data are fed into the chi-square function.

The reason for my request for this one step backward prior to the chi-square method is that the astronomers' Gaussian world is only one part of a multi-distributional universe, in which each world carries non-negative probability measure.[3] Despite its relatively large probability, the Gaussian world is just one realization from the set of distribution families; it is not an almost sure observation. Therefore, there is no need to dive into chi-square fitting methods that intrinsically assume Gaussianity, particularly when one knows the exact data distribution, as with Poisson photon counts.

This ordeal of the chi-square method being called statistically rigorous gives me the impression that astronomers are on a mission to prove a grand challenge by providing as many fitting results as possible based on the Gaussian assumption. This grand challenge amounts to proving the Borel–Cantelli Lemma empirically for the Gaussian world, or, by extension:

Based on the consensus that astronomical experiments and observations (A_i) occur in the Gaussian world and that their number grows rapidly (i=1,…,n where n goes to infinity), with every experiment and observation treated as iid, showing that $$\sum_{i=1}^\infty P(A_i) =\infty$$ would prove the grand challenge that P(A_n i.o.)=1, i.e., that the Gaussian world is almost always to be expected from any experiment/observation.

Collecting as many results as possible based on the chi-square methods would then be a sufficient condition for this lemma. I did not mean to ridicule, but I did exaggerate a bit by saying "the grand challenge." By all means, I am serious, and I would like to know why astronomers are almost obsessed with the chi-square methods and the Gaussian world. I prefer to think plainly that adopting a chi-square method blindly is just a tradition, not a grand challenge to prove P(Gaussian_n i.o.)=1. Luckily, analyzing data in the Gaussian world has not yet confronted us with a catastrophic scientific fallacy. "So, why bother to think about a robust method applicable in any type of distributional world?"

Fortunately, I sometimes see astronomers who are not interested in this grand challenge of proving the Borel–Cantelli Lemma for the Gaussian world. They challenge the traditional chi-square methods with limited resources, lacking proper examples and support. Please, don't get me wrong: although I praise them, I am not asking every astronomer to become one of these outsiders. Statisticians need jobs!!! Nevertheless, a paragraph and a diagnostic plot, i.e. a short discussion justifying the chi-square, would be very much appreciated as a way to convey that the Gaussian world is the right choice for your data analysis.

Lastly, I’d like to raise some questions. “How confident are you that residuals between observations and the model are normally distribution only with a dozen of data points and measurement errors?” “Is the least square fitting is only way to find the best fit for your data analysis?” “When you know the data distribution is skewed, are you willing to use Δ χ2 for estimating σ since it is the only way Numerical Recipe offers to estimate the σ?” I know that people working on their project for many months and years. Making an appointment with folks at the statistical consulting center of your institution and spending an hour or so won’t delay your project. Those consultants may or may not confirm that the strategies of chi-square or least square fitting is the best and convenient way. You may think statistical consulting is wasting time because those consultants do not understand your problems. Yet, your patience will pay off. Either in the Gaussian or non-Gaussian world, you are putting a correct middle stone to build a complete and long lasting tower. You already laid precious corner stones.

  1. It is a bit disappointing that not many mention the t distribution, even when fewer than 30 observations are available.
  2. To stay out of this Gaussian world, some astronomers rely on Bayesian statistics and explicitly say that it is the only escape, which is sometimes true and sometimes not; personally, I lean toward the view that Bayesian methods are not always more robust than frequentist ones, contrary to the way astronomers often discuss robust methods.
  3. This non-negativity is an assumption, neither philosophically nor mathematically proven. My experience tells me that a Poisson world exists, so that P(Poisson world)>0 and therefore P(Gaussian world)<1 in reality.
Mexican Hat [EotW]
http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-mexican-hat/
Wed, 28 May 2008, by vlk

The most widely used tool for detecting sources in X-ray images, especially Chandra data, is the wavelet-based wavdetect, which uses the Mexican Hat (MH) wavelet. Now, the MH is not a very popular choice among wavelet aficionados because it does not form an orthonormal basis set (i.e., scale information is not well separated) and does not have compact support (i.e., the function extends to infinity). So why is it used here?

The short answer is, it has a convenient background subtractor built in, is analytically comprehensible, and uses concepts very familiar to astronomers. The last bit can be seen by appealing to Gaussian smoothing. Astronomers are (or were) used to smoothing images with Gaussians, and in a manner of speaking, all astronomical images already come presmoothed by PSFs (point spread functions) that are nominally approximated by Gaussians. Now, if an image were smoothed by another Gaussian of a slightly larger width, the difference between the two smoothed images should highlight those features which are prominent at the spatial scale of the larger Gaussian. This is the basic rationale behind a wavelet.

So, in the following, G(x,y;σx,σy,xo,yo) is a 2D Gaussian written in a form such that the scaling of the widths and the translation of the function are made obvious. It is defined over the real plane, x,y ∈ R2, with widths σx,σy. The Mexican Hat wavelet MH(x,y;σx,σy,xo,yo) is generated as the difference between two Gaussians of slightly different widths, which essentially boils down to taking partial derivatives of G(σx,σy) with respect to the widths. To be sure, these must really be thought of as operators in which the functions are correlated with a data image, so the derivatives must be carried out inside an integral, but I am skipping all that for the sake of clarity. Also note that the MH is sometimes derived as the second derivative of G(x,y), the spatial derivatives that is.

Figure: the Mexican Hat wavelet

The integral of the MH over R2 vanishes, the positive bump and the negative annulus canceling each other out, so there is no unambiguous way to set its normalization. And finally, the Fourier transform shows which spatial scales (kx, ky are the wavenumbers) are enhanced or filtered during a correlation.
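Since the equation images do not reproduce here, a sketch of the standard isotropic forms may help; this assumes σx = σy ≡ σ and xo = yo = 0, and is not necessarily the exact normalization used in wavdetect:

$$G(x,y;\sigma) = \frac{1}{2\pi\sigma^2}\,e^{-(x^2+y^2)/2\sigma^2},$$

$$MH(x,y;\sigma) \;\propto\; \sigma\,\frac{\partial G}{\partial\sigma} \;=\; \sigma^2\,\nabla^2 G \;=\; \frac{1}{2\pi\sigma^2}\left(\frac{x^2+y^2}{\sigma^2}-2\right)e^{-(x^2+y^2)/2\sigma^2},$$

$$\widetilde{MH}(k_x,k_y;\sigma) \;\propto\; -\,\sigma^2\,(k_x^2+k_y^2)\,e^{-\sigma^2(k_x^2+k_y^2)/2}.$$

The width-derivative form makes the difference-of-Gaussians rationale explicit, since G(σ+δσ) − G(σ) ≈ δσ ∂G/∂σ; the Fourier transform vanishes at zero wavenumber (no net normalization, as noted above) and peaks in magnitude at k = √2/σ, i.e., features on spatial scales comparable to σ are the ones enhanced.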

Books – a boring title
http://hea-www.harvard.edu/AstroStat/slog/2008/books-a-boring-title/
Fri, 25 Jan 2008, by hlee

I have been observing various misconceptions about statistics and the evolution of statistical nomenclature in astronomy, which I believe are attributable to the lack of references in the astronomical community. There are some textbooks designed for junior/senior science and engineering students that are likely unknown to astronomers; in terms of their examples, though, these books are not suitable, to my knowledge. Although I never expect astronomers to study standard graduate (mathematical) statistics textbooks, I do wish astronomers would go beyond Numerical Recipes (W. H. Press, S. A. Teukolsky, W. T. Vetterling, & B. P. Flannery) and Data Reduction and Error Analysis for the Physical Sciences (P. R. Bevington & D. K. Robinson). Here are some good ones written by astronomers, engineers, and statisticians:

The motivation for writing this posting originated with Vinay's recommendation: Practical Statistics for Astronomers (J. V. Wall and C. R. Jenkins), which provides many statistical insights and caveats that astronomers tend to ignore. Without looking at the error distribution and the properties of the data, astronomers jump into chi-square and correlation. Anyone who reads the book will become careful about adopting the statistics of common practice in astronomy, which were developed many decades ago and founded on strong assumptions not compatible with modern data sets. The book addresses many concerns about astronomers that have been growing in my mind and introduces various statistical methods applicable in astronomy.

The viewpoints of astronomers without in-class statistics education, even after reading this book in full, would be different from mine. The book mentions unbiasedness, consistency, closedness, and robustness of statistics, which normally are neither discussed nor proved in astronomy papers. Therefore, those readers may miss the insights, caveats, and between-the-lines content of the book that I care about. To reduce this gap, and for a quick and easy understanding of classical statistics, I recommend The Cartoon Guide to Statistics (Larry Gonick and Woollcott Smith) as a first step. This cartoon book covers the fundamentals of statistics in a fun and friendly manner, and provides everything that rudimentary textbooks offer.

If someone wants to go beyond classical statistics (so-called frequentist statistics) and learn about the popular Bayesian statistics, astronomy professor Phil Gregory's Bayesian Logical Data Analysis for the Physical Sciences is recommended. If one would like to know a little more about the modern statistics of frequentists and Bayesians, All of Statistics (Larry Wasserman) is recommended. I realize that textbooks for non-statistics students are too thick to go through in a short time (the book for senior engineering students at Penn State that I used for teaching was Probability and Statistics for Engineering and the Sciences by Jay L. Devore, 4th and 5th editions, at about 600 pages; the current edition is 736 pages). One of the well-received textbooks for graduate students in electrical engineering is Probability, Random Variables and Stochastic Processes (A. Papoulis & S. U. Pillai). I remember that the book offers a rather less abstract definition of measure and practical examples (personally, its treatment of Hermite polynomials was useful to me).

For casual reading about statistics and its 20th-century history, The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century (D. Salsburg) is quite nice.

Statistics is not just for best-fit analysis and error bars. It is a wonderful telescope that extracts correct information when it is operated carefully, pointed at the right target, and used according to the manual. It gets rid of atmospheric and other blurring factors when statistics is understood properly. It is not a black box, nor magic, as many people think.

The era of treating everything as Gaussian ended decades ago. Because of the central limit theorem and the delta method (a good example is the log-transformation; see the statement below), many statistics asymptotically follow the normal (Gaussian) distribution, but there are various other families of distributions. Because of possible bias in the chi-square method, the error bar cannot guarantee the stated coverage, such as 95%. There are also nonparametric statistics, known for robustness; they may be less efficient than statistics that assume a distribution family, but they do not require a model assumption. Also, Bayesian statistics works wonderfully when correct information on priors, suitable likelihood models, and the computing power for hierarchical models and numerical integration are provided.
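For concreteness, the delta method statement behind the log-transformation example (a standard textbook result, not tied to any particular book above) is:

$$\sqrt{n}\,(\hat\theta_n-\theta) \xrightarrow{d} N(0,\sigma^2) \;\Longrightarrow\; \sqrt{n}\,\bigl(g(\hat\theta_n)-g(\theta)\bigr) \xrightarrow{d} N\bigl(0,\,\sigma^2\,[g'(\theta)]^2\bigr),$$

so with g(θ) = log θ the asymptotic variance becomes σ²/θ²; for Poisson counts with mean θ (so σ² = θ), the log of the estimated intensity is asymptotically Gaussian with variance 1/(nθ), which is one reason the log-transformation "Gaussianizes" counts.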

Before jumping into the chi-square for fitting and testing at the same time, and to prevent introducing bias, exploratory data analysis is required for a better understanding of the data and for choosing a suitable statistic and its assumptions. Exploratory data analysis starts from simple scatter plots and box plots. A little statistical care with the data and a genuine interest in the truth of statistical methods are all I am asking for. I do wish that these books could help realize my wishes.

—————————————————————————-
[1.] Most of the links to the books are to amazon.com, but there is no personal affiliation with the company.

[2.] In addition to the previous posting on chi-square, what is so special about chi square in astronomy, I'd like to mention possible bias in chi-square fitting and testing. It is well known that using the same data set both for fitting, which yields the parameter estimates called in astronomy the best-fit values and error bars, and for testing based on those parameter estimates introduces bias, so that the best fit is biased away from the true parameter value and the error bar does not achieve the intended coverage. See Aneta's an example of chi2 bias in fitting x-ray spectra for an illustration of the problem.

[3.] More book recommendations are welcome.
