The AstroStat Slog » Power http://hea-www.harvard.edu/AstroStat/slog Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

how to trace? http://hea-www.harvard.edu/AstroStat/slog/2009/how-to-trace/ Thu, 11 Jun 2009 20:52:57 +0000 hlee

I was at the SUSY 09 public lecture given by a Nobel laureate, Frank Wilczek of QCD (quantum chromodynamics) fame. As far as I know, SUSY is the abbreviation of SUperSYmmetry in particle physics. Finding such antimatter (? I’m afraid I read “Angels and Demons” too quickly) would explain the unification of the electromagnetic, weak, and strong forces, and even gravitation, according to the speaker’s graph. I’ll not go into the details of particle physics and the standard model. The reason is too obvious. :) Instead, I’d like to show this image from Wikipedia and to discuss my related questions.
particle_trace
Whenever the LHC (Large Hadron Collider; see several posts on the slog) is publicly advertised, the grand scale of the accelerator (26 km) is the center of attention for these unprecedented controlled experiments in particle physics research. “Controlled” here is in conjunction with factorization in statistical experimental design, to eliminate unknowns and to factor in external components (covariates, for example). By the same token, not the grand scale of the accelerator, but the detector, the controlled/isolated system, and its design for collecting data seem most important to me. Without searching for reports, I want to believe that countless efforts have been put into the detectors and data processors, which seem to be overshadowed by the grand scale of the accelerating tube.

For fun, and in honor of the speaker showing it to the public, you might like to see this YouTube rap again.

As a statistician curious about the detector and the physics driving the designs of such expensive and extreme studies, I was more interested in knowing further about

  • how data are collected and
  • how the study was designed, or what the hypotheses are,

rather than in the scale of the accelerator or the feeling inside the 2 K vacuum tube. The public lecture gave no clue toward even partial answers to these questions. So, I hope some slog readers can help me understand better the following issues spawned by this public lecture. Let me phrase my questions statistically and try to associate them with the image above.

    Uncertainty Principle

The uncertainty principle by physicists is written roughly as follows:

Δ E Δ t > h

where h is Planck’s constant. Instead of energy and time, Δx Δp > h, with location and momentum, is used as well. This principle is more or less related to precision or bias. One cannot measure things with 100% precision. In other words, in measuring quantities from physics, there is no exact unbiased estimator (asymptotic unbiasedness is a different matter). In order to observe a subparticle on a short time scale, the energy must be high. Yet, unless the energy is extremely high, the uncertainty of when the event happens is so large that no one can assign exact numbers to when the event happened. This uncertainty principle is the primary reason for such a large accelerator: particles can gain tremendous energy, and therefore an observer can determine the location and the time of the event (collision, subsequent annihilation, and scattering of subparticles) within the uncertainty allowed by this principle of physics.
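To make the trade-off concrete, here is a small back-of-the-envelope sketch (my own numbers, not anything from the lecture) that evaluates the bound ΔE ≳ ħ/Δt for a few time scales; it only illustrates why probing very short-lived events requires very high energies.

```python
# Order-of-magnitude illustration of the energy-time trade-off:
# to resolve a process of duration dt, the energy spread must satisfy
# dE * dt >~ hbar, i.e. dE >~ hbar / dt.
HBAR_EV_S = 6.582e-16  # reduced Planck constant in eV*s

def min_energy_spread_ev(dt_seconds):
    """Rough lower bound on the energy spread (in eV) for a process lasting dt."""
    return HBAR_EV_S / dt_seconds

for dt in (1e-10, 1e-20, 1e-25):  # seconds
    print(f"dt = {dt:.0e} s  ->  dE >~ {min_energy_spread_ev(dt):.2e} eV")
```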

    What is Uncertainty?

I’ve always had a conflicting notion about uncertainty in statistics and astronomy: the uncertainty from Heisenberg’s uncertainty principle versus the uncertainty from measure theory and the stochastic nature of data. Although the word is the same, the implications are different. The former describes precision, as discussed above, and the latter accuracy (Bevington’s book describes the difference between precision and accuracy, if I recall correctly). When an astronomer has data and computes a best fit and one σ from the chi-square, that σ quantifies the uncertainty/scale of the Gaussian distribution, a model for the residuals that the astronomer has chosen for fitting the data with the model of physics.

When it comes to measurement errors, it is more like discussing precision, not accuracy or the scale parameters of distribution functions (families of distributions). Whether through measurement errors, or through computing uncertainty via chi-square minimization or Bayesian posterior estimation, most procedures for understanding uncertainties in the astronomical literature are based on parametrizing uncertainties. Luckily, we know that Gaussian and Poisson parametrizations work in almost all cases in astronomy. Yet, my understanding is that there is not much distinction between precision and accuracy in astronomical data analysis, and not much awareness of the difference between the uncertainty principle from physics and the uncertainty from the stochastic nature of data. This seems to cause biased or underestimated results. In the jargon of statistics, instead of being overlooked, the issues of model mis-identification and model uncertainty[1] in other disciplines are worth looking into to narrow the gap.

As a statistician, I approach the problem of uncertainty hierarchically. Start from the simplest case, in which sigma is known, and use the given sigma as the ground truth. If the statistics do not support such a condition, then move toward estimating it, and testing whether the errors are homogeneous or heterogeneous, etc., to understand the sampling distribution better and devise statistics accordingly. During this procedure, I’ll add a model for measurement errors. If everything is Gaussian, adding statistical uncertainty terms and measurement error terms works well: an easy convolution of Gaussian distributions (see my why gaussianity?). I might have to ignore some factors in my hierarchical modeling procedure if their contribution is almost none but the hierarchical model becomes too complicated for such a mediocre gain. Instead, it would be easy to follow the rule-of-thumb strategies developed by astronomers with great knowledge and experience. Anyway, if a parametric strategy does not work, I’ll employ nonparametric approaches. Focusing on Bayesian methodology, it is like modeling hierarchically, from parametric likelihoods and informative, subjective priors to nonparametric likelihoods and objective, noninformative priors. Overall, these are efforts to model both the physics and the errors assuming that measurements are taken accurately; multiple measurements and collecting many photons quantify how accurately the best fit is obtained. On the other hand, under the uncertainty principle, intrinsic measurement bias (unknown but bounded) is inevitable. Not statistics but physics tells how precise measurements can be. It is still uncertainty, but of a different kind. I sometimes confront astronomers mixing strategies for calibrating uncertainties of these different origins, and I too get confused and lost.
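As a minimal sketch of the Gaussian convolution step mentioned above (my own toy numbers, not any particular dataset): when an independent Gaussian measurement error is added on top of Gaussian statistical scatter, the variances add, which is what makes this particular level of the hierarchy easy.

```python
import numpy as np

rng = np.random.default_rng(1)

# Sketch of the "convolution of Gaussians" step: if the intrinsic (statistical)
# scatter is N(0, sigma_stat^2) and an independent Gaussian measurement error
# N(0, sigma_meas^2) is added on top, the observed scatter is
# N(0, sigma_stat^2 + sigma_meas^2), i.e. the variances add.
sigma_stat, sigma_meas = 0.8, 0.5   # hypothetical values
truth = 10.0

observed = truth + rng.normal(0, sigma_stat, 100_000) + rng.normal(0, sigma_meas, 100_000)

print("empirical sd :", observed.std())
print("quadrature sd:", np.hypot(sigma_stat, sigma_meas))  # sqrt(0.8^2 + 0.5^2)
```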

I’d like to say that multiple observations (the degrees of freedom in chi-square minimization, and the bins in histograms) are realizations of the coupling of bias and variance (precision and accuracy; measurement errors and the statistical uncertainty in sigma/error bars). The importance of proper parametrization and regularized optimization can never be emphasized enough if one wants the right 68% coverage of the uncertainty in a best fit, instead of simple least squares or chi-square. Statisticians often discuss the mean squared error (see my post [MADS] Law of Total Variance) rather than the error bar to account for the overall uncertainty in a best fit.
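Here is a tiny numerical sketch (with hypothetical bias and scatter values) of why the mean squared error, MSE = bias² + variance, is a more honest summary than the error bar alone: a small error bar can coexist with a large systematic offset.

```python
import numpy as np

rng = np.random.default_rng(2)

# MSE = bias^2 + variance: a tight error bar (small variance) can still hide
# a sizeable systematic offset (bias).
truth = 5.0
bias, sigma = 0.6, 0.3          # hypothetical systematic offset and statistical scatter
estimates = truth + bias + rng.normal(0, sigma, 200_000)

mse = np.mean((estimates - truth) ** 2)
print("MSE (empirical)  :", mse)
print("bias^2 + variance:", bias**2 + sigma**2)
```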

I’m afraid that my words sound like gibberish – I hope that statisticians with a good command of both literal and scientific languages will discuss the uncertainty of physics and of statistics and how it affects choosing statistical methods and drawing statistical inference from (astro)physical data. I’m also afraid that people will continue going for one sigma by feeding the data into the chi-square function and adding speculated systematic errors (say, 15% of the sigma computed from the chi-square minimization) without second thoughts on the implications of uncertainty and on the assumptions behind its quantification methods.

    Identifiability

I wonder how the shot in the above image is taken when protons are colliding. There should be a tremendous number of subparticles generated from the collisions of many protons. Unless there is a single photo frame that captures the traces of all those particles (does the collision happen in a 3D camera chamber? Perhaps they use medical imaging or tomography techniques, but processing-time-wise I doubt its feasibility), I think those traces are reconstructions from multiple cross-sectional shots. My biggest concern was how each line and dot you see in the picture can be associated with a certain particle. Physics and the standard model can tell that their trajectories are distinguishable, depending on their charges, types, and masses, but there are, say, millions of events happening on an extremely short time scale! How certain can one be that this is the trace of a certain particle?

The speaker discussed massive data and uncertainty as another challenge. Many procedures in terms of (statistical) data analysis seem not yet explored, although the theory of physics is very sophisticated and complicated. If physics is a deductive/deterministic science, then statistics is inductive/stochastic. I personally believe that theories can reach the same conclusions from both physical and statistical experiments. I guess now it is time to prove such a thesis with data and statistics, and it starts with identifying particles’ traces and their meta-data.

    image reconstruction

To create an image of many particles as above, given the identifiability issue and the uncertainty in time and space, I wonder how pictures are constructed from each collision. The lecturer used the analogy of a dodecahedron calendar with missing months to convey the feeling of image reconstruction in particle physics. Whenever I see such images of many ray traces and hear promises of what the LHC will deliver, I wonder how they reconstruct those traces after the particle collisions and how they measure the times of events. Thanks to the uncertainty principle and its mediocre scale, there must be tremendous constraints and missing data. How much information is contained in that reconstructed image? How much information loss is inevitable due to those constraints? It would be very interesting to know each step from detectors to images and to find the statistical and information-theoretical challenges.

    massive data processing

Colliding one proton with another seems ideal for discovering the unification theory that supports the standard model, by tracing a relatively small number of individual particles. If so, the picture above could have been simpler than it looks. Unfortunately, that is not the case, and a huge number of protons are sent for collision. I’ve kept hearing about the gigantic size of the data that particle physics experiments create. I wondered how such massive data are processed while the speaker showed a picture of one of the world’s best computing facilities at CERN. Not just automated pipelining, but processing, cleaning, summarizing, and evaluating the data from statistical aspects would require clever algorithms to make the most of those multiple processors.

    hypothesis testing

I still think that quests for searching for particles via the LHC are a classical decision-theoretic hypothesis testing problem: the null hypothesis is no new (unobserved) particle vs. the alternative hypothesis, which contains the model/information of the new particle predicted by theory (SUSY, antimatter, etc.). Statistically speaking, in order to observe such matter, or to reject the null hypothesis comfortably, we need statistically powerful tests, for which the Neyman-Pearson test/construction is often mentioned. One needs to design an experiment that is powerful enough (power here has two connotations: one is physically powerful enough to give protons high energy so that one can observe particles in the brief time and space frame, and the other is statistically powerful, such that if such new particles exist, the test rejects the null hypothesis with decent power and false discovery rate). How to transcribe data and models into a powerful test seems still an open question to physicists. You can check discussions from the links in the PHYSTAT LHC 2007 post.
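As a toy illustration of the statistical sense of “power” (my own sketch, not anything shown in the lecture): in an idealized counting experiment with known expected background b, one rejects the null when the observed count exceeds a threshold set by the desired false-positive rate, and the power is the probability of crossing that threshold when a signal s is really there.

```python
from scipy.stats import poisson

# Toy counting experiment: H0: counts ~ Poisson(b), H1: counts ~ Poisson(b + s).
# The threshold is the smallest count whose tail probability under H0 is below
# alpha; the power is the tail probability of that threshold under H1.
def power_of_counting_test(b, s, alpha=2.87e-7):   # alpha ~ one-sided "5 sigma"
    threshold = int(poisson.isf(alpha, b)) + 1     # reject if observed >= threshold
    size = poisson.sf(threshold - 1, b)            # actual false-positive rate
    power = poisson.sf(threshold - 1, b + s)       # detection probability under H1
    return threshold, size, power

for s in (10, 30, 50):                             # hypothetical signal strengths
    thr, size, pw = power_of_counting_test(b=100, s=s)
    print(f"signal={s:3d}: reject if N >= {thr}, size={size:.2e}, power={pw:.3f}")
```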

    source detection

In the similar context of source detection in astronomy, how do physicists define and segregate the source (the particle of interest, the Higgs, for example) from the background? This is also related to the identifiability of the particles shown in the picture. How can a physicist see a rare event among tons of background events, which form a wide sampling distribution or, in other words, have a huge uncertainty as an ensemble? Also, the source event has its own uncertainty because of the uncertainty principle. How does one form robust thresholding methods? How does one develop Bayesian learning strategies for better detection? Perhaps the underlying (statistical) models are different for particle physics and for astronomy, but the basic idea of how to apply statistical inference seems not much different, given that 1. the background can be dominant, 2. the background is used for the null hypothesis, and 3. the source distribution comprises the alternative distribution. It would be very interesting to collect statistics for source detection and formalize those methods so that consistent source detection results can be achieved by devising statistics that suit the data types.
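For the Bayesian flavor of the detection question, here is a deliberately oversimplified sketch with made-up numbers: two simple hypotheses, background-only versus background-plus-source, with Poisson likelihoods and a prior probability that a source is present in the cell.

```python
from scipy.stats import poisson

# Toy Bayesian detection sketch (made-up numbers): "background only" (rate b)
# vs "background + source" (rate b + s), with prior probability p_src that a
# source is present in the cell.
def posterior_source_probability(n_obs, b, s, p_src=0.01):
    like_bkg = poisson.pmf(n_obs, b)
    like_src = poisson.pmf(n_obs, b + s)
    return p_src * like_src / (p_src * like_src + (1 - p_src) * like_bkg)

for n in (8, 15, 25):
    print(f"N={n:2d}: P(source | data) = {posterior_source_probability(n, b=5.0, s=10.0):.3f}")
```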

    cliche and irony

I’d like to quote two phrases from the public lecture.

finding an atom in a haystack

A cliché for all groups of scientists. I’ve heard “finding a needle in a haystack” so many times because of the new challenges we confront in the information era. On the other hand, replacing “needle” with “atom” was new to me. Unfortunately, my impression is that physicists are not equipped with the tools to do such data mining, whether for a needle or an atom. I wonder what computer scientists can offer them for this more challenging quest to answer the fundamental questions about the universe.

it’s an exciting time to be physicists

The speaker said physicists, but I’ve heard the same sentence from astronomers and statisticians with their own professions in place of physicists. After hearing it too often from various people, I became doubtful, since I cannot feel such excitement imminently. It is like hearing about a change too often before it happens: one cannot feel the real progress of the change. Words always travel faster than actions. Sometimes words can be just empty promises. That’s why I thought it was a cliché and an irony. Perhaps it’s due to the fact that I’m at the intersection of these sets of scientists, not at the center of any one set. Ironically, defining the boundaries is also fuzzy nowadays. Perhaps I’m already excited and afraid of transitioning down to a lower energy level. Anyway, being enthusiastic and living in an exciting time seem to be different matters.

    What will come next?

I haven’t heard news about Phystat 2009; the previous meetings occurred in every odd year of the 21st century. Personally, their meeting agendas and subsequent proceedings were very informative and offered clues to my questions. I hope the next meeting will be held soon.

  1. the notions of model uncertainty among astronomers and statisticians are different. Hopefully, I will have time to talk about it later.
Did they, or didn’t they? http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/ Tue, 20 May 2008 04:10:23 +0000 vlk

Earlier this year, Peter Edmonds showed me a press release that the Chandra folks were, at the time, considering putting out, describing the possible identification of a Type Ia Supernova progenitor. What appeared to be an accreting white dwarf binary system could be discerned in 4-year-old observations, coincident with the location of a supernova that went off in November 2007 (SN2007on). An amazing discovery, but there is a hitch.

And it is a statistical hitch, involving two otherwise highly reliable and oft-used methods giving contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the errors of our ways…

The first time around, Voss & Nelemans (arXiv:0802.2082) looked at how many X-ray sources there were around the candidate progenitor of SN2007on (they also looked at 4 more galaxies that hosted Type Ia SNe and had X-ray data taken prior to the event, but didn’t find any other candidates), and estimated the probability of a chance coincidence with the optical position. When you expect 2.2 X-ray sources/arcmin² near the optical source, the probability of finding one within 1.3 arcsec is tiny, and in fact is around 0.3%. This result has since been reported in Nature.
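For the record, the quoted number is easy to reproduce with a one-line Poisson calculation (a sketch of the arithmetic, not necessarily the authors’ exact procedure): the expected number of unrelated sources within 1.3 arcsec of the optical position is λ = 2.2 · π · (1.3/60)² ≈ 0.003, so the chance of at least one interloper is 1 − exp(−λ) ≈ 0.3%.

```python
import math

# Chance-coincidence arithmetic (a sketch, not necessarily the authors' exact
# method): with 2.2 X-ray sources per square arcminute, how likely is an
# unrelated source to fall within 1.3 arcsec of the optical position?
density = 2.2                  # sources per arcmin^2
radius_arcmin = 1.3 / 60.0     # 1.3 arcsec in arcmin

expected = density * math.pi * radius_arcmin**2
p_chance = 1.0 - math.exp(-expected)   # P(>= 1 unrelated source), Poisson
print(f"expected interlopers = {expected:.4f}, chance coincidence ~ {p_chance:.2%}")
```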

However, Roelofs et al. (arXiv:0802.2097) went about getting better optical positions and doing better bore-sighting; as a result, they measured the X-ray position accurately and also carried out Monte Carlo simulations to estimate the error on the measured location. They concluded that the actual separation, 1.18±0.27 arcsec, is too large to be a chance coincidence given the measurement error in the location. The probability of finding offsets in the observed range, if the two locations are the same, is ~1% [see Tom's clarifying comment below].

Well now, ain’t that a nice pickle?

To recap: there are so few X-ray sources in the vicinity of the supernova that anything close to its optical position cannot be a coincidence, BUT, the measured error in the position of the X-ray source is not copacetic with the optical position. So the question for statisticians now: which argument do you believe? Or is there a way to reconcile these two calculations?

Oh, and just to complicate matters, the X-ray source that was present 4 years ago had disappeared when looked for in December, as one would expect if it was indeed the progenitor. But on the other hand, a lot of things can happen in 4 years, even with astronomical sources, so that doesn’t really confirm a physical link.

tests of fit for the Poisson distribution http://hea-www.harvard.edu/AstroStat/slog/2008/tests-of-fit-for-the-poisson-distribution/ Tue, 29 Apr 2008 06:24:09 +0000 hlee

Skimming arXiv:astro-ph abstracts for almost a year has never offered me an occasion where the fit of the Poisson distribution is tested in different ways; instead, it is taken for granted by plugging the data and (source) model into a (modified) χ2 function. If any doubts about the Poisson distribution occur, the following paper might be useful:

J.J.Spinelli and M.A.Stephens (1997)
Cramer-von Mises tests of fit for the Poisson distribution
Canadian J. Stat. Vol. 25(2), pp. 257-267
Abstract: goodness-of-fit tests based on the Cramer-von Mises statistics are given for the Poisson distribution. Power comparisons show that these statistics, particularly A2, give good overall tests of fit. The statistics A2 will be particularly useful for detecting distributions where the variance is close to the mean, but which are not Poisson.

In addition to the Cramér-von Mises statistics (A2 and W2), the paper introduces the dispersion test D (the so-called χ2 statistic for testing goodness of fit in astronomy; this D statistic is treated as a two-sided test, approximately distributed as a χ2 variable with n-1 degrees of freedom), the Neyman-Barton k-component smooth test Sk, P and T (statistics based on the probability generating function), and the Pearson X2 statistic (the number of cells K is chosen to avoid small expected values, and the statistic is compared to a χ2 variable with K-1 degrees of freedom; I think astronomers call it the modified χ2 test), and compares the powers of these tests. The strategy for computing the powers of the Cramér-von Mises statistics is as follows: there is a parameter γ in the negative binomial distribution which is zero under the null hypothesis (Poisson distribution); setting γ=δ/sqrt(n), the value δ is chosen so that, for a two-sided 0.05 level test, the best test has a power of 0.5[1]. Based on this simulation study, the statistic A2 was empirically as powerful as the best test, more so than the other Cramér-von Mises statistics.
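As a minimal sketch of the simplest of these, the dispersion test can be written in a few lines (my own implementation, not the paper’s code): D = Σ(xi − x̄)²/x̄ is compared, two-sided, to a χ2 distribution with n−1 degrees of freedom to flag over- or under-dispersion relative to the mean.

```python
import numpy as np
from scipy.stats import chi2

# Dispersion (index-of-dispersion) test for "Poisson-ness": D = sum (x_i - xbar)^2 / xbar
# is approximately chi^2 with n-1 degrees of freedom under the Poisson null.
def dispersion_test(x):
    x = np.asarray(x, dtype=float)
    n, xbar = x.size, x.mean()
    D = np.sum((x - xbar) ** 2) / xbar
    cdf = chi2.cdf(D, df=n - 1)
    p_two_sided = 2 * min(cdf, 1 - cdf)
    return D, p_two_sided

rng = np.random.default_rng(0)
print(dispersion_test(rng.poisson(4.0, 200)))               # consistent with Poisson
print(dispersion_test(rng.negative_binomial(5, 0.5, 200)))  # overdispersed alternative
```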

Under the Poisson null hypothesis, the alternatives are overdispersed, underdispersed, and equally dispersed distributions. For the equally dispersed alternative, the Cramér-von Mises statistics have the best power compared to the other statistics. Overall, the Cramér-von Mises statistics have good power against all classes of alternative distributions, and the Pearson X2 statistic performs very poorly for the overdispersed alternative.

Instead of binning for the modified χ2 tests[2], we could adopt A2 or W2 for goodness-of-fit tests. They may already be implemented in software packages, just not widely recognized.
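To make that concrete, here is a rough sketch of how a discrete Cramér-von Mises-type statistic could be used with a parametric bootstrap (my own construction, not code from the paper); since λ is estimated from the data, the null distribution of the statistic is obtained by simulation rather than from tables.

```python
import numpy as np
from scipy.stats import poisson

# Parametric-bootstrap goodness-of-fit sketch for the Poisson distribution,
# using a discrete Cramer-von Mises-type statistic
#   W2 = n * sum_j (F_n(j) - F_hat(j))^2 * p_hat(j)
# with lambda estimated by the sample mean.
def cvm_poisson_stat(x, lam):
    n = x.size
    j = np.arange(int(x.max()) + 10)               # truncate the infinite support
    F_hat, p_hat = poisson.cdf(j, lam), poisson.pmf(j, lam)
    F_n = np.searchsorted(np.sort(x), j, side="right") / n   # empirical CDF at j
    return n * np.sum((F_n - F_hat) ** 2 * p_hat)

def cvm_poisson_test(x, n_boot=2000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    w2_obs = cvm_poisson_stat(x, x.mean())
    w2_boot = [cvm_poisson_stat(xb, xb.mean())
               for xb in (rng.poisson(x.mean(), x.size) for _ in range(n_boot))]
    p_value = np.mean(np.array(w2_boot) >= w2_obs)
    return w2_obs, p_value

rng = np.random.default_rng(1)
print(cvm_poisson_test(rng.poisson(3.0, 300)))               # should not reject
print(cvm_poisson_test(rng.negative_binomial(3, 0.5, 300)))  # overdispersed, small p
```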

  1. The locally most powerful unbiased test is the statistic D (Potthoff and Whittinghill, 1966)
  2. the authors’ examples show high significance levels (large p-values) compared to other tests; in other words, the χ2 statistics – the dispersion test statistic D and the Pearson X2 – are too insensitive to provide evidence that the source model does not produce Poisson photon count data
The power of wavdetect http://hea-www.harvard.edu/AstroStat/slog/2007/the-power-of-wavdetect/ Sun, 21 Oct 2007 19:59:31 +0000 vlk

wavdetect is a wavelet-based source detection algorithm in wide use in X-ray data analysis, in particular to find sources in Chandra images. It came out of the Chicago “Beta Site” of the AXAF Science Center (what the CXC used to be called before launch). Despite the fancy name, the complicated mathematics, and the devilish details, it is really not much more than a generalization of the earlier local cell detect, where a local background is estimated around a putative source and the question is asked: is whatever signal is being seen in this pixel significantly higher than expected? However, unlike previous methods that used a flux measurement as the criterion for detection (e.g., using the signal-to-noise ratio as a proxy for a significance threshold), it tests the hypothesis that the observed signal could be obtained as a fluctuation from the background.
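A stripped-down version of that hypothesis test, leaving out all of the wavelet machinery (this is not the wavdetect implementation, just the underlying idea): given a local background estimate b for a detection cell, compute the Poisson tail probability of seeing at least the observed counts and compare it to a significance threshold.

```python
from scipy.stats import poisson

# The detection question in its simplest form: given a local background
# estimate b for a cell, is the observed count n a plausible background
# fluctuation?
def background_fluctuation_pvalue(n_obs, b):
    """P(N >= n_obs) for N ~ Poisson(b)."""
    return poisson.sf(n_obs - 1, b)

b = 2.3                      # hypothetical local background counts in the cell
for n in (4, 8, 12):
    print(f"n={n:2d}: P(>= n | background) = {background_fluctuation_pvalue(n, b):.2e}")
```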

Because wavdetect is an extremely complex algorithm, it is not possible to determine its power easily, and large numbers of simulations must be done. In December 2001 (has it been that long?!), the ChaMP folks held a mini workshop, and I reported on my investigations into the efficiency of wavdetect, calculating the probability of detecting a source in various circumstances. I had been meaning to get it published, but this is not really publishable by itself, and furthermore, wavdetect has evolved sufficiently since then that the actual numbers are obsolete. So here, strictly for entertainment, are the slides from that long ago talk. As Xiao-li puts it, “nothing will go waste!”

2001dec6-0.jpg
(click on the image to go to the pdf)

[ArXiv] 2nd week, Oct. 2007 http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-2nd-week-oct-2007/ Fri, 12 Oct 2007 20:00:40 +0000 hlee

Frankly, there was no astrostatistically interesting paper from astro-ph this week, but profitable papers from the statistics side were posted.

In terms of detecting sources, my personal thought is that the topics of size, power, false discovery rates, and multiple testing ask for astronomers’ attention. To my understanding, observing power has increased, but source detection methodology still cites tools from two to three decades ago, when assuming normality, iid data, or homogeneity of data and instruments was acceptable.
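Since multiple testing arises as soon as thousands of candidate detections are screened at once, here is a generic sketch of the Benjamini-Hochberg procedure for controlling the false discovery rate (simulated p-values only, not tied to any particular survey or instrument).

```python
import numpy as np

# Benjamini-Hochberg FDR control: sort the p-values, find the largest k with
# p_(k) <= (k/m) * q, and reject the k smallest p-values.
def benjamini_hochberg(p_values, q=0.05):
    p = np.asarray(p_values)
    m = p.size
    order = np.argsort(p)
    thresholds = q * (np.arange(1, m + 1) / m)
    below = p[order] <= thresholds
    k = np.max(np.nonzero(below)[0]) + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

rng = np.random.default_rng(3)
p_null = rng.uniform(size=950)          # background-only cells
p_alt = rng.beta(0.5, 20, size=50)      # cells with a real source (small p-values)
print("detections:", benjamini_hochberg(np.concatenate([p_null, p_alt]), q=0.05).sum())
```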
