Archive for July 2008

keV vs keV [Eqn]

I have noticed that our statistician collaborators are often confused by our units. (Not a surprise; I, too, am constantly confused by our units.) One of the biggest culprits is the unit of energy, [keV], Continue reading ‘keV vs keV [Eqn]’ »

loess and lowess and locfit, oh my

Diab Jerius follows up on LOESS techniques with a very nice summary update and finds LOCFIT to be very useful, but there are still questions about how it deals with measurement errors and combining observations from different experiments:

Continue reading ‘loess and lowess and locfit, oh my’ »

The Banff Challenge [Eqn]

With the LHC coming on line anon, it is appropriate to highlight the Banff Challenge, which was designed as a way to figure out how to place bounds on the mass of the Higgs boson. The equations that were to be solved are quite general, and are in fact the first attempt that I know of where calibration data are directly and explicitly included in the analysis. Continue reading ‘The Banff Challenge [Eqn]’ »

SLAC Summer Institute

A GLAST-related opportunity: A Summer Science Institute at SLAC on Cosmic Accelerators is scheduled for August 4-15 in anticipation of GLAST science, and the co-directors welcome participation by students, postdocs, and researchers (even those with no background in astrophysics). The registration deadline is July 31. Continue reading ‘SLAC Summer Institute’ »

chi-square distribution [Eqn]

The Χ2 distribution plays an incredibly important role in astronomical data analysis, but it is pretty much a black box to most astronomers. How many people know, for instance, that its form is exactly the same as the γ distribution? A Χ2 distribution with ν degrees of freedom is

p(z|ν) = (1/Γ(ν/2)) (1/2)ν/2 zν/2-1 e-z/2 ≡ γ(z;ν/2,1/2) , where z=Χ2.

Continue reading ‘chi-square distribution [Eqn]’ »

Reduced and Processed Data

Hyunsook recently said that she wished that there were “some astronomical data depositories where no data reduction is required but one can apply various statistical analyses to the data in the depository to learn and compare statistical methods”. With the caveat that there really is no such thing (every dataset will require case specific reduction; standard processing and reduction are inadequate in all but the simplest of cases), here is a brief list: Continue reading ‘Reduced and Processed Data’ »

Kaplan-Meier Estimator (Equation of the Week)

The Kaplan-Meier (K-M) estimator is the non-parametric maximum likelihood estimator of the survival probability of items in a sample. “Survival” here is a historical holdover because this method was first developed to estimate patient survival chances in medicine, but in general it can be thought of as a form of cumulative probability. It is of great importance in astronomy because so much of our data are limited and this estimator provides an excellent way to estimate the fraction of objects that may be below (or above) certain flux levels. The application of K-M to astronomy was explored in depth in the mid-80′s by Jurgen Schmitt (1985, ApJ, 293, 178), Feigelson & Nelson (1985, ApJ 293, 192), and Isobe, Feigelson, & Nelson (1986, ApJ 306, 490). [See also Hyunsook's primer.] It has been coded up and is available for use as part of the ASURV package. Continue reading ‘Kaplan-Meier Estimator (Equation of the Week)’ »

Survival Analysis: A Primer

Astronomers confront with various censored and truncated data. Often these types of data are called after famous scientists who generalized them, like Eddington bias. When these censored or truncated data become the subject of study in statistics, instead of naming them, statisticians try to model them so that the uncertainty can be quantified. This area is called survival analysis. If your library has The American Statistician subscription and you are an astronomer handles censored or truncated data sets, this primer would be useful for briefly conceptualizing statistics jargon in survival analysis and for characterizing uncertainties residing in your data. Continue reading ‘Survival Analysis: A Primer’ »

Poisson Likelihood [Equation of the Week]

Astrophysics, especially high-energy astrophysics, is all about counting photons. And this, it is said, naturally leads to all our data being generated by a Poisson process. True enough, but most astronomers don’t know exactly how it works out, so this derivation is for them. Continue reading ‘Poisson Likelihood [Equation of the Week]’ »

A test for global maximum

If getting the first derivative (score function) and the second derivative (empirical Fisher information) of a (pseudo) likelihood function is feasible and checking regularity conditions is viable, a test for global maximum (Li and Jiang, JASA, 1999, Vol. 94, pp. 847-854) seems to be a useful reference for verifying the best fit solution. Continue reading ‘A test for global maximum’ »