The AstroStat Slog » test

[SPS] Testing Completeness

hlee — Wed, 19 Nov 2008 05:34:59 +0000

There will be a special session at the 213th AAS meeting on meaning from surveys and population studies (SPS). Until then, it might be useful to pull out some interesting and relevant papers, questions, and challenges as a preliminary to the meeting. I will not simply list astronomical catalogs and surveys, which are literally countless these days, but I will bring up a few, with a description of the catalog, if they changed the way science is performed (the best example, to my knowledge, is SDSS, the Sloan Digital Sky Survey).

The main focus of this series of postings (I’m not sure how many there will be; the [SPS] series may well be terminated after this season) is to introduce some statistical challenges, including data management, that are apt to be spawned by astronomical surveys and population studies. My paper selection criteria are based on the group discussions of the SPS working group during the SAMSI astrostatistics program in 2006 (the group leaders were G. Babu, Director of CASt, and T. Loredo).

Completeness – I. Revised, reviewed and revived by Johnston, Teodoro, and Hendry
MNRAS, 376(4), pp. 1757-176
Abstract (abridged to the first paragraph) We have extended and improved the statistical test recently developed by Rauzy for assessing the completeness in apparent magnitude of magnitude-redshift surveys. Our improved test statistic retains the robust properties – specifically independence of the spatial distribution of galaxies within a survey – of the Tc statistic introduced in Rauzy’s seminal paper, but now accounts for the presence of both a faint and bright apparent magnitude limit. We demonstrate that a failure to include a bright magnitude limit can significantly affect the performance of Rauzy’s Tc statistic. Moreover, we have also introduced a new test statistic, Tv, defined in terms of the cumulative distance distribution of galaxies within a redshift survey. These test statistics represent powerful tools for identifying and characterizing systematic errors in magnitude-redshift data.

One of the authors was an active participant in the SPS working group at SAMSI. The following three quotes are genuinely statistical in content, although the paper was published in MNRAS.

It is straightforward to show from this definition that the random variable η has a uniform distribution on the interval [0,1], and furthermore that η and Z are statistically independent.

If the sample is complete in apparent magnitude, for a given pair of trial magnitude limits, then Tc should be normally distributed with mean zero and variance unity. If, on the other hand, the trial faint (bright) magnitude limit is fainter (brighter) than the true limit, Tc will become systematically negative, due to the systematic departure of the $$\hat{\eta}_i$$ distribution from uniform on the interval [0,1].

If the sample is complete in apparent magnitude, for a given pair of trial magnitude limits, then Tv should be normally distributed with mean zero and variance unity. If, on the other hand, the trial faint (bright) magnitude limit is fainter (brighter) than the true limit, in either case Tv will become systematically negative, due to the systematic departure of the $$\hat{\tau}_i$$ distribution from uniform on the interval [0,1].

Their statistics are used as diagnostic tools: the estimated value of a statistic serves as an indicator of completeness at a given magnitude. Otherwise, asymptotic studies could have been carried out in depth so that people who use these statistics (Tc and Tv) could obtain p-values (for hypothesis testing) and confidence intervals. The authors, however, computed the means and variances and stated that these statistics are standard normal without rigorous proofs. On the other hand, the process of estimating the Tc and Tv statistics is nonparametric, so further statistical inference, such as showing that Tc and Tv are asymptotically normal, can be very challenging unless strong assumptions on (probabilistic) models and/or priors are given. Overall, these statistics are more statistically appealing to me for testing completeness than other ratio-based methods.
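As a rough illustration of the normality claim, the sketch below assumes the idealized case in which the η values are exactly independent and uniform on [0,1], and forms a standardized sum from them. The function name `standardized_uniform_stat` and the toy "incomplete" sample are my own assumptions for illustration; this is not the estimator defined in the paper, whose $$\hat{\eta}_i$$ are estimated from the survey data.

```python
import math
import random

def standardized_uniform_stat(eta):
    """Standardized sum of values assumed i.i.d. uniform on [0, 1].

    Each value has mean 1/2 and variance 1/12, so by the central limit
    theorem the result is approximately standard normal for large samples.
    """
    n = len(eta)
    return sum(e - 0.5 for e in eta) / math.sqrt(n / 12.0)

rng = random.Random(42)

# "Complete" toy sample: eta really is uniform on [0, 1].
complete = [rng.random() for _ in range(2000)]

# "Incomplete" toy sample (an assumption): values above 0.8 are kept only
# half of the time, mimicking a deficit of objects near the faint limit.
incomplete = [e for e in (rng.random() for _ in range(2000))
              if e < 0.8 or rng.random() < 0.5]

t_complete = standardized_uniform_stat(complete)
t_incomplete = standardized_uniform_stat(incomplete)

# Two-sided p-value under the standard normal approximation.
p_complete = math.erfc(abs(t_complete) / math.sqrt(2.0))

print("complete sample:   T =", round(t_complete, 2), " p =", round(p_complete, 3))
print("incomplete sample: T =", round(t_incomplete, 2))
```

In this toy setting the statistic for the thinned sample comes out strongly negative, which is the qualitative behavior the quotes describe. Whether the normal approximation is adequate for the estimated quantities in a real survey is exactly the question that a rigorous asymptotic study would have to answer.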

Testing completeness no longer seems a difficult task, thanks to these statistics, extensive survey catalogs, and a better understanding of populations. However, uncertainties in the k-correction, e-correction, and extinction correction still make these statistics fuzzy and the results difficult to interpret. Changes in the statistics due to these uncertainties are hard to characterize. Furthermore, obtaining good (point) estimators for these correction terms remains an almost unconquered problem.

Besides the completeness testing described in the above paper, I have seen modeling efforts in X-ray astronomy that deal with incompleteness and are basically built on a power law whose slope parameter is an indicator of cosmological models. Unfortunately, incompleteness makes the slope estimation complex, and a lot of effort goes into searching for and estimating a model that reflects this incompleteness in the observations as a function of redshift or magnitude; with a complete data set, the task would reduce to fitting a simple ordinary linear regression model.

I believe that someday incompleteness will be modeled stochastically (parameterized to extract information and to offer good predictions), going beyond testing, and that this will offer a better understanding of the visible universe (visible here is a very broad concept, not limited to what can be seen with the naked human eye). For a long while, (in)completeness has been a concept, a meaningful word, to which mathematical compactness and statistical modeling have never been attached for testing and for understanding uncertainties.

p.s. I have been paying a lot of attention to citation style; in contrast, as you may have noticed, my citations are far from consistent. Two noticeable differences between the citation styles of statistics and astronomy are the abbreviation of journal names and the inclusion of titles. Astronomers’ citations are compact, concise, and the same across astronomical journals; statisticians’ citations, on the contrary, are lengthy, informative (because of the title), and vary across statistical and applied statistics journals. MNRAS reminds me of a paper by a very renowned statistician that cited a paper from MNRAS but spelled it out as Monograph National Royal Astronomical Society. I think you will now be gracious about my citation style.

[disclaimer] I have seen various population studies in astronomy across a broad wavelength range, each of which has different objectives, targets, obstacles, and study designs (even telescopes, detectors, data pipelines, and sampling schemes are different), and (in)completeness studies are designed to reflect those differences. I’m afraid that I’m only reporting a tiny fraction of all efforts related to (in)completeness. Your comments are most welcome. I also look forward to your posts and comments on (in)completeness, volume/magnitude-limited samples, survey studies, upper limits, missing values in surveys, clustering, spatial distribution, large-scale structure, etc. in the near future.

A test for global maximum

hlee — Wed, 02 Jul 2008 02:10:09 +0000

If obtaining the first derivative (score function) and the second derivative (empirical Fisher information) of a (pseudo) log-likelihood function is feasible and checking the regularity conditions is viable, a test for global maximum (Gan and Jiang, JASA, 1999, Vol. 94, pp. 847-854) seems to be a useful reference for verifying the best-fit solution.

I have not seen any way to confirm that the best-fit results from XSPEC or Sherpa are a global maximum without searching the whole parameter space. My limited understanding is that many fitting algorithms do not guarantee a global maximum. By testing that the best-fit solution is the global maximum, so that the resulting error bar can be expected to have its nominal coverage, we could save the effort of searching the whole parameter space.
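The concern about local fitting algorithms can be illustrated with a toy example. The sketch below is not the test cited above, and it does not call XSPEC or Sherpa; it assumes a hypothetical one-parameter log-likelihood `toy_loglike` with two peaks and a plain gradient-ascent routine, and shows that the fitted value depends on the starting point.

```python
import math

def toy_loglike(theta):
    """Hypothetical bimodal log-likelihood: global peak near +2, local peak near -2."""
    return math.log(math.exp(-0.5 * (theta - 2.0) ** 2)
                    + 0.3 * math.exp(-0.5 * (theta + 2.0) ** 2))

def score(theta, h=1e-5):
    """First derivative of the log-likelihood by central finite differences."""
    return (toy_loglike(theta + h) - toy_loglike(theta - h)) / (2.0 * h)

def gradient_ascent(theta0, step=0.1, n_iter=2000):
    """Plain gradient ascent; it climbs to a nearby local maximum only."""
    theta = theta0
    for _ in range(n_iter):
        theta += step * score(theta)
    return theta

# Run the same local optimizer from several starting points.
starts = [-4.0, -1.0, 1.0, 4.0]
fits = [gradient_ascent(s) for s in starts]
best = max(fits, key=toy_loglike)

for s, f in zip(starts, fits):
    print(f"start {s:+.1f} -> theta {f:+.3f}, loglike {toy_loglike(f):.3f}")
print("best of the multi-start runs:", round(best, 3))
```

Starting on the left of the valley between the peaks ends at the lower peak, so a single run can report a best fit that is only a local maximum. Restarting from many points is the brute-force remedy, and its cost grows quickly with the number of parameters, which is why a test based on the score and the information at the fitted point is attractive.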

my first AAS. IV. clustering

hlee — Fri, 20 Jun 2008 03:42:06 +0000

Two attendees, whom I had known before the AAS, asked me whether I could suggest clustering methods relevant to their projects. In the end, we spent quite some time clarifying the term clustering.

Statisticians and astronomers understand clustering differently:
- classification vs. clustering, or supervised learning vs. unsupervised learning: the former term in each pair indicates that the scientist already knows the types of the objects at hand. A photometric data set with an additional column saying star, galaxy, quasar, or unknown is a target for classification or supervised learning. Simply put, classification is finding a rule based on photometric colors that can separate these different types of objects. If there is no such column but scatter plots, or plots after dimension reduction, show grouping patterns, the problem is clustering or unsupervised learning, whose goal is to find hyperplanes that separate these clusters optimally; in other words, the objective of clustering/unsupervised learning is to answer the questions: are there real clusters? If so, how many? Roughly speaking, the presence of an extra column of types is what differentiates classification from clustering.
- physical clustering vs. statistical clustering:
  Cosmologists and the like are interested in clusters/clumps of matter/particles/objects. For astrophysicists, clusters are associated with the spatial evolution of the universe. Inquiries about clustering from astronomers are more likely to concern finding these spatial clumps statistically, which is a subject of stochastic geometry or spatial statistics. On the other hand, statisticians and data analysts like to investigate clusters in a reparameterized multi-dimensional space. The distances computed there do not follow the fundamental laws of physics (gravitation, EM, weak, and strong) but reflect relationships in the multi-dimensional space; for example, in a CM diagram, stars of a kind are grouped together. What the two communities agree on about clustering is that the number of clusters is unknown, so the plethora of classification methods cannot be applied, and that the objective is to seek methodologies for quantifying clusters.
- astronomers’ clustering problems are either statistical classification (close to semi-supervised learning) or spatial statistics:
  The way noisy clusters manifest in the universe, or the quantification of the current matter distribution, leads to the very fundamentals of the birth of the universe, where spatial statistics can be a great partner. In the era of photometric redshifts, various classification techniques enhance the accuracy of prediction.
- astronomers’ testing of the reality of clusters seems limited: Cosmology problems have been tackled as inverse problems. Based on theoretical cosmological models, simulations are performed and the results are transformed into some surrogate parameters. These surrogates are generally represented by smooth curves or straight lines in a plot, where the observations make their debut as points with bidirectional error bars (so-called measurement errors). The cosmological model under test is judged by a simple regression (correlation) or by eyeballing these observed data points. If the observations and a curve from a cosmological model match well in a 2D plot, the given cosmological model is declared confirmed in the conclusion section. Personally, I think this procedure of testing cosmological models to account for clusters in the universe could be developed in a more statistically rigorous fashion instead of matching straight lines.
- challenges to statisticians in astronomy, measurement errors: In (statistical) learning, I believe, there is no standard procedure for incorporating astronomers’ measurement errors into modeling. I think measurement errors are generally ignored because systematic errors are not recognized in statistics. In astronomy, on the other hand, the measurement errors accompanying the data are a crucial piece of information, particularly for verifying the significance of the observations. Often these measurement errors become the denominator in the χ² function, which is treated as following a χ² distribution to obtain best fits and confidence intervals.
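The distinction drawn in the first item of the list, an extra column of types versus no such column, can be sketched with toy data. Everything below is an assumption made for illustration: the one-dimensional "color", the two hypothetical types, the nearest-centroid rule, and the choice of two clusters. Measurement errors are ignored here, which is precisely the shortcoming noted in the last item.

```python
import random

rng = random.Random(0)

# Toy one-dimensional "color" for two hypothetical object types.
stars = [rng.gauss(0.0, 0.3) for _ in range(100)]
quasars = [rng.gauss(2.0, 0.3) for _ in range(100)]
colors = stars + quasars
labels = ["star"] * 100 + ["quasar"] * 100   # the extra column of types

def mean(values):
    return sum(values) / len(values)

# Classification (supervised): the label column is used to learn a rule.
centroids = {t: mean([c for c, lab in zip(colors, labels) if lab == t])
             for t in ("star", "quasar")}

def classify(color):
    """Assign the type whose labeled centroid is closest."""
    return min(centroids, key=lambda t: abs(color - centroids[t]))

# Clustering (unsupervised): the labels are ignored entirely.
def two_means_1d(values, n_iter=50):
    """Lloyd's algorithm for two clusters in one dimension."""
    centers = [min(values), max(values)]
    for _ in range(n_iter):
        groups = [[], []]
        for v in values:
            nearest = 0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
            groups[nearest].append(v)
        # Keep the old center if a group happens to be empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

found_centers = two_means_1d(colors)

print("classify(0.1) ->", classify(0.1))
print("classify(1.9) ->", classify(1.9))
print("cluster centers found without labels:", found_centers)
```

Note that the clustering step is told in advance to look for two groups. It therefore says nothing about whether the clusters are real or how many there are, which are the questions that make clustering harder than classification.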

My personal lessons from these two short discussions at the AAS were the need for more collaboration between statisticians and astronomers to include measurement errors in classification or semi-supervised learning, particularly now that we enjoy a plethora of data sets, and the need to move forward, with better aid from statisticians, in testing/verifying the existence of clusters beyond fitting a straight line.

[ArXiv] 2nd week, Jan. 2008

hlee — Fri, 11 Jan 2008 19:44:44 +0000

It is notable that there is an astronomy paper with AIC, BIC, and Bayesian evidence in its title. The topic of the paper, unsurprisingly, is cosmology, like other astronomy papers that discuss these (statistical) information criteria (I have found only a couple of papers on model selection applied to astronomical data analysis that do not involve the CMB. Note that I exclude the Bayes factor as a model selection tool here).

The paper and other interesting ones are listed below.

[astro-ph:0801.0638]
AIC, BIC, Bayesian evidence and a notion on simplicity of cosmological model M. Szydlowski & A. Kurek
[astro-ph:0801.0642]
Correlation of CMB with large-scale structure: I. ISW Tomography and Cosmological Implications S. Ho et al.
[astro-ph:0801.0780]
The Distance of GRB is Independent from the Redshift F. Song
[astro-ph:0801.1081]
A robust statistical estimation of the basic parameters of single stellar populations. I. Method X. Hernandez and D. Valls–Gabaud
[astro-ph:0801.1106]
A Catalog of Local E+A(post-starburst) Galaxies selected from the Sloan Digital Sky Survey Data Release 5 T. Goto (Carefully built catalogs are wonderful sources for classification/supervised learning, or semi-supervised learning)
[astro-ph:0801.1358]
A test of the Poincare dodecahedral space topology hypothesis with the WMAP CMB data B.S. Lew & B.F. Roukema

In cosmology, the few candidate models to choose from are generally nested: a larger model usually has extra terms compared with the smaller ones. How the penalty for the extra terms is defined leads to different model selection criteria. However, astronomy papers in general never discuss the consistency or statistical optimality of these selection criteria; most likely they offer Monte Carlo simulations and extensive comparisons across those criteria. Nonetheless, my personal thought is that astronomers should be encouraged to explore the field of model selection, to prevent the fallacy of blindly fitting models that might be irrelevant to the information the data set contains. Physics tells us a correct model, but so do the data.
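To make the role of the penalty concrete, here is a minimal sketch of the two most common criteria, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters, n the number of data points, and ln L the maximized log-likelihood. The log-likelihood values, parameter counts, and sample size below are made-up numbers for two hypothetical nested models, not results from any of the papers listed.

```python
import math

def aic(loglike, k):
    """Akaike information criterion: penalty of 2 per free parameter."""
    return 2.0 * k - 2.0 * loglike

def bic(loglike, k, n):
    """Bayesian information criterion: penalty of ln(n) per free parameter."""
    return k * math.log(n) - 2.0 * loglike

n_data = 200  # hypothetical number of data points

# Hypothetical nested models: (maximized log-likelihood, number of parameters).
models = {
    "small": (-310.0, 2),
    "large": (-307.5, 3),
}

for name, (loglike, k) in models.items():
    print(name, "AIC =", aic(loglike, k), "BIC =", round(bic(loglike, k, n_data), 2))

# Smaller values are preferred under both criteria.
best_aic = min(models, key=lambda m: aic(*models[m]))
best_bic = min(models, key=lambda m: bic(*models[m], n_data))
print("AIC prefers:", best_aic, "| BIC prefers:", best_bic)
```

With these particular numbers the extra parameter improves the fit enough to pay the AIC penalty but not the heavier BIC penalty, so the two criteria pick different models. That disagreement is why the consistency and optimality properties of a criterion deserve discussion before one is adopted.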

What is so special about chi square in astronomy?

hlee — Thu, 12 Jul 2007 04:02:39 +0000

Since I started reading arxiv/astro-ph abstracts and a few relevant papers about a month ago, I have very often seen chi-square something used as an optimization or statistical inference tool. Chi-square function, chi-square statistic, and chi-square goodness-of-fit test are terms that serve different data analysis purposes but share the same prefix. As a newbie to statistics, I learned the chi-square distribution and the chi-square test, yet doing statistics with chi-square is considered somewhat obsolete in terms of robust applications to modern data. They are introduced as one of many distributions and statistical tests. Nothing special. In astronomy, however, chi-square has become almost the only method for statistical data analysis. I wonder how such a strong bond between chi-square tactics and the astronomer’s keen mind for data analysis came about.

Beyond this historical question, one more thing that bothers me is the mixing of the chi-square function with the chi-square distribution. The former is not necessarily chi-square distributed, but in practice, once a chi-square function is written down, the parameter within the function is automatically given a confidence interval according to the chi-square distribution with the corresponding degrees of freedom. There is no procedure for checking regularity conditions.
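A small simulation can show why the distinction matters. The sketch below assumes the simplest possible case, a constant model fitted to n points with equal quoted errors, and compares the average of the minimized chi-square function when the quoted σ matches the true scatter with the average when the quoted σ is too small. The function names and the numbers (n = 20, a true scatter of 1.5) are my own choices for illustration.

```python
import random

def chi2_min_constant(y, sigma):
    """Minimum of the chi-square function for a constant model y = mu.

    With equal sigma for all points, the minimizing mu is the sample mean.
    """
    mu_hat = sum(y) / len(y)
    return sum(((yi - mu_hat) / sigma) ** 2 for yi in y)

def mean_chi2_min(true_scatter, quoted_sigma, n=20, n_sim=5000, seed=1):
    """Average minimized chi-square over simulated Gaussian data sets."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sim):
        y = [rng.gauss(0.0, true_scatter) for _ in range(n)]
        total += chi2_min_constant(y, quoted_sigma)
    return total / n_sim

# Quoted errors match the true scatter: the average is close to n - 1 = 19,
# the mean of a chi-square distribution with 19 degrees of freedom.
mean_ok = mean_chi2_min(true_scatter=1.0, quoted_sigma=1.0)

# Quoted errors are too small (assumed 1.0, truly 1.5): the same function
# no longer follows that distribution.
mean_bad = mean_chi2_min(true_scatter=1.5, quoted_sigma=1.0)

print("matched errors:       ", round(mean_ok, 2))
print("underestimated errors:", round(mean_bad, 2))
```

In the second case the function being minimized is exactly the same expression, yet reading a confidence interval off chi-square tables would be misleading. Non-Gaussian errors or a nonlinear model can break the correspondence in the same way, which is what checking the regularity conditions is meant to catch.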

Answers to my question, whether statistical or astronomical, will help correct my knowledge and erase my prejudice. Vinay wrote about chi-square fitting, which certainly gives a better account of my question. Numerical Recipes is another place to follow how chi-square methods are used. I welcome any lessons, advice, and references that would extend my knowledge and give me a better perspective on what chi-square means to astronomers.