Archive for the ‘arXiv’ Category.

[ArXiv] Decision Tree, Aug. 31, 2007

From arxiv/astro-ph:0708.4274v1
Comparison of decision tree methods for finding active objects by Y. Zhao and Y. Zhang

The authors (astronomers) introduced and summarized various decision tree methods (REPTree, Random Tree, Decision Stump, Random Forest, J48, NBTree, and AdTree) to the astronomical community.
Continue reading ‘[ArXiv] Decision Tree, Aug. 31, 2007’ »

[ArXiv] Numerical CMD analysis, Aug. 28th, 2007

From arxiv/astro-ph:0708.3758v1
Numerical Color-Magnitude Diagram Analysis of SDSS Data and Application to the New Milky Way Satellites by J. T. A. de Jong et al.

The authors applied MATCH (Dolphin, 2002[1]; note that the year is corrected) to Sloan Digital Sky Survey (SDSS) data for M13, M15, M92, NGC2419, NGC6229, and Pal14 (well-known globular clusters) and for BooI, BooII, CVnI, CVnII, Com, Her, LeoIV, LeoT, Segu1, UMaI, UMaII, and Wil1 (newly discovered Milky Way satellites), fitting the color-magnitude diagrams (CMDs) of these stellar systems to find the properties of the satellites.
Continue reading ‘[ArXiv] Numerical CMD analysis, Aug. 28th, 2007’ »

  1. Numerical methods of star formation history measurement and applications to seven dwarf spheroidals, Dolphin (2002), MNRAS, 332, p. 91

[ArXiv] Isochrone database, Aug. 20, 2007

From arxiv/astro-ph:0708.1204v3
An Isochrone Database and a Rapid Model for Stellar Population Synthesis by Li and Han

This paper emphasizes the binary population: CMD fitting with the binary population synthesis model outperformed the single-population model. They used the Hurley code (Hurley, Tout, and Pols (2002). Evolution of binary stars and the effect of tides on binary populations, MNRAS, 329(4), p. 897-928). They mentioned that two color-color grids can disentangle the age-metallicity degeneracy via binary stellar populations. They fitted their isochrone database to M67 and NGC 1868 with the gT-grid and concluded that the database of binary stellar populations fitted the color-magnitude diagrams better.
Continue reading ‘[ArXiv] Isochrone database, Aug. 20, 2007’ »

Cross-validation for model selection

One of the most frequently cited papers in model selection would be An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike’s Criterion by M. Stone, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 39, No. 1 (1977), pp. 44-47.
(Akaike’s 1974 paper, introducing the Akaike Information Criterion (AIC), is the most often cited paper on the subject of model selection.)
Continue reading ‘Cross-validation for model selection’ »

An alternative to MCMC?

I think of Markov-Chain Monte Carlo (MCMC) as a kind of directed staggering about, a random walk with a goal. (Sort of like driving in Boston.) It is conceptually simple to grasp as a way to explore the posterior probability distribution of the parameters of interest by sampling only where it is worth sampling from. Thus, a major savings from brute force Monte Carlo, and far more robust than downhill fitting programs. It also gives you the error bar on the parameter for free. What could be better?
Continue reading ‘An alternative to MCMC?’ »
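To make the "random walk with a goal" picture concrete, here is a minimal Metropolis sketch in Python. It is only an illustration under stated assumptions, not anything from the post itself: the toy data, the flat prior, the known Gaussian error `sigma`, the step size, and the burn-in length are all made up for the example. The chain wanders over the mean `mu`, always accepting uphill moves and sometimes accepting downhill ones, and the scatter of the retained samples serves as the error bar.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    # Assumed for illustration: flat prior on mu, Gaussian likelihood with known sigma.
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(data, n_steps=20000, step=0.5, start=0.0, seed=1):
    """Random-walk Metropolis sampler for a single parameter mu."""
    rng = random.Random(seed)
    mu = start
    logp = log_posterior(mu, data)
    chain = []
    for _ in range(n_steps):
        proposal = mu + rng.gauss(0.0, step)  # symmetric random-walk proposal
        logp_new = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            mu, logp = proposal, logp_new
        chain.append(mu)  # a rejected move repeats the current value
    return chain


def summarize(chain, burn_in=2000):
    """Posterior mean and standard deviation after discarding burn-in."""
    kept = chain[burn_in:]
    mean = sum(kept) / len(kept)
    var = sum((m - mean) ** 2 for m in kept) / (len(kept) - 1)
    return mean, math.sqrt(var)


# Hypothetical measurements with unit errors.
data = [4.1, 5.3, 4.8, 5.9, 5.0, 4.6, 5.4, 4.9]
mean, err = summarize(metropolis(data))
print(f"mu = {mean:.2f} +/- {err:.2f}")
```

For this toy problem the posterior is known analytically (a Gaussian centered on the sample mean with width sigma/sqrt(n)), so the chain's output can be checked against it. Real problems have no such shortcut, which is where the sampler earns its keep.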

[ArXiv] Data-Driven Goodness-of-Fit Tests, Aug. 1, 2007

From arxiv/math.st:0708.0169v1
Data-Driven Goodness-of-Fit Tests by M. Langovoy

Goodness-of-fit tests have been essential in astronomy for validating a chosen physical model against observed data, yet the limits of these tests have not been considered carefully when observed data are put into the model to estimate its parameters. I therefore thought this paper would help in reflecting on the different points of view between astronomers’ practice of goodness-of-fit tests and statisticians’ construction of such tests. (Warning: the paper is abstract and theoretical.)
Continue reading ‘[ArXiv] Data-Driven Goodness-of-Fit Tests, Aug. 1, 2007’ »

[ArXiv] Poisson Mixture, Aug. 16, 2007

From arxiv/math.st:0708.2153v1
Estimating the number of classes by Mao and Lindsay

This study could be linked to identifying the number of lines in X-ray count data of a Poisson nature, one of the key interests of astronomers. However, as pointed out by the authors, estimating the number of classes is a difficult statistical problem. I. J. Good[1] said:

I don’t believe it is usually possible to estimate the number of species, but only an appropriate lower bound to that number. This is because there is nearly always a good chance that there are a very large number of extremely rare species.

Continue reading ‘[ArXiv] Poisson Mixture, Aug. 16, 2007’ »

  1. Courtesy of the paper: Estimating the number of species: A review by Bunge and Fitzpatrick (1993), JASA, 88, 364-373.

[ArXiv] Gamma-ray albedo of the moon, Aug. 15, 2007

From arxiv/astro-ph:0705.3856
Gamma-ray albedo of the moon by Moskalenko and Porter

The title sounds very interesting, although as a statistician I do not recognize the significance of albedo spectra. This study was performed with a view to utilizing GLAST and PAMELA, via Monte Carlo simulations (the toolkit for MC was GEANT 8.2) together with EGRET data.

Coverage issues in exponential families

I’ve heard so much about coverage problems from astrostat/phystat groups, without knowing the fundamental reasons (most likely physics). This paper might be of interest to them: Interval Estimation in Exponential Families by Brown, Cai, and DasGupta; Statistica Sinica (2003), 13, pp. 19-49

Abstract summary:
The authors investigated issues in interval estimation of the mean in the exponential family, such as binomial, Poisson, negative binomial, normal, gamma, and a sixth distribution. The poor performance of the Wald interval has been known not only for discrete cases but also for nonnormal continuous cases, with significant negative bias. Their computation suggested that the equal-tailed Jeffreys interval and the likelihood ratio interval are the best alternatives to the Wald interval.
Continue reading ‘Coverage issues in exponential families’ »
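As a small illustration of what "poor coverage" means for the Wald interval, the sketch below computes the exact coverage probability of the nominal 95% Wald interval for a binomial proportion. This is not the paper's computation. The binomial case, the sample size n = 20, and the grid of true proportions are choices made here for the example. No simulation is needed, since the coverage is just the total binomial probability of those outcomes whose interval contains the true p.

```python
import math

Z95 = 1.959964  # 97.5% quantile of the standard normal


def wald_interval(x, n, z=Z95):
    """Wald interval p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half, p_hat + half


def wald_coverage(n, p, z=Z95):
    """Exact probability that the Wald interval contains the true p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = wald_interval(x, n, z)
        if lo <= p <= hi:
            # Binomial probability of observing x successes in n trials.
            total += math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)
    return total


# Hypothetical settings: n = 20 trials and a few true proportions.
for p in (0.05, 0.1, 0.3, 0.5):
    print(f"n=20, p={p}: coverage = {wald_coverage(20, p):.3f}")
```

With small n and p near 0, the interval collapses to a point when x = 0 and so can never cover the truth for that outcome, which is one reason the actual coverage falls below the nominal 95%.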

Astrostatistics: Goodness-of-Fit and All That!

During the International X-ray Summer School, as a project presentation, I tried to explain the inadequate practice of χ^2 statistics in astronomy. If your best fit is biased (any misidentification of a model easily causes such bias), do not use χ^2 statistics to get a 1σ error that is supposed to have a 68% chance of capturing the true parameter.
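A toy simulation can show how badly the 68% claim can fail. The sketch below is an illustration with made-up settings, not the analysis from the presentation: data are generated from a line a + slope*x with known Gaussian errors, but only a constant is fitted. For a constant model with equal errors, the χ^2 minimum is the sample mean and the Δχ^2 = 1 interval is mean ± sigma/sqrt(n). The function counts how often that interval captures the true `a`. All names and numbers (`true_a`, the slope, 25 points, 2000 trials) are assumptions chosen for the example.

```python
import math
import random


def delta_chi2_coverage(slope, n_points=25, sigma=1.0, n_trials=2000, seed=7):
    """Fraction of trials in which the delta-chi^2 = 1 interval of a
    constant-only fit contains the true intercept."""
    rng = random.Random(seed)
    true_a = 2.0  # hypothetical true parameter
    xs = [i / (n_points - 1) for i in range(n_points)]  # x evenly spaced on [0, 1]
    half_width = sigma / math.sqrt(n_points)  # delta-chi^2 = 1 half-width
    hits = 0
    for _ in range(n_trials):
        ys = [true_a + slope * x + rng.gauss(0.0, sigma) for x in xs]
        a_hat = sum(ys) / n_points  # chi^2 minimum for a constant model
        if abs(a_hat - true_a) <= half_width:
            hits += 1
    return hits / n_trials


print("correct model (slope = 0):      ", delta_chi2_coverage(0.0))
print("misspecified model (slope = 1): ", delta_chi2_coverage(1.0))
```

When the slope is zero the fitted model is correct and the interval behaves as advertised. With a nonzero slope the best-fit constant is biased by slope times the mean of x, and the same 1σ recipe captures the true value far less often than 68% of the time.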

Later, I decided to do further investigation on that subject and this paper came along: Astrostatistics: Goodness-of-Fit and All That! by Babu and Feigelson.
Continue reading ‘Astrostatistics: Goodness-of-Fit and All That!’ »

[ArXiv] GRB host galaxies, Aug. 10, 2007

From arxiv/astro-ph:0708.1510v1
Connecting GRBs and galaxies: the probability of chance coincidence by Cobb and Bailyn

Without an optical afterglow, a galaxy within the 2 arc second error region of a GRB x-ray afterglow is identified as a host galaxy; however, confusion can arise because (1) the edge of a galaxy is diffuse, (2) multiple sources could exist within the 2 arc second error region, (3) the distance between the galaxy and the x-ray afterglow is measured in projection, and (4) lensing increases brightness and shifts positions. In this paper, the authors “investigated the fields of 72 GRBs in order to examine the general issue of associations between GRBs and host galaxies.”
Continue reading ‘[ArXiv] GRB host galaxies, Aug. 10, 2007’ »
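For a rough feel of the numbers involved, a common back-of-envelope estimate treats unrelated galaxies as Poisson distributed on the sky, so that the chance of finding at least one within radius r is 1 - exp(-π r² Σ), where Σ is the galaxy surface density. The sketch below is only that simple estimate, not necessarily the method used by the authors, and the surface density of 10 galaxies per square arcminute is a made-up value for illustration.

```python
import math


def chance_coincidence_probability(radius_arcsec, density_per_sq_arcmin):
    """P(at least one unrelated galaxy within the error circle),
    assuming galaxies are Poisson distributed on the sky."""
    area_sq_arcmin = math.pi * (radius_arcsec / 60.0) ** 2
    expected = density_per_sq_arcmin * area_sq_arcmin  # mean count in the circle
    return -math.expm1(-expected)  # 1 - exp(-expected), accurate for small values


# Hypothetical surface density; the 2 arcsec radius is the error region from the text.
p_single = chance_coincidence_probability(2.0, 10.0)
print(f"per-field chance coincidence probability: {p_single:.4f}")
print(f"expected chance coincidences in 72 fields: {72 * p_single:.2f}")
```

Even a small per-field probability adds up over a sample of 72 fields, which is why the question of chance coincidence matters for host galaxy identification.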

[Quote] Changing my mind (again)

From IMS Bulletin Vol. 36(7) p.10, Terence’s Stuff: Changing my mind (again)
Continue reading ‘[Quote] Changing my mind (again)’ »

[Quote] Model Skeptics

From IMS Bulletin Vol. 36(3), p.11, Terence’s Stuff: Model skeptics

[I once quoted an article by Prof. Terry Speed in the IMS Bulletin: Data-Doctors. Reading his columns in the IMS Bulletin gives me an opportunity to reflect on who I am as a statistician and offers some guidance for treating data. Although his ideas do not come from astronomy or astronomical data analysis, I often find that his thoughts and words can be shared with astronomers.]
Continue reading ‘[Quote] Model Skeptics’ »

[ArXiv] Geneva-Copenhagen Survey, July 13, 2007

From arxiv/astro-ph:0707.1891v1
The Geneva-Copenhagen Survey of the Solar neighborhood II. New uvby calibrations and rediscussion of stellar ages, the G dwarf problem, age-metallicity diagram, and heating mechanisms of the disk by Holmberg, Nordstrom, and Andersen

Researchers, including scientists from CHASC, working on color-magnitude diagrams to infer ages, metallicities, temperatures, and other physical quantities of stars and stellar clusters may find this paper useful.
Continue reading ‘[ArXiv] Geneva-Copenhagen Survey, July 13, 2007’ »

[ArXiv] SDSS DR6, July 23, 2007

From arxiv/astro-ph:0707.3413
The Sixth Data Release of the Sloan Digital Sky Survey by … many people …

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at http://www.sdss.org/dr6. Additionally, the Catalog Archive Server (CAS) and the SQL interface for accessing the catalog would be useful to statisticians searching for data. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.
Continue reading ‘[ArXiv] SDSS DR6, July 23, 2007’ »