Archive for the ‘Fitting’ Category.

Signal Processing and Bootstrap

Astronomers have developed their own ways of processing signals, largely independently of engineers though sometimes in collaboration with them, even though the fundamental goal of signal processing is the same: extracting information. No doubt these two parallel roads have pointed in opposite directions, one toward the sky and the other toward the earth. Nevertheless, it is safe to say that statistics has served as the common medium of signal processing for both scientists and engineers. This particular issue of IEEE Signal Processing Magazine may shed light on the subject for astronomers interested in signal processing and statistics outside the astronomical community.

IEEE Signal Processing Magazine Jul. 2007 Vol 24 Issue 4: Bootstrap methods in signal processing

This link shows the table of contents and provides links to the articles; however, access to the papers requires an IEEE Xplore subscription (via a library or an individual IEEE membership). Here, I’d like to introduce some of the articles and tutorials.
Continue reading ‘Signal Processing and Bootstrap’ »

An example of chi2 bias in fitting the X-ray spectra.

The chi2 bias can affect the results of X-ray spectral fitting, and it
can be demonstrated in a simple way. The simulations described here can be done
in Sherpa or XSPEC, two software packages that can simulate X-ray
spectra using a function called “fakeit”.

Here I assume an absorbed power law model with three parameters
(absorption column, photon index, and normalization) to simulate a Chandra X-ray
spectrum given the instrument calibration files (RMF/ARF) and Poisson noise.
The resulting simulated X-ray spectrum contains the model-predicted counts with
Poisson noise. This spectrum is then fit with the absorbed power law model to get
the best-fit parameter values for NH, photon index, and normalization.

I simulate 1000 spectra and fit each of them using different statistics: chi2 with data variance,
chi2 with model variance, and Cash/C-statistics.

The next step is to plot the simulated distributions of the parameters and compare them
to the values assumed for the simulations. The figure shows the distribution of the photon
index parameter obtained from fits to the spectra generated with an assumed value
of 1.267. The chi2 bias is evident in this analysis: chi2 with model variance
underestimates the simulated value and chi2 with data variance overestimates it,
while the likelihood-based CSTAT and Cash statistics behave well.

 

Distributions of parameter values based on fitting the simulated X-ray data.

The plot shows the distribution of photon index parameters obtained by
fitting the simulated X-ray spectra with about 60000 counts and using the
three different statistics: chi2 with the model variance, chi2 with
data variance, and C-statistics (Cash). The value assumed in the
simulations, 1.267, is marked with the solid line.
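
The experiment above can be imitated without Sherpa or XSPEC. The following is a minimal sketch under loud assumptions: it uses a bare power law in counts per bin (no absorption, no RMF/ARF folding), a hypothetical energy grid and normalization, far fewer counts than the 60000 quoted above, and only 200 simulations by default. Only the photon index of 1.267 comes from the post. It is a toy that illustrates the procedure (simulate with Poisson noise, refit with each statistic, compare the fitted photon indices), not a reproduction of the figure.

```python
import numpy as np
from scipy.optimize import minimize

TRUE_GAMMA = 1.267                   # photon index quoted in the post
TRUE_LOG_NORM = np.log(20.0)         # hypothetical: expected counts per bin at E = 1
ENERGY = np.linspace(0.5, 7.0, 60)   # hypothetical bin centers (arbitrary units)


def model(params):
    """Expected counts per bin for a bare power law (no absorption, no RMF/ARF)."""
    log_norm, gamma = params
    return np.exp(log_norm) * ENERGY ** (-gamma)


def chi2_data_variance(params, counts):
    # Variance taken from the observed counts, floored at 1 to avoid dividing by zero.
    return np.sum((counts - model(params)) ** 2 / np.maximum(counts, 1.0))


def chi2_model_variance(params, counts):
    # Variance taken from the model-predicted counts.
    expected = model(params)
    return np.sum((counts - expected) ** 2 / expected)


def cash(params, counts):
    # Poisson log-likelihood up to a data-only constant: C = 2 * sum(m - d * ln m).
    expected = model(params)
    return 2.0 * np.sum(expected - counts * np.log(expected))


STATISTICS = {
    "chi2 data variance": chi2_data_variance,
    "chi2 model variance": chi2_model_variance,
    "cash": cash,
}


def simulate_fits(n_sim=200, seed=0):
    """Simulate n_sim Poisson spectra and return the fitted photon index per statistic."""
    rng = np.random.default_rng(seed)
    expected = model((TRUE_LOG_NORM, TRUE_GAMMA))
    start = np.array([np.log(10.0), 1.0])  # deliberately not the true values
    fitted = {name: np.empty(n_sim) for name in STATISTICS}
    for i in range(n_sim):
        counts = rng.poisson(expected).astype(float)
        for name, statistic in STATISTICS.items():
            result = minimize(statistic, start, args=(counts,), method="Nelder-Mead")
            fitted[name][i] = result.x[1]
    return fitted


if __name__ == "__main__":
    for name, gammas in simulate_fits().items():
        print(f"{name:20s} mean = {gammas.mean():.3f}  std = {gammas.std():.3f}")
    print(f"{'assumed value':20s} {TRUE_GAMMA}")
```

Histogramming the three arrays returned by `simulate_fits` and marking `TRUE_GAMMA` gives a plot of the same kind as the figure. How large the offsets are depends on the number of counts per bin, so the toy numbers should not be compared directly with the Chandra simulation.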

[ArXiv] Bimodal Color Distribution in GCS, Sept. 7, 2007

From arxiv/astro-ph:0709.1073v1
On the Metallicity-Color Relations and Bimodal Color Distributions in Extragalactic Globular Cluster Systems by M. Cantiello and J. P. Blakeslee

Many observations of globular cluster systems (GCS) show bimodal distributions in color and metallicity space. The authors discuss the complications introduced by non-linear metallicity-color relations and present a careful study suggesting the optimal color(s) for revealing the presence of real bimodal GC metallicity distributions. Based on their simulation study, (V-H) and (V-K) are confirmed to be good colors for revealing unbiased bimodal metallicity distributions in GCS.
Continue reading ‘[ArXiv] Bimodal Color Distribution in GCS, Sept. 7, 2007’ »

[ArXiv] Swift and XMM measurement errors, Sep. 8, 2007

From arxiv/astro-ph:0708.1208v1:
The measurement errors in the Swift-UVOT and XMM-OM by N.P.M. Kuin and S.R. Rosen

The probability distribution of photon counts from the Optical Monitor on the XMM-Newton satellite (XMM-OM) and the UVOT on the Swift satellite follows a binomial distribution due to detector characteristics. The incident count rate is derived as a function of the measured count rate, which is shown to follow a binomial distribution.
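
One simple way to see where a binomial distribution and a measured-to-incident conversion can come from is the following sketch. It rests on assumptions that are not in the summary above: photons arrive as a Poisson process, and the detector records at most one count per readout frame. Each frame is then a Bernoulli trial with success probability 1 - exp(-rate × frame time), the number of frames with a count is binomial, and inverting the expectation gives the incident rate. The frame time and rate used here are hypothetical, and dead time and other instrument-specific corrections are ignored.

```python
import numpy as np


def incident_rate(measured_rate, frame_time):
    """Invert measured_rate = (1 - exp(-incident_rate * frame_time)) / frame_time."""
    occupied = measured_rate * frame_time  # fraction of frames that registered a count
    if not 0.0 <= occupied < 1.0:
        raise ValueError("measured_rate * frame_time must lie in [0, 1)")
    return -np.log1p(-occupied) / frame_time


def simulate_measured_rate(true_rate, frame_time, n_frames, rng):
    """Draw the binomial number of occupied frames and convert it to a rate."""
    p_hit = -np.expm1(-true_rate * frame_time)  # P(at least one photon in a frame)
    hits = rng.binomial(n_frames, p_hit)
    return hits / (n_frames * frame_time)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true_rate, frame_time = 50.0, 0.011  # hypothetical values (counts/s, s)
    measured = simulate_measured_rate(true_rate, frame_time, 200_000, rng)
    print("measured rate :", measured)
    print("corrected rate:", incident_rate(measured, frame_time))
```

The measured rate saturates at one count per frame, which is why the correction grows rapidly for bright sources.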
Continue reading ‘[ArXiv] Swift and XMM measurement errors, Sep. 8, 2007’ »

[ArXiv] Identifiability and mixtures of distributions, Aug. 3, 2007

From arxiv/math.st: 0708.0499v1
Inference for mixtures of symmetric distributions by Hunter, Wang, and Hettmansperger, Annals of Statistics, 2007, Vol.35(1), pp.224-251.
Continue reading ‘[ArXiv] Identifiability and mixtures of distributions, Aug. 3, 2007’ »

[ArXiv] Numerical CMD analysis, Aug. 28th, 2007

From arxiv/astro-ph:0708.3758v1
Numerical Color-Magnitude Diagram Analysis of SDSS Data and Application to the New Milky Way Satellites by J. T. A. de Jong et al.

The authors applied MATCH (Dolphin, 2002[1]; note that the year is corrected) to M13, M15, M92, NGC2419, NGC6229, and Pal14 (well-known globular clusters), and to BooI, BooII, CVnI, CVnII, Com, Her, LeoIV, LeoT, Segu1, UMaI, UMaII and Wil1 (newly discovered Milky Way satellites) from the Sloan Digital Sky Survey (SDSS) to fit the color-magnitude diagrams (CMDs) of these stellar clusters and find the properties of these satellites.
Continue reading ‘[ArXiv] Numerical CMD analysis, Aug. 28th, 2007’ »

  1. Numerical methods of star formation history measurement and applications to seven dwarf spheroidals, Dolphin (2002), MNRAS, 332, p. 91

[ArXiv] Isochrone database, Aug. 20, 2007

From arxiv/astro-ph:0708.1204v3
An Isochrone Database and a Rapid Model for Stellar Population Synthesis by Li and Han

This paper emphasizes binary populations: CMD fitting with the binary population model outperformed fitting with the single population model. The authors used the Hurley code (Hurley, Tout, and Pols (2002), Evolution of binary stars and the effect of tides on binary populations, MNRAS, 329(4), pp. 897-928). They mention that two color-color grids can disentangle the age-metallicity degeneracy via binary stellar populations. They fitted their isochrone database to M67 and NGC 1868 with the gT-grid and concluded that the database of binary stellar populations fitted the color-magnitude diagrams better.
Continue reading ‘[ArXiv] Isochrone database, Aug. 20, 2007’ »

[ArXiv] Data-Driven Goodness-of-Fit Tests, Aug. 1, 2007

From arxiv/math.st:0708.0169v1
Data-Driven Goodness-of-Fit Tests by L. Mikhail

Goodness-of-fit tests have been essential in astronomy for validating a chosen physical model against observed data, yet the limits of these tests are rarely considered carefully when the observed data are also used to estimate the model parameters. I therefore thought this paper would help in thinking about the different points of view between astronomers’ practice of goodness-of-fit testing and statisticians’ construction of tests. (Warning: the paper is abstract and theoretical.)
Continue reading ‘[ArXiv] Data-Driven Goodness-of-Fit Tests, Aug. 1, 2007’ »

Astrostatistics: Goodness-of-Fit and All That!

During the International X-ray Summer School, as a project presentation, I tried to explain the inadequate practice of using χ^2 statistics in astronomy. If your best fit is biased (any misspecification of the model easily causes such bias), do not use χ^2 statistics to get a 1σ error with a 68% chance of capturing the true parameter.

Later, I decided to investigate the subject further and came across this paper: Astrostatistics: Goodness-of-Fit and All That! by Babu and Feigelson.
Continue reading ‘Astrostatistics: Goodness-of-Fit and All That!’ »

[Quote] Model Skeptics

From IMS Bulletin Vol. 36(3), p.11, Terence’s Stuff: Model skeptics

[I once quoted an article by Prof. Terry Speed in the IMS Bulletin: Data-Doctors. Reading his columns in the IMS Bulletin gives me an opportunity to reflect on who I am as a statistician and offers some guidance for treating data. Although his ideas do not come from astronomy or astronomical data analysis, I often find that his thoughts and words can be shared with astronomers.]
Continue reading ‘[Quote] Model Skeptics’ »

[ArXiv] Geneva-Copenhagen Survey, July 13, 2007

From arxiv/astro-ph:0707.1891v1
The Geneva-Copenhagen Survey of the Solar neighborhood II. New uvby calibrations and rediscussion of stellar ages, the G dwarf problem, age-metallicity diagram, and heating mechanisms of the disk by Holmberg, Nordstrom, and Andersen

Researchers, including scientists from CHASC, working on color-magnitude diagrams to infer ages, metallicities, temperatures, and other physical quantities of stars and stellar clusters may find this paper useful.
Continue reading ‘[ArXiv] Geneva-Copenhagen Survey, July 13, 2007’ »

[ArXiv] Spectroscopic Survey, June 29, 2007

From arXiv/astro-ph:0706.4484

Spectroscopic Surveys: Present, by C. Yip, gives an overview of recent spectroscopic sky surveys and spectral analysis techniques in the context of Virtual Observatories (VO). Besides the number of spectroscopic redshift measurements growing in a Moore’s-law fashion, surveys tend to go deeper and aim for completeness. Elliptical galaxy formation has been the main subject of study because ellipticals are more abundant than spirals, and the galaxy bimodality seen in color-color or color-magnitude diagrams is attributed to gas-rich mergers of blue galaxies forming the red sequence. Principal component analysis has incorporated ratios of emission-line strengths to classify Type-II AGN and star-forming galaxies. Lyα identifies high-z quasars, and other spectral patterns as a function of z reveal the history of the early universe and the characteristics of quasars. The recent discovery of 10 satellites of the Milky Way is also mentioned.
Continue reading ‘[ArXiv] Spectroscopic Survey, June 29, 2007’ »

Everything you wanted to know about power-laws but were afraid to ask

Clauset, Shalizi, & Newman (2007, arXiv/0706.1062) give a very detailed description of what power-law distributions are, how to recognize them, how to fit them, etc. They are also making available the MATLAB and R codes that they use to do the fitting and such.

Looks like a very handy reference text, though I am a bit uncertain about their use of the K-S test to check whether a dataset can be described with a power-law or not. It is probably fine; perhaps some statisticians would care to comment?
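
To make the K-S question concrete, here is a minimal sketch of fitting a continuous power law by maximum likelihood and measuring the K-S distance between the data and the fitted model. It assumes the lower cutoff `xmin` is known, which is a simplification, and the function names are made up for this illustration; this is not the authors' MATLAB or R code. The p-value is calibrated by simulating from the fitted model instead of being read from standard K-S tables, because the exponent is estimated from the same data being tested.

```python
import numpy as np


def sample_power_law(alpha, xmin, size, rng):
    """Inverse-transform samples from p(x) ~ x**(-alpha) for x >= xmin (alpha > 1)."""
    u = rng.random(size)
    return xmin * (1.0 - u) ** (-1.0 / (alpha - 1.0))


def fit_alpha(x, xmin):
    """Maximum-likelihood exponent for a continuous power law with known xmin."""
    x = np.asarray(x, dtype=float)
    return 1.0 + x.size / np.sum(np.log(x / xmin))


def ks_distance(x, alpha, xmin):
    """Largest gap between the empirical CDF and the power-law CDF."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    model_cdf = 1.0 - (x / xmin) ** (-(alpha - 1.0))
    above = np.arange(1, n + 1) / n - model_cdf
    below = model_cdf - np.arange(0, n) / n
    return max(above.max(), below.max())


def bootstrap_p_value(x, xmin, n_boot=200, seed=0):
    """Fraction of synthetic data sets whose refitted K-S distance exceeds the observed one."""
    rng = np.random.default_rng(seed)
    alpha_hat = fit_alpha(x, xmin)
    observed = ks_distance(x, alpha_hat, xmin)
    exceed = 0
    for _ in range(n_boot):
        fake = sample_power_law(alpha_hat, xmin, len(x), rng)
        if ks_distance(fake, fit_alpha(fake, xmin), xmin) >= observed:
            exceed += 1
    return exceed / n_boot


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    data = sample_power_law(2.5, 1.0, 5000, rng)
    alpha_hat = fit_alpha(data, 1.0)
    print("alpha estimate:", alpha_hat)
    print("K-S distance  :", ks_distance(data, alpha_hat, 1.0))
    print("bootstrap p   :", bootstrap_p_value(data, 1.0))
```

A small p-value here would argue against the power law. A large one only says the data are compatible with it, not that alternatives such as a log-normal are ruled out.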

All your bias are belong to us

Leccardi & Molendi (2007) have a paper in A&A (astro-ph/0705.4199) discussing the biases in parameter estimation when spectral fitting is confronted with low-count data. Not surprisingly, they find that the bias is higher for lower counts, for standard chisq compared to C-stat, and for grouped data compared to ungrouped. Peter Freeman talked about something like this at the 2003 X-ray Astronomy School at Wallops Island (pdf1, pdf2), and no doubt part of the problem also has to do with the (un)reliability of the fitting process when the chisq surface gets complicated.

Anyway, they propose an empirical method to reduce the bias by computing the probability distribution functions (pdfs) for various simulations, and then averaging the pdfs in groups of 3. Seems to work, for reasons that escape me completely.

[Update: links to Peter's slides corrected]

On the unreliability of fitting

Despite some recent significant advances in Statistics and its applications to Astronomy (Cash 1976, Cash 1979, Gehrels 1984, Schmitt 1985, Isobe et al. 1986, van Dyk et al. 2001, Protassov et al. 2002, etc.), there still exist numerous problems and limitations in the standard statistical methodologies that are routinely applied to astrophysical data. For instance, the basic algorithms used in non-linear curve-fitting in spectra and images have remained unchanged since the 1960s: the downhill simplex method of Nelder & Mead (1965) modified by Powell, and methods of steepest descent exemplified by Levenberg-Marquardt (Marquardt 1963). All non-linear curve-fitting programs currently in general use (Sherpa, XSPEC, MPFIT, PINTofALE, etc.), with the exception of Monte Carlo and MCMC methods, are implementations based on these algorithms and thus share their limitations.
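
One commonly cited limitation of such local optimizers, which may or may not be the one the full post goes on to discuss, is that the answer depends on the starting point when the fit-statistic surface has several minima. The sketch below illustrates this with SciPy's Nelder-Mead and Levenberg-Marquardt implementations on a toy problem chosen for this illustration: fitting the frequency of a noisy sinusoid, whose chisq surface is strongly multimodal in frequency. The data, noise level, and starting values are all hypothetical.

```python
import numpy as np
from scipy.optimize import least_squares, minimize

rng = np.random.default_rng(1)
T = np.linspace(0.0, 10.0, 50)
TRUE_FREQ = 2.0
Y = np.sin(TRUE_FREQ * T) + rng.normal(0.0, 0.1, T.size)  # toy data, sigma = 0.1


def residuals(params):
    """Data minus model for a unit-amplitude sinusoid with free frequency."""
    return Y - np.sin(params[0] * T)


def chisq(params):
    return np.sum(residuals(params) ** 2)


def fit_both(start):
    """Run both local optimizers from the same starting frequency."""
    simplex = minimize(chisq, [start], method="Nelder-Mead")
    lev_mar = least_squares(residuals, [start], method="lm")
    return simplex.x[0], lev_mar.x[0]


if __name__ == "__main__":
    for start in (2.1, 3.5):
        nm_freq, lm_freq = fit_both(start)
        print(f"start = {start}:")
        print(f"  Nelder-Mead         freq = {nm_freq:.4f}  chisq = {chisq([nm_freq]):.3f}")
        print(f"  Levenberg-Marquardt freq = {lm_freq:.4f}  chisq = {chisq([lm_freq]):.3f}")
```

Comparing the chisq values reached from the two starting points shows whether a run settled in the main minimum near the true frequency or in one of the side minima. Neither algorithm reports on its own that a better minimum exists elsewhere.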
Continue reading ‘On the unreliability of fitting’ »