The AstroStat Slog

Q: Lowess error bars?

Jun 3rd, 2008| 02:53 am | Posted by vlk

It is somewhat surprising that astronomers haven’t cottoned on to Lowess curves yet. That’s probably a good thing because I think people already indulge in smoothing far too much for their own good, and Lowess makes for a very powerful hammer. But the fact that it is semi-parametric and is based on polynomial least-squares fitting does make it rather attractive.

And, of course, sometimes it is unavoidable, or so I told Brad W. When one has too many points for a regular polynomial fit, and they are too scattered for a spline, and too few to try a wavelet “denoising”, and no real theoretical expectation of any particular model function, and all one wants is “a smooth curve, damnit”, then Lowess is just the ticket.

Well, almost.

There is one major problem: how does one figure out what the error bounds are on the “best-fit” Lowess curve? Clearly, each fit at each point can produce an estimate of the error, but simply collecting the separate errors is not the right thing to do because they would all be correlated. I know how to propagate Gaussian errors in boxcar smoothing a histogram, but this is a whole new level of complexity. Does anyone know if there is software that can calculate reliable error bands on the smooth curve? We will take any kind of error model: Gaussian, Poisson, even the (local) variances in the data themselves.
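One possible way to get a feel for such a band, offered as a rough sketch and not as an answer to the question, is to bootstrap the data pairs and refit the whole curve each time, so that the correlations between neighbouring local fits are carried along automatically. Everything below is an assumption made for illustration: the function names (`local_linear`, `lowess_curve`, `bootstrap_band`), the tricube weights, the single-pass local linear fit without robustness iterations, and the simulated data. The resulting band is pointwise, not simultaneous, and it uses only the scatter in the data themselves as the error model.

```python
import math
import random

def tricube(u):
    """Tricube kernel weight for a scaled distance u (zero beyond u = 1)."""
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0

def local_linear(x, y, x0, frac=0.5):
    """Weighted straight-line fit at x0 using roughly the nearest frac of the points."""
    n = len(x)
    k = min(n, max(3, int(round(frac * n))))
    dist = sorted(abs(xi - x0) for xi in x)
    # Bandwidth slightly beyond the k-th nearest point so its weight stays positive.
    h = dist[k - 1] * 1.01 + 1e-12
    w = [tricube(abs(xi - x0) / h) for xi in x]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx if sxx > 1e-12 else 0.0  # fall back to a local mean
    return my + slope * (x0 - mx)

def lowess_curve(x, y, grid, frac=0.5):
    """Evaluate the local linear smoother at every grid point."""
    return [local_linear(x, y, g, frac) for g in grid]

def bootstrap_band(x, y, grid, frac=0.5, n_boot=200, level=0.68, seed=0):
    """Pointwise percentile band from resampling (x, y) pairs with replacement."""
    rng = random.Random(seed)
    n = len(x)
    curves = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        xb = [x[i] for i in idx]
        yb = [y[i] for i in idx]
        curves.append(lowess_curve(xb, yb, grid, frac))
    tail = (1.0 - level) / 2.0
    lower, upper = [], []
    for j in range(len(grid)):
        column = sorted(curve[j] for curve in curves)
        lower.append(column[int(tail * (n_boot - 1))])
        upper.append(column[int((1.0 - tail) * (n_boot - 1))])
    return lower, upper

# Simulated scattered data (purely illustrative): a sine curve plus Gaussian noise.
rng = random.Random(1)
x = [10.0 * rng.random() for _ in range(40)]
y = [math.sin(xi) + rng.gauss(0.0, 0.3) for xi in x]
grid = [0.5 + i for i in range(10)]

fit = lowess_curve(x, y, grid)
lower, upper = bootstrap_band(x, y, grid, n_boot=100)
for g, f, lo, hi in zip(grid, fit, lower, upper):
    print(f"x={g:4.1f}  fit={f:+.3f}  band=[{lo:+.3f}, {hi:+.3f}]")
```

Known measurement errors (Gaussian or Poisson) could be folded in by simulating new data from the error model instead of resampling pairs, but that is a different design choice and is not shown here.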

Tags: Brad Wargelin, error bands, error bars, Fitting, least-squares, Loess, Lowess, polynomial, question for statisticians, smoothing
Category: Algorithms, Fitting, Methods, Stat, Uncertainty | 11 Comments

[ArXiv] 4th week, May 2008

May 31st, 2008| 11:59 pm | Posted by hlee

Eight astro-ph papers and two statistics papers are listed this week. One statistics paper discusses detecting filaments and the other talks about maximum likelihood estimation of satellite images (clouds). …Continue reading»

Tags: AGN, Bayes factor, bootstrap, confidence set, cosmological constant, dark energy, Exofit, exoplanet, filament, jackknife, KDE, Model Selection, time series, Type Ia SNe, unbiased, wavelet
Category: arXiv, Bayesian, MCMC, Stat | Comment

Mexican Hat [EotW]

May 28th, 2008| 01:00 pm | Posted by vlk

The most widely used tool for detecting sources in X-ray images, especially Chandra data, is the wavelet-based wavdetect, which uses the Mexican Hat (MH) wavelet. Now, the MH is not a very popular choice among wavelet aficionados because it does not form an orthonormal basis set (i.e., scale information is not well separated), and does not have compact support (i.e., the function extends to infinity). So why is it used here?
…Continue reading»
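For reference, here is the two-dimensional MH written out up to normalization, as the negative Laplacian of a Gaussian. The symbols $r$ (distance from the wavelet center) and $\sigma$ (scale) are notation assumed here for illustration and may differ from what the full post uses.

```latex
\begin{align}
\mathrm{MH}(r; \sigma) &\propto \left( 2 - \frac{r^{2}}{\sigma^{2}} \right)
    \exp\!\left( -\frac{r^{2}}{2\sigma^{2}} \right), \\
\int_{0}^{\infty} \mathrm{MH}(r; \sigma)\, 2\pi r \, dr &= 0 .
\end{align}
```

The Gaussian factor never reaches exactly zero, which is the lack of compact support mentioned above. The function is positive for $r < \sqrt{2}\,\sigma$ and negative beyond, and the two parts cancel exactly when integrated over the plane.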

Tags: Chandra, ciao, convolution, correlation, EotW, Equation, Equation of the Week, Fourier Transform, gaussian, MexHat, Mexican Hat, MH, multiscale, wavdetect, wavelet
Category: Algorithms, Astro, Imaging, Jargon | 1 Comment

[ArXiv] 3rd week, May 2008

May 26th, 2008| 02:59 pm | Posted by hlee

Not many this week, but there’s a great read. …Continue reading»

Tags: clustering, high dimension, LF, maximum likelihood, multivariate, Poisson, Schechter, zero count
Category: arXiv, Bayesian, Fitting, MCMC, Methods, Stat | Comment

This week’s quote:

May 23rd, 2008| 10:00 pm | Posted by chasc

“It’s easy to get a good fit, which means that your fit doesn’t mean much…”

Ariane Lancon (from proceedings of “Starbursts: from 30 Doradus to Lyman break galaxies”, 2005)

Category: Quotes | Comment

Background Subtraction [EotW]

May 21st, 2008| 01:00 pm | Posted by vlk

There is a lesson that statisticians, especially of the Bayesian persuasion, have been hammering into our skulls for ages: do not subtract background. Nevertheless, old habits die hard, and old codes die harder. Such is the case with X-ray aperture photometry. …Continue reading»

Tags: aperture photometry, background, background marginalization, background subtraction, celldetect, Chandra, ciao, EotW, Equation, error propagation, ldetect, local detect, wavdetect, X-ray
Category: Algorithms, Astro, Jargon | 6 Comments

Did they, or didn’t they?

May 20th, 2008| 12:10 am | Posted by vlk

Earlier this year, Peter Edmonds showed me a press release that the Chandra folks were, at the time, considering putting out describing the possible identification of a Type Ia Supernova progenitor. What appeared to be an accreting white dwarf binary system could be discerned in 4-year-old observations, coincident with the location of a supernova that went off in November 2007 (SN2007on). An amazing discovery, but there is a hitch.

And it is a statistical hitch, and involves two otherwise highly reliable and oft-used methods giving contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the error of our ways. …Continue reading»

Tags: arXiv, Chandra, CXC, Optical, Peter Edmonds, positional coincidence, positional error, Power, progenitor, question for statisticians, significance, Supernova, Type Ia, White Dwarf, White Dwarf binary, X-ray
Category: arXiv, Astro, Data Processing, News, Objects, Optical, Stat, Uncertainty | 5 Comments

[ArXiv] 2nd week, May 2008

May 19th, 2008| 10:42 am | Posted by hlee

There’s no particular opening remark this week, only that I have a profound curiosity about the jackknife tests in [astro-ph:0805.1994]. A few papers, including this one, deserve separate discussions from a statistical point of view, which will be posted later. …Continue reading»

Tags: bimodality, bootstrap, calibration uncertainty, CF, Classification, CMB, dip, exoplanet, Fisher matrix, flare, GL, jackknife, KS test, marked point, maximum likelihood, MLE, poisson point process, spatial data, XLF
Category: arXiv, Frequentist, Uncertainty, X-ray | Comment

Line Emission [EotW]

May 14th, 2008| 01:00 pm | Posted by vlk

Spectral lines are a ubiquitous feature of astronomical data. This week, we explore the special case of optically thin emission from low-density and high-temperature plasma, and consider the component factors that determine the line intensity. …Continue reading»

Tags: abundance, emission, emission measure, emissivity, EotW, Equation, Equation of the Week, flux, ion balance, line
Category: Astro, Jargon, Misc, Physics, Stars | 2 Comments

A Data Miner’s Story

May 14th, 2008| 01:20 am | Posted by hlee

Usama Fayyad (click the image to listen to the lecture)

A Data Miner’s Story – Getting to Know the Grand Challenges
…Continue reading»

Tags: data mining, KDD, Usama Fayyad, Yahoo
Category: Misc | Comment

R-[{Perl,Python}] Interface

May 13th, 2008| 03:47 pm | Posted by hlee

The brackets could be filled with other languages but two are introduced today: Perl (perl.org) and Python (python.org). These two are widely used among astronomers and can be empowered by R (r-project.org). …Continue reading»

Tags: interface, Perl, Python, R
Category: Languages | 1 Comment

[ArXiv] 1st week, May 2008

May 11th, 2008| 10:42 pm | Posted by hlee

I think I have to review spatial statistics in astronomy, focusing on tessellation (void structure), point processes (expanding the 2 (3) point correlation function), and marked point processes (the spatial distribution of hardness ratios of distant X-ray sources, or of different types of galaxies, distinguished not only by morphology but also by other marks such as absolute magnitudes and the existence of particular features). When? Someday…

In addition to Bayesian methodologies, as in this week’s astro-ph, studies characterizing the empirical spatial distributions of voids and galaxies appear frequently, and I believe they could be enriched further with ideas from stochastic geometry and spatial statistics. Click for what appeared on arXiv this week. …Continue reading»

Tags: Classification, covariance, FARIMA, Fisher information, GL, GRB, Levy, light curve, limb darkening, ML, Pareto distribution, quasars, solar flare, standard candle, tessellation, time series, VO, void
Category: arXiv, MCMC, Uncertainty | 1 Comment

gamma function (Equation of the Week)

May 6th, 2008| 06:12 pm | Posted by vlk

The gamma function [not the Gamma -- note upper-case G -- which is related to the factorial] is one of those insanely useful functions that after one finds out about it, one wonders “why haven’t we been using this all the time?” It is defined only on the ~~positive~~ non-negative real line, is a highly flexible function that can emulate almost any kind of skewness in a distribution, and is a perfect complement to the Poisson likelihood. In fact, it is the conjugate prior to the Poisson likelihood, and is therefore a natural choice for a prior in all cases that start off with counts. …Continue reading»
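The conjugacy claim can be checked in a couple of lines. This is a sketch under assumed notation, not taken from the full post: $\lambda$ is the Poisson intensity, $\alpha$ and $\beta$ are the shape and rate parameters of the gamma prior, and $n$ is the number of counts observed in one unit of exposure.

```latex
\begin{align}
p(\lambda \mid \alpha, \beta) &= \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
    \lambda^{\alpha-1} e^{-\beta\lambda}, \qquad \lambda \ge 0, \\
p(n \mid \lambda) &= \frac{\lambda^{n} e^{-\lambda}}{n!}, \\
p(\lambda \mid n) &\propto p(n \mid \lambda)\, p(\lambda \mid \alpha, \beta)
    \propto \lambda^{\alpha+n-1} e^{-(\beta+1)\lambda} .
\end{align}
```

The last line has the same functional form as the prior, with $\alpha \to \alpha + n$ and $\beta \to \beta + 1$, so observing counts simply updates the two parameters.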

Tags: conjugate, EotW, Equation, gamma, Poisson, prior
Category: Misc, Stat | 8 Comments

[ArXiv] 5th week, Apr. 2008

May 5th, 2008| 03:08 am | Posted by hlee

Ever since I first learned about Hubble’s tuning fork^[1], I have wanted to classify galaxies based on their features (colors and spectra) instead of relying on labor-intensive classification by eye (semi-supervised learning seems more suitable). Ironically, at that time I didn’t know that there were fields, machine learning in computer science and statistics, that do such studies. Upon switching to statistics with the hope of understanding the statistical packages implemented in IRAF and IDL, and of learning better the contents of Numerical Recipes and Bevington’s book, I found that ignorance was not the enemy; the accessibility of data was. …Continue reading»

[1] Wikipedia link: Hubble sequence

Tags: ANN, automation, Classification, correlation function, denoising, FFT, gravitational wave, lensing, LISA, machine learning, missing data, mock data, morphology, PCA, power spectrum, robust, SDSS, spectrum, sunspots, wavelet, zoo
Category: arXiv, Galaxies, Imaging, MCMC, Physics, Spectral | Comment

Equation of the Week: Confronting data with model

May 2nd, 2008| 06:06 pm | Posted by vlk

Starting a new feature — highlighting some equation that is widely used in astrophysics or astrostatistics. Today’s featured equation: what instruments do to incident photons. …Continue reading»

Tags: ARF, effective area, EotW, Equation, Equation of the Week, LRF, point spread function, PSF, response matrix, RMF, source
Category: Astro, Jargon | Comment