The AstroStat Slog

Archive for the ‘Methods’ Category.

[ArXiv] 2nd week, Apr. 2008

Apr 11th, 2008| 02:21 am | Posted by hlee

Markov chain Monte Carlo has become the most frequent and stable statistical application in astronomy. It would be useful to collect tutorials from both professions. Continue reading ‘[ArXiv] 2nd week, Apr. 2008’ »

Tags: Classification, GRB, Hubble constant, K-S test, kurtosis, mask, maximum likelihood, SDSS, skewness, Solar Oscillation, Vicent Martinez
Category: arXiv, Bayesian, MCMC, Methods, Stat | 3 Comments

Astrometry.net

Mar 12th, 2008| 03:32 pm | Posted by hlee

Astrometry.net, a cool website I heard about in Harvard Astronomy Professor Doug Finkbeiner’s class (Principles of Astronomical Measurements), does the complex job of matching your images of unknown locations or coordinates to sources in catalogs. You provide your images in various formats, and the site returns astrometric calibration metadata and lists of known objects falling inside the field of view. Continue reading ‘Astrometry.net’ »

Tags: Astrometry, Doug Finkbeiner
Category: Algorithms, Astro, Data Processing, Fitting, Imaging, Methods, Objects, Stat, Uncertainty | Comment

[ArXiv] A fast Bayesian object detection

Mar 5th, 2008| 04:46 pm | Posted by hlee

This is quite a long paper that I separated from [ArXiv] 4th week, Feb. 2008:
[astro-ph:0802.3916] P. Carvalho, G. Rocha, & M.P. Hobson
A fast Bayesian approach to discrete object detection in astronomical datasets – PowellSnakes I
As the title suggests, it describes Bayesian source detection and gives me a chance to learn the foundations of source detection in astronomy. Continue reading ‘[ArXiv] A fast Bayesian object detection’ »

Tags: Bayesian evidence, coloured background, CRLB, decision theory, filter, Fisher information, likelihood, PowellSnake, prior, simulated annealing, SNR, source detection, state space, Sunyaev-Zel'dovich effect, symmetric loss, templates
Category: Algorithms, arXiv, Bayesian, Cross-Cultural, Data Processing, Fitting, Frequentist, MCMC, Methods, Objects | Comment

Non-nested hypothesis tests

Feb 19th, 2008| 10:15 pm | Posted by hlee

I was reading [1]. I must say that I do not know of Bayesian methods for coping with model misspecification, tests with an unknown true model, or tests for non-nested hypotheses, except the Bayes factor (which depends heavily on how the priors are chosen). Nonetheless, the zeal among economists for testing non-nested models might help astronomers move beyond testing nested hypotheses with the F statistic. Continue reading ‘Non-nested hypothesis tests’ »

Tags: hypothesis testing, maximum likelihood, model misspecification, non-nested models, unknown truth
Category: Frequentist, Methods, Stat | Comment

Signal Processing and Bootstrap

Jan 30th, 2008| 02:33 am | Posted by hlee

Astronomers have developed their own ways of processing signals, almost independently of engineers though sometimes in collaboration with them, even though the fundamental goal of signal processing is the same: extracting information. Undoubtedly, these two parallel roads of astronomers and engineers have been pointing in opposite directions: one toward the sky and the other toward the earth. Nevertheless, without much argument, we could say that statistics has served as the medium of signal processing for both scientists and engineers. This particular issue of IEEE Signal Processing Magazine may shed light for astronomers interested in signal processing and statistics outside the astronomical community.

IEEE Signal Processing Magazine Jul. 2007 Vol 24 Issue 4: Bootstrap methods in signal processing

This link will show the table of contents and provide links to articles; however, access to the papers requires an IEEE Xplore subscription (via libraries or individual IEEE memberships). Here, I’d like to attempt to introduce some articles and tutorials.
Continue reading ‘Signal Processing and Bootstrap’ »

Tags: bootstrap, compressive sensing, confidence interval, GLM, IEEE, jackknife, machine learning, multitaper estimate, particle filter, signal processing, statistical inference, Tutorial, wavelet
Category: Algorithms, arXiv, Bayesian, Cross-Cultural, Fitting, Frequentist, MC, MCMC, Methods, Misc, Spectral, Stat, Uncertainty | Comment

[ArXiv] SVM and galaxy morphological classification, Sept. 10, 2007

Sep 12th, 2007| 04:31 pm | Posted by hlee

From arxiv/astro-ph:0709.1359,
A robust morphological classification of high-redshift galaxies using support vector machines on seeing limited images. I Method description by M. Huertas-Company et al.

Machine learning and statistical learning are becoming more and more popular in astronomy. Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are rarely absent when the objective is classification on massive survey data. The authors provide a gentle tutorial on SVM for galaxy morphological classification. Their source code, GALSVM, is linked for interested readers.
Continue reading ‘[ArXiv] SVM and galaxy morphological classification, Sept. 10, 2007’ »

Tags: Classification, learning, morphology, SVM
Category: Algorithms, arXiv, Astro, Galaxies, Methods | Comment

[ArXiv] Recent bayesian studies from astro-ph

Sep 11th, 2007| 03:38 am | Posted by hlee

In the past month, I’ve noticed that papers whose titles include Bayesian or Markov Chain Monte Carlo (MCMC) have appeared relatively frequently in arxiv/astro-ph. Those papers are:

[astro-ph:0709.1058v1] Joint Bayesian Component Separation and CMB Power Spectrum Estimation by H.K. Eriksen et al.
[astro-ph:0709.1104v1] Monolithic or hierarchical star formation? A new statistical analysis by M. Kampakoglou, R. Trotta, and J. Silk
[astro-ph:0411573v2] A Bayesian analysis of the primordial power spectrum by M.Bridges, A.N.Lasenby, M.P.Hobson
[astro-ph:0709.0596v1] Bayesian inversion of Stokes profiles by A. A. Ramos, M.J.M. Gonzales, and J.A. Rubino-Martin
[astro-ph:0709.0711v1] Bayesian posterior classification of planetary nebulae according to the Peimbert types by C. Quireza, H.J.Rocha-Pinto, and W.J. Maciel
[astro-ph:0708.2340v1] Bayesian Galaxy Shape Measurement for Weak Lensing Surveys -I. Methodology and a Fast Fitting Algorithm by L. Miller et al.
[astro-ph:0708.1871v1] Dark energy and cosmic curvature: Monte-Carlo Markov Chain approach by Y. Gong et al.

Continue reading ‘[ArXiv] Recent bayesian studies from astro-ph’ »

Category: arXiv, Bayesian, MCMC, Methods, Stat | Comment

[ArXiv] Swift and XMM measurement errors, Sep. 8, 2007

Sep 11th, 2007| 01:12 am | Posted by hlee

From arxiv/astro-ph:0708.1208v1:
The measurement errors in the Swift-UVOT and XMM-OM by N.P.M. Kuin and S.R. Rosen

The probability distribution of photon counts from the Optical Monitor on the XMM-Newton satellite (XMM-OM) and from the UVOT on the Swift satellite follows a binomial distribution because of detector characteristics. The incident count rate was derived as a function of the measured count rate, which was shown to follow a binomial distribution.
Continue reading ‘[ArXiv] Swift and XMM measurement errors, Sep. 8, 2007’ »
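To illustrate why binomial statistics arise for this kind of detector, here is a minimal sketch. It is not the paper's derivation. It assumes a hypothetical detector that can register at most one count per readout frame, Poisson photon arrivals, and no dead time; the frame time and rates are illustrative values, and the function names are made up for this example. Under those assumptions the number of frames with a detection is binomial, and the incident rate follows by inverting p = 1 - exp(-rate × frame_time).

```python
import math
import random


def incident_rate(measured_counts, n_frames, frame_time):
    """Invert p = 1 - exp(-rate * frame_time) for the incident rate.

    Assumes at most one registered count per frame (hypothetical detector).
    """
    p = measured_counts / n_frames
    if not 0.0 <= p < 1.0:
        raise ValueError("fraction of frames with a count must be in [0, 1)")
    return -math.log(1.0 - p) / frame_time


def simulate_measured_counts(rate, n_frames, frame_time, rng):
    """Draw the binomial number of frames that register a count."""
    p_detect = 1.0 - math.exp(-rate * frame_time)
    return sum(1 for _ in range(n_frames) if rng.random() < p_detect)


rng = random.Random(42)
true_rate = 50.0      # incident counts per second (illustrative)
frame_time = 0.011    # seconds per frame (illustrative)
n_frames = 200_000

measured = simulate_measured_counts(true_rate, n_frames, frame_time, rng)
naive_rate = measured / (n_frames * frame_time)   # ignores saturation
corrected_rate = incident_rate(measured, n_frames, frame_time)
print(f"naive: {naive_rate:.1f}/s, corrected: {corrected_rate:.1f}/s")
```

The naive rate underestimates the incident rate because frames with two or more photons still register a single count; the inversion recovers it.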

Tags: binomial, measurement error, photon count, Swift, UVOT, XMM
Category: arXiv, Astro, Data Processing, Fitting, Methods, Uncertainty | 1 Comment

Quote of the Week, Aug 23, 2007

Aug 23rd, 2007| 11:08 pm | Posted by aconnors

These are from two lively CHASC discussions on classification, or cluster analysis. The first was on Feb 7, 2006; the continuation on Dec 12, 2006, at the Harvard Statistics Department, as part of Stat 310.

David van Dyk:

Don’t demand too much of the classes. You’re not going to say that all events can be well-classified…. It’s more descriptive. It gives you places to look. Then you look at your classes.

Xiao-Li Meng:

Then you’re saying the cluster analysis is more like -

David van Dyk:

It’s really like you have a proposal for classes. You then investigate the physical processes more thoroughly. You may have classes that divide it [up]

……

David van Dyk:

But it can make a difference, where you see the clusters, depending on your [parameter] transformation. You can squish the white spaces, and stretch out the crowded spaces; so it can change where you think the clusters are.

Aneta Siemiginowska:

But that is interesting.

Andreas Zezas:

Yes, that is very interesting.

These are particularly in honor of Hyunsook Lee’s recent posting of Chattopadhyay et al.’s new work about possible intrinsic classes of gamma-ray bursts. Are they really physical classes, or do they only appear to be distinct clusters because we view them through the “squished” lens (parameter spaces) of our imperfect instruments?

Category: Cross-Cultural, Data Processing, Methods, Quotes, Stat | 3 Comments

Cross-validation for model selection

Aug 19th, 2007| 11:35 pm | Posted by hlee

One of the most frequently cited papers in model selection would be An Asymptotic Equivalence of Choice of Model by Cross-Validation and Akaike’s Criterion by M. Stone, Journal of the Royal Statistical Society. Series B (Methodological), Vol. 39, No. 1 (1977), pp. 44-47.
(Akaike’s 1974 paper, introducing the Akaike Information Criterion (AIC), is the most often cited paper on the subject of model selection.)
Continue reading ‘Cross-validation for model selection’ »
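As a small numerical illustration of the equivalence named in Stone's title, the sketch below compares AIC with a leave-one-out cross-validation score based on the log predictive density. The setup is an assumption chosen for simplicity, not taken from the paper: simulated data from a normal distribution with known unit variance, a single free parameter (the mean), and helper names invented for this example.

```python
import math
import random


def gaussian_logpdf(x, mean):
    """Log density of N(mean, 1) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - mean) ** 2


def aic(data):
    """AIC = -2 * (maximized log-likelihood) + 2 * k, with k = 1 (the mean)."""
    mean_hat = sum(data) / len(data)
    loglik = sum(gaussian_logpdf(x, mean_hat) for x in data)
    return -2.0 * loglik + 2.0 * 1


def loo_cv(data):
    """-2 * sum of log predictive densities, each point scored by a fit without it."""
    n = len(data)
    total = sum(data)
    score = 0.0
    for x in data:
        mean_without_x = (total - x) / (n - 1)
        score += gaussian_logpdf(x, mean_without_x)
    return -2.0 * score


rng = random.Random(0)
data = [rng.gauss(3.0, 1.0) for _ in range(2000)]  # simulated sample

aic_value = aic(data)
cv_value = loo_cv(data)
print(f"AIC = {aic_value:.2f}, leave-one-out CV = {cv_value:.2f}")
```

For this model the two scores differ by roughly 2(s² − 1), where s² is the sample variance, so the gap shrinks relative to the scores themselves as the sample grows.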

Tags: AIC, Cash statistics, cross-validation, exponential family, Fisher information, maximum likelihood, Model Selection, resampling, score, TIC
Category: Algorithms, arXiv, Frequentist, Methods, Stat | 5 Comments

An alternative to MCMC?

Aug 19th, 2007| 12:31 am | Posted by vlk

I think of Markov-Chain Monte Carlo (MCMC) as a kind of directed staggering about, a random walk with a goal. (Sort of like driving in Boston.) It is conceptually simple to grasp as a way to explore the posterior probability distribution of the parameters of interest by sampling only where it is worth sampling from. Thus, a major savings from brute force Monte Carlo, and far more robust than downhill fitting programs. It also gives you the error bar on the parameter for free. What could be better? Continue reading ‘An alternative to MCMC?’ »
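The "random walk with a goal" can be sketched with a random-walk Metropolis sampler, one common MCMC variant. This is a minimal illustration, not the method of any particular paper: the target (a standard normal posterior for one parameter), the step size, and the function names are all assumptions made for the example.

```python
import math
import random


def log_posterior(theta):
    """Hypothetical target: standard normal log density, up to a constant."""
    return -0.5 * theta * theta


def metropolis(log_post, start, step, n_samples, rng):
    """Random-walk Metropolis: propose a Gaussian step, accept by posterior ratio."""
    samples = []
    current = start
    current_lp = log_post(current)
    n_accepted = 0
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
            n_accepted += 1
        samples.append(current)  # a rejected move repeats the current state
    return samples, n_accepted / n_samples


rng = random.Random(1)
chain, acceptance = metropolis(log_posterior, start=5.0, step=2.4,
                               n_samples=20_000, rng=rng)
kept = chain[1000:]  # discard burn-in while the walk finds the high-probability region

post_mean = sum(kept) / len(kept)
post_sd = math.sqrt(sum((t - post_mean) ** 2 for t in kept) / (len(kept) - 1))
print(f"mean = {post_mean:.2f}, error bar (posterior sd) = {post_sd:.2f}, "
      f"acceptance = {acceptance:.2f}")
```

The spread of the retained samples is the "free" error bar: no separate error analysis is run, the posterior standard deviation is read directly off the chain.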

Tags: arXiv, likelihood, MCMC, Nested Sampling, posterior, probability, Skilling
Category: arXiv, MC, MCMC, Methods | 3 Comments

Astrostatistics: Goodness-of-Fit and All That!

Aug 14th, 2007| 10:17 pm | Posted by hlee

During the International X-ray Summer School, as a project presentation, I tried to explain the inappropriate use of χ^2 statistics in astronomy. If your best fit is biased (any misidentification of a model easily causes such bias), do not use χ^2 statistics to get a 1σ error with a 68% chance of capturing the true parameter.

Later, I decided to investigate that subject further, and this paper came along: Astrostatistics: Goodness-of-Fit and All That! by Babu and Feigelson.
Continue reading ‘Astrostatistics: Goodness-of-Fit and All That!’ »

Tags: Anderson-Darling, Babu, best-fit, bias, bootstrap, chi-square, Cramer-von Mises, Feigelson, Kolmogorov-Smirnov, Kullback-Leibler distance, nonparametric, parametric, resampling
Category: Algorithms, arXiv, Astro, Fitting, High-Energy, Methods, Spectral, Stat | 7 Comments

Quote of the Week, July 12, 2007

Jul 12th, 2007| 03:37 pm | Posted by aconnors

This is from the very interesting Ingrid Daubechies interview by Dorian Devins,
www.nasonline.org/interviews_daubechies, National Academy of Sciences, U.S.A., 2004. It is from part 6, where Ingrid Daubechies speaks of her early mathematics paper on wavelets. She tries to put the impact into context:

I really explained in the paper where things came from. Because, well, the mathematicians wouldn’t have known. I mean, to them this would have been a question that really came out of nowhere. So, I had to explain it …

I was very happy with [the paper]; I had no inkling that it would take off like that… [Of course] the wavelets themselves are used. I mean, more than even that. I explained in the paper how I came to that. I explained both [a] mathematicians way of looking at it and then to some extent the applications way of looking at it. And I think engineers who read that had been emphasizing a lot the use of Fourier transforms. And I had been looking at the spatial domain. It generated a different way of considering this type of construction. I think, that was the major impact. Because then other constructions were made as well. But I looked at it differently. A change of paradigm. Well, paradigm, I never know what that means. A change of … a way of seeing it. A way of paying attention.

Category: Data Processing, Frequentist, Imaging, Methods, Misc, Quotes, Stat, Timing | Comment

Quote of the Week, July 5, 2007

Jul 5th, 2007| 04:13 pm | Posted by aconnors

Jeff Scargle (in person [top] and in wavelet transform [bottom], left) weighs in on our continuing discussion on how well “automated fitting”/”Machine Learning” can really work (private communication, June 28, 2007):

It is clearly wrong to say that automated fitting of models to data is impossible. Such a view ignores progress made in the area of machine learning and data mining. Of course there can be problems, I believe mostly connected with two related issues:

* Models that are too fragile (that is, easily broken by unusual data)
* Unusual data (that is, data that lie in some sense outside the arena that one expects)

The antidotes are:
(1) careful study of model sensitivity
(2) if the context warrants, preprocessing to remove “bad” points
(3) lots and lots of trial and error experiments, with both data sets that are as realistic as possible and ones that have extremes (outliers, large errors, errors with unusual properties, etc.)
Trial … error … fix error … retry …

You can quote me on that.

This illustration is from Jeff Scargle’s First GLAST Symposium (June 2007) talk, pg 14, demonstrating the use of the inverse area of Voronoi tessellations, weighted by the PSF density, as an automated measure of the density of Poisson gamma-ray counts on the sky.

Category: Algorithms, Astro, Data Processing, gamma-ray, High-Energy, Imaging, Methods, Quotes, Stat, Timing, X-ray | 1 Comment

[ArXiv] Spectroscopic Survey, June 29, 2007

Jul 2nd, 2007| 06:07 pm | Posted by hlee

From arXiv/astro-ph:0706.4484

Spectroscopic Surveys: Present by C. Yip gives an overview of recent spectroscopic sky surveys and spectral analysis techniques in the context of Virtual Observatories (VO). Besides noting that the number of spectroscopic redshift measurements grows like Moore’s law, it observes that surveys tend to go deeper and aim for completeness. Mainly elliptical galaxy formation has been studied, because ellipticals are more abundant than spirals, and the galactic bimodality in color-color or color-magnitude diagrams is described as the result of gas-rich mergers, in which blue galaxies merge to form the red sequence. Principal component analysis has incorporated ratios of emission line strengths for classifying Type-II AGN and star-forming galaxies. Lyα identifies high-z quasars, and other spectral patterns across z reveal the history of the early universe and the characteristics of quasars. Also, the recent discovery of 10 satellites of the Milky Way is mentioned.
Continue reading ‘[ArXiv] Spectroscopic Survey, June 29, 2007’ »

Tags: bimodality, chi-square minimization, Classification, CMD, Estimation, machine learning, massive data, model based, PCA, spectral analysis, spectroscopic, survey, VO
Category: arXiv, Astro, Bayesian, Data Processing, Fitting, Frequentist, Methods, Spectral | Comment
