The AstroStat Slog

Archive for September 2007

[ArXiv] 4th week, Sept. 2007

Sep 30th, 2007| 11:29 pm | Posted by hlee

A few papers from astro-ph may draw statisticians’ attention, and a statistical paper may be helpful for astronomers keen on confidence intervals that utilize prior information.
Continue reading ‘[ArXiv] 4th week, Sept. 2007’ »

Category: arXiv | 1 Comment

ab posteriori ad priori

Sep 29th, 2007| 06:03 pm | Posted by vlk

A great advantage of Bayesian analysis, they say, is the ability to propagate the posterior. That is, if we derive a posterior probability distribution function for a parameter using one dataset, we can apply that as the prior when a new dataset comes along, and thereby improve our estimates of the parameter and shrink the error bars.

But how exactly does it work? I asked this of Tom Loredo in the context of some strange behavior of sequential applications of BEHR that Ian Evans had noticed. Specifically, sequential applications of BEHR, using the posterior from the preceding dataset as the prior, seemed to depend on the order in which the datasets were considered. (As it happens, this arose from approximating the posterior distribution before passing it on as the prior distribution to the next stage, a feature that has now been corrected.) This is what he said:

Yes, this is a simple theorem. Suppose you have two data sets, D1 and D2, hypotheses H, and background info (model, etc.) I. Considering D2 to be the new piece of info, Bayes’s theorem is:

[1]
p(H|D1,D2) = p(H|D1) p(D2|H, D1)            ||  I
             -------------------
                    p(D2|D1)
where the “|| I” on the right is the “Skilling conditional” indicating that all the probabilities share an “I” on the right of the conditioning solidus (in fact, they also share a D1).

We can instead consider D1 to be the new piece of info; BT then reads:

[2]
p(H|D1,D2) = p(H|D2) p(D1|H, D2)            ||  I
             -------------------
                    p(D1|D2)
Now go back to [1], and use BT on the p(H|D1) factor:
p(H|D1,D2) = p(H) p(D1|H) p(D2|H, D1)            ||  I
             ------------------------
                    p(D1) p(D2|D1)

           = p(H, D1, D2)
             ------------      (by the product rule)
                p(D1,D2)
Do the same to [2]: use BT on the p(H|D2) factor:
p(H|D1,D2) = p(H) p(D2|H) p(D1|H, D2)            ||  I
             ------------------------
                    p(D2) p(D1|D2)

           = p(H, D1, D2)
             ------------      (by the product rule)
                p(D1,D2)
So the results from the two orderings are the same. In fact, in the Cox-Jaynes approach, the “axioms” of probability aren’t axioms, but get derived from desiderata that guarantee this kind of internal consistency of one’s calculations. So this is a very fundamental symmetry.

Note that you have to worry about possible dependence between the data (i.e., p(D2|H, D1) appears in [1], not just p(D2|H)). In practice, separate data are often independent (conditional on H), so p(D2|H, D1) = p(D2|H) (i.e., if you consider H as specified, then D1 tells you nothing about D2 that you don’t already know from H). This is the case, e.g., for basic iid normal data, or Poisson counts. But even in these cases dependences might arise, e.g., if there are nuisance parameters that are common for the two data sets (if you try to combine the info by multiplying *marginalized* posteriors, you may get into trouble; you may need to marginalize *after* multiplying if nuisance parameters are shared, or account for dependence some other way).

What if you had 3, 4, ..., N observations? Does the order in which you apply BT affect the results?

No, as long as you use BT correctly and don’t ignore any dependences that might arise.

If not, is there a prescription on what is the Right Thing [TM] to do?

Always obey the laws of probability theory! 9-)
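
The order-independence argued above can be checked numerically. The following is a minimal sketch, not BEHR itself: it assumes Poisson counts with a single unknown rate, a flat prior on a finite grid, and two made-up data sets (D1, D2) that are independent given the rate. The function names, the grid, and the counts are all illustrative choices. The full gridded posterior, with no approximation, is passed on as the prior for the next stage.

```python
import math

def poisson_loglike(rate, counts, exposure):
    """Log-likelihood of independent Poisson counts for a given rate.
    Terms that do not depend on the rate are dropped."""
    return sum(n * math.log(rate * exposure) - rate * exposure for n in counts)

def update(grid, prior, counts, exposure):
    """One application of Bayes' theorem on a grid: posterior ∝ prior × likelihood."""
    logpost = [math.log(p) + poisson_loglike(r, counts, exposure)
               for r, p in zip(grid, prior)]
    peak = max(logpost)                      # subtract the peak for numerical stability
    unnorm = [math.exp(lp - peak) for lp in logpost]
    norm = sum(unnorm)
    return [u / norm for u in unnorm]

# Hypothetical setup: rate grid from 0.05 to 20 and a flat prior on that grid.
grid = [0.05 * (i + 1) for i in range(400)]
flat = [1.0 / len(grid)] * len(grid)

# Two made-up data sets, assumed independent given the rate.
D1 = [3, 5, 4]
D2 = [7, 2]

# Use the posterior from one data set as the prior for the other, in both orders.
post_12 = update(grid, update(grid, flat, D1, 1.0), D2, 1.0)
post_21 = update(grid, update(grid, flat, D2, 1.0), D1, 1.0)

max_diff = max(abs(a - b) for a, b in zip(post_12, post_21))
print(f"largest difference between the two orderings: {max_diff:.2e}")
```

Any difference between the two orderings here comes only from floating-point rounding. Replacing the intermediate posterior with an approximation before reusing it as a prior is the kind of step that can break this symmetry.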

Tags: Bayes Theorem, Bayesian, BEHR, prior, prior propagation, Tom Loredo
Category: Bayesian, Jargon, Quotes, Stat | Comment

P Values: What They Are and How to Use Them

Sep 27th, 2007| 01:33 pm | Posted by hlee

After observing the recent discussion among CHASC, the following paper came to mind:
P Values: What They Are and How to Use Them, by Luc Demortier.
Continue reading ‘P Values: What They Are and How to Use Them’ »

Category: arXiv, Cross-Cultural, High-Energy, Jargon, Stat | Comment

[ArXiv] 3rd week, Sept. 2007

Sep 24th, 2007| 11:13 am | Posted by hlee

In addition to Short Timescale Coronal Variability in Capella [astro-ph:0709.3093], a few statistically interesting preprints came out during the 3rd week of Sept. Continue reading ‘[ArXiv] 3rd week, Sept. 2007’ »

Category: arXiv | Comment

When you observed zero counts, you didn’t not observe any counts

Sep 23rd, 2007| 08:28 pm | Posted by vlk

Dong-Woo, who has been playing with BEHR, noticed that the confidence bounds quoted on the source intensities seem to be unchanged when the source counts are zero, regardless of what the background counts are set to. That is, p(s|N_S,N_B) is invariant when N_S=0, for any value of N_B. This seems a bit odd, because [naively] one expects that as N_B increases, it should/ought to get more and more likely that s gets closer to 0. Continue reading ‘When you observed zero counts, you didn’t not observe any counts’ »
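
The invariance noted above can be illustrated with a small numerical sketch. It is not BEHR: it assumes the simplest source-plus-background model, N_S ~ Poisson(s + b) and N_B ~ Poisson(r b), with independent flat priors on s and b over a finite grid. The function name, the grid limits, and the area ratio r are illustrative assumptions. Under these assumptions, when N_S = 0 the likelihood factorizes into a part in s and a part in b, so the marginal posterior of s does not depend on N_B.

```python
import math

def marginal_posterior_s(n_src, n_bkg, area_ratio=1.0,
                         s_max=10.0, b_max=100.0, steps=400):
    """Marginal posterior of the source intensity s on a grid.

    Assumed model (an illustration, not BEHR's actual implementation):
      n_src ~ Poisson(s + b), n_bkg ~ Poisson(area_ratio * b),
    with flat priors on s in (0, s_max) and b in (0, b_max).
    """
    s_grid = [s_max * (i + 0.5) / steps for i in range(steps)]
    b_grid = [b_max * (j + 0.5) / steps for j in range(steps)]
    # The background-region term depends on b only, so compute it once.
    log_bkg = [n_bkg * math.log(area_ratio * b) - area_ratio * b for b in b_grid]
    post = []
    for s in s_grid:
        total = 0.0
        for b, lb in zip(b_grid, log_bkg):
            # Source-region term plus background-region term, summed over b.
            total += math.exp(n_src * math.log(s + b) - (s + b) + lb)
        post.append(total)
    norm = sum(post)
    return s_grid, [p / norm for p in post]

def max_abs_diff(p, q):
    return max(abs(a - b) for a, b in zip(p, q))

# Zero source counts: compare very different background counts.
_, zero_nb0 = marginal_posterior_s(0, 0)
_, zero_nb5 = marginal_posterior_s(0, 5)
_, zero_nb50 = marginal_posterior_s(0, 50)
diff_zero = max(max_abs_diff(zero_nb0, zero_nb5), max_abs_diff(zero_nb0, zero_nb50))

# Non-zero source counts, for contrast: the background now matters.
_, three_nb0 = marginal_posterior_s(3, 0)
_, three_nb50 = marginal_posterior_s(3, 50)
diff_nonzero = max_abs_diff(three_nb0, three_nb50)

print(f"N_S = 0: largest change in p(s) across N_B values = {diff_zero:.2e}")
print(f"N_S = 3: largest change in p(s) between N_B = 0 and 50 = {diff_nonzero:.2e}")
```

With N_S = 0 the source-region term reduces to exp(-(s + b)), which separates into exp(-s) times a factor in b alone, so everything involving N_B cancels when p(s) is normalized.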

Tags: Banff Challenge, Bayesian, BEHR, Dong-Woo Kim, model comparison, posterior probability, zero counts
Category: Bayesian, Data Processing, Jargon, Stat | 7 Comments

Betraying your heritage

Sep 20th, 2007| 12:26 pm | Posted by vlk

[arXiv:0709.3093v1] Short Timescale Coronal Variability in Capella (Kashyap & Posson-Brown)

We recently submitted that paper to AJ, and rather ironically, I did the analysis during the same time frame as this discussion was going on, about how astronomers cannot rely on repeating observations. Ironic because the result reported there hinges on the existence of a small but persistent signal that is found in repeated observations of the same source. Doubly ironic in fact, in that just as we were backing and forthing about cultural differences I seemed to have gone and done something completely contrary to my heritage! Continue reading ‘Betraying your heritage’ »

Tags: arXiv, Capella, Cross-Cultural, HRC, overdispersion, repeatability, significance, variability, X-ray
Category: arXiv, Astro, Cross-Cultural, Quotes, Stars, Stat, Timing, X-ray | Comment

Spurious Sources

Sep 19th, 2007| 02:21 pm | Posted by vlk

[arXiv:0709.2358] Cleaning the USNO-B Catalog through automatic detection of optical artifacts, by Barron et al.

Statistically speaking, “false sources” are generally in the domain of Type I errors, defined by the probability of detecting a signal where there is none. But what if there is a clear signal, but it is not real? Continue reading ‘Spurious Sources’ »

Tags: arXiv, catalog, diffraction spikes, false sources, instrumental features, Stars, USNO
Category: arXiv, Astro, Data Processing, Imaging, Optical, Stars, Uncertainty | 2 Comments

VOConvert (ConVOT)

Sep 17th, 2007| 03:36 pm | Posted by hlee

VOConvert or ConVOT is a small Java script that converts file formats from FITS to ASCII or the other way around. These tools might be useful for statisticians who want to quickly convert the astronomers’ data format, called FITS, into ASCII for a statistical analysis. Additionally, VOConvert creates an interim output for VOStat, which is designed for statistical analysis of data from the Virtual Observatory. The software and the list of Virtual Observatories around the world can be found at Virtual Observatory India. Please check the link in VOStat (http://hea-www.harvard.edu/AstroStat/slog/2007/vostat) for more information about VOStat.

Tags: FITS
Category: Algorithms, Astro, Cross-Cultural, Data Processing | Comment

PHYSTAT-LHC 2007

Sep 15th, 2007| 01:33 am | Posted by hlee

It occurred to me that some useful materials related to the Chandra calibration problem, on which CHASC is working, could be found at the PHYSTAT conferences. Thanks to the advanced technologies recently adopted by physicists (I haven’t seen any statistical conference offer what I obtained from PHYSTAT-LHC 2007), I had a chance to go through some video files from PHYSTAT-LHC 2007. The files are recorded lectures and lecture notes, and they are available from the PHYSTAT-LHC 2007 Program.
Continue reading ‘PHYSTAT-LHC 2007’ »

Category: Cross-Cultural, News, Quotes | Comment

[ArXiv] 2nd week, Sept. 2007

Sep 14th, 2007| 08:46 pm | Posted by hlee

In addition to preprints discussed in [ArXiv] Swift and XMM measurement errors, [ArXiv] SVM and galaxy morphological classification, Sept. 10, 2007, and [ArXiv] Recent bayesian studies from astro-ph, I wish to point out a few more from this week.
Continue reading ‘[ArXiv] 2nd week, Sept. 2007’ »

Category: Algorithms, arXiv, Astro, Stat | Comment

How to subscribe to the arXiv email list service

Sep 12th, 2007| 05:50 pm | Posted by hlee

Over the years, I have noticed an exponential increase in statistical applications in astronomical papers. Keeping track of them and writing summaries based on daily arXiv updates for the slog over the past few months has become quite an overwhelming task for a single person. Therefore, instead of offering fish, I have decided to show how to catch fish.
Continue reading ‘How to subscribe to the arXiv email list service’ »

Tags: email, fish, net, subscription
Category: arXiv | Comment

Visualizing Astronomy

Sep 12th, 2007| 05:21 pm | Posted by vlk

The CXC Education & Outreach Program at the CfA hosts a series of lectures on Visualizing Astronomy, and the first of this season’s is scheduled for Sep 18 at 1:30pm at Phillips:

Date & Time: Tuesday, September 18, 1:30pm
Location: Phillips Auditorium
Speaker: Alyssa Goodman (Harvard)
Title: Amazing New Tools for Exploring Astronomical Data

Continue reading ‘Visualizing Astronomy’ »

Category: Astro, Imaging, News | Comment

[ArXiv] SVM and galaxy morphological classification, Sept. 10, 2007

Sep 12th, 2007| 04:31 pm | Posted by hlee

From arxiv/astro-ph:0709.1359,
A robust morphological classification of high-redshift galaxies using support vector machines on seeing limited images. I Method description by M. Huertas-Company et al.

Machine learning and statistical learning are becoming more and more popular in astronomy. Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are rarely absent when the objective is classification of massive survey data. The authors provide a gentle tutorial on SVM for galaxy morphological classification. Their source code, GALSVM, is linked for interested readers.
Continue reading ‘[ArXiv] SVM and galaxy morphological classification, Sept. 10, 2007’ »

Tags: Classification, learning, morphology, SVM
Category: Algorithms, arXiv, Astro, Galaxies, Methods | Comment

[ArXiv] Bimodal Color Distribution in GCS, Sept. 7, 2007

Sep 12th, 2007| 03:56 pm | Posted by hlee

From arxiv/astro-ph:0709.1073v1
On the Metallicity-Color Relations and Bimodal Color Distributions in Extragalactic Globular Cluster Systems by M. Cantiello and J. P. Blakeslee

Many observations of globular cluster systems (GCS) show bimodal distributions in color and metallicity space. The authors discuss the complication of non-linear metallicity-color relations and present a careful study to suggest the optimal color(s) for revealing the presence of real bimodal GC metallicity distributions. Based on their simulation study, (V-H) and (V-K) are confirmed to be good colors for revealing unbiased bimodal metallicity distributions in GCS.
Continue reading ‘[ArXiv] Bimodal Color Distribution in GCS, Sept. 7, 2007’ »

Category: arXiv, Astro, Fitting | Comment

[ArXiv] Recent bayesian studies from astro-ph

Sep 11th, 2007| 03:38 am | Posted by hlee

In the past month, I’ve noticed papers appearing relatively frequently on arxiv/astro-ph whose titles include Bayesian or Markov Chain Monte Carlo (MCMC). Those papers are:

[astro-ph:0709.1058v1] Joint Bayesian Component Separation and CMB Power Spectrum Estimation by H.K.Eriksen et al.
[astro-ph:0709.1104v1] Monolithic or hierarchical star formation? A new statistical analysis by M. Kampakoglou, R. Trotta, and J. Silk
[astro-ph:0411573v2] A Bayesian analysis of the primordial power spectrum by M.Bridges, A.N.Lasenby, M.P.Hobson
[astro-ph:0709.0596v1] Bayesian inversion of Stokes profiles by A. A. Ramos, M.J.M. Gonzales, and J.A. Rubino-Martin
[astro-ph:0709.0711v1] Bayesian posterior classification of planetary nebulae according to the Peimbert types by C. Quireza, H.J.Rocha-Pinto, and W.J. Maciel
[astro-ph:0708.2340v1] Bayesian Galaxy Shape Measurement for Weak Lensing Surveys -I. Methodology and a Fast Fitting Algorithm by L. Miller et al.
[astro-ph:0708.1871v1] Dark energy and cosmic curvature: Monte-Carlo Markov Chain approach by Y. Gong et al.

Continue reading ‘[ArXiv] Recent bayesian studies from astro-ph’ »

Category: arXiv, Bayesian, MCMC, Methods, Stat | Comment