Archive for the ‘Jargon’ Category.

Line Emission [EotW]

Spectral lines are a ubiquitous feature of astronomical data. This week, we explore the special case of optically thin emission from low-density and high-temperature plasma, and consider the component factors that determine the line intensity. Continue reading ‘Line Emission [EotW]’ »

Equation of the Week: Confronting data with model

Starting a new feature — highlighting some equation that is widely used in astrophysics or astrostatistics. Today’s featured equation: what instruments do to incident photons. Continue reading ‘Equation of the Week: Confronting data with model’ »

[ArXiv] 1st week, Apr. 2008

I’m very curious how astronomers began to use “Monte Carlo Markov chain” instead of “Markov chain Monte Carlo.” The more popular the method becomes, the more frequently “Monte Carlo Markov chain” appears. Anyway, this week I added some non-astrostatistical papers to the list: a tutorial, the big bang, and biblical theology. Continue reading ‘[ArXiv] 1st week, Apr. 2008’ »

[ArXiv]4th week, Mar. 2008

The number of astro-ph preprints has been decreasing on average, and so have my hours of reading abstracts…. cool!!! By the way, there is a paper about the solar cycle, PCA, ICA, and the Lomb-Scargle periodogram. Continue reading ‘[ArXiv]4th week, Mar. 2008’ »

[ArXiv] 3rd week, Mar. 2008

Markov chain Monte Carlo (MCMC) never misses a week in recent astro-ph postings. A book titled MCMC in Astronomy would be a best seller. There are, in addition, some very interesting non-MCMC preprints. Continue reading ‘[ArXiv] 3rd week, Mar. 2008’ »

Eddington versus Malmquist

During the run-up to his recent talk on logN-logS, Andreas mentioned how sometimes people are confused about the variety of statistical biases that afflict surveys. They usually know what the biases are, but often tend to mislabel them, especially the Eddington and Malmquist types. Sort of like using “your” and “you’re” interchangeably, which to me is like nails on a blackboard. So here’s a brief summary: Continue reading ‘Eddington versus Malmquist’ »
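To make the distinction concrete, here is a minimal Monte Carlo sketch (mine, not from the post; the luminosity function, flux limit, and noise level are all made up for illustration). Malmquist bias: a flux-limited survey over-represents intrinsically luminous sources. Eddington bias: noise acting on steeply rising source counts scatters more faint sources up than bright sources down, inflating measured fluxes near the detection limit.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200_000

# --- Malmquist bias: a flux-limited survey over-represents luminous sources ---
L = rng.lognormal(mean=0.0, sigma=1.0, size=n)   # toy luminosity function (made up)
d = 10.0 * rng.uniform(size=n) ** (1.0 / 3.0)    # uniform-in-volume distances out to d = 10
flux = L / (4.0 * np.pi * d**2)
detected = flux > 0.01                           # arbitrary flux limit
print(f"mean L, all sources:      {L.mean():.2f}")
print(f"mean L, detected sources: {L[detected].mean():.2f}   (higher: Malmquist)")

# --- Eddington bias: noise plus steeply rising source counts inflates fluxes ---
gamma = 2.5                                      # dN/dS ~ S^-gamma, a typical logN-logS slope
S_true = (1.0 - rng.uniform(size=n)) ** (-1.0 / (gamma - 1.0))   # power law above S = 1
S_meas = S_true + rng.normal(0.0, 0.5, size=n)   # additive Gaussian measurement noise
in_bin = (S_meas > 2.0) & (S_meas < 2.5)         # a measured-flux bin near the faint end
print(f"mean true flux in bin:     {S_true[in_bin].mean():.2f}")
print(f"mean measured flux in bin: {S_meas[in_bin].mean():.2f}   (higher: Eddington)")
```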

[ArXiv] 1st week, Mar. 2008

Irrelevant to astrostatistics, but interesting for baseball lovers:
    [stat.AP:0802.4317] Jensen, Shirley, & Wyner
    Bayesball: A Bayesian Hierarchical Model for Evaluating Fielding in Major League Baseball

With the 5th-year WMAP data release there were many WMAP-related papers; the most statistical among them are listed. Continue reading ‘[ArXiv] 1st week, Mar. 2008’ »

Dance of the Errors

One of the big problems that has come up in recent years is how to represent the uncertainty in certain estimates. Astronomers usually present errors as ±stddev on the quantities of interest, but that presupposes that the errors are uncorrelated. But suppose you are estimating a multi-dimensional set of parameters that may have large correlations amongst themselves? One such case is that of Differential Emission Measures (DEM), where the “quantity of emission” from a plasma (loosely, how much stuff there is available to emit; it is the product of the volume and the densities of electrons and H) is estimated for different temperatures. See the plots at the PoA DEM tutorial for examples of how we are currently trying to visualize the error bars. Another example is the correlated systematic uncertainties in effective areas (Drake et al., 2005, Chandra Cal Workshop). This is not dissimilar to the problem of determining the significance of a “feature” in an image (Connors, A. & van Dyk, D.A., 2007, SCMA IV). Continue reading ‘Dance of the Errors’ »
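As a toy illustration of why the ±stddev convention can mislead (a made-up two-parameter example, not the DEM calculation itself): when parameters are strongly correlated, the marginal error bars look exactly as they would in the uncorrelated case, yet the uncertainty on any derived quantity can be wildly different.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two parameters with a strong positive correlation, as adjacent DEM
# temperature bins often have (the 0.9 correlation is made up):
cov = np.array([[1.0, 0.9],
                [0.9, 1.0]])
x, y = rng.multivariate_normal([0.0, 0.0], cov, size=100_000).T

# The marginal error bars are blind to the correlation...
print(f"std(x) = {x.std():.2f}, std(y) = {y.std():.2f}")

# ...but any derived quantity feels it strongly:
print(f"std(x - y), actual:                      {np.std(x - y):.2f}")   # ~0.45
print(f"std(x - y), assuming independent errors: {np.hypot(x.std(), y.std()):.2f}")  # ~1.41
```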

~ Avalanche(a,b)

Avalanches are a common process, occurring anywhere that a system can store stress temporarily without “snapping”. They can happen in sand dunes and in solar flares as easily as on the snow-bound Alps.

Melatos, Peralta, & Wyithe (arXiv:0710.1021) have a nice summary of avalanche processes in the context of pulsar glitches. Their primary purpose is to show that the glitches are indeed consistent with an avalanche process, and along the way they give a highly readable description of what an avalanche is and what it entails. Briefly, avalanches result in event parameters that are distributed in scale-invariant fashion (read: power laws) with exponential waiting-time distributions (i.e., Poisson).

Hence the title of this post: the “Avalanche distribution” (indulge me! I’m using stats notation to bury complications!) can be thought of as having two parameters, a and b, the indices of the power-law distributions that control the event sizes and the event durations respectively, with the event separations distributed as an exponential decay. Is there a canned statistical distribution that already describes all this? (In our work modeling stellar flares, we assumed that b=0 and found that a<-2, a power law falling off more steeply than index 2, which has all sorts of nice consequences for coronal heating processes.)
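For what it’s worth, here is a minimal sketch of what drawing from such an “Avalanche(a, b)” might look like. The parameterization (pdf ∝ x^index with index < −1, sampled by inverse CDF) and all numerical values are my assumptions for illustration, not anything canned.

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_avalanche(a, b, rate, n, e_min=1.0, t_min=1.0):
    """Toy 'Avalanche(a, b)' draw: sizes with pdf(E) ~ E^a above e_min
    (needs a < -1), durations with pdf(T) ~ T^b above t_min (needs b < -1),
    and exponential (Poisson-process) waiting times between events."""
    sizes = e_min * (1.0 - rng.uniform(size=n)) ** (1.0 / (a + 1.0))     # inverse CDF
    durations = t_min * (1.0 - rng.uniform(size=n)) ** (1.0 / (b + 1.0))
    waits = rng.exponential(1.0 / rate, size=n)
    return sizes, durations, waits

# a < -2 as in the flare result; b and rate are made-up illustrative values.
sizes, durations, waits = sample_avalanche(a=-2.2, b=-1.5, rate=0.1, n=100_000)
print(f"median size: {np.median(sizes):.2f}, mean wait: {waits.mean():.1f} (expect 10.0)")
```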

model vs model

As Alanna pointed out, astronomers and statisticians mean different things when they say “model”. To complicate matters, we have also started to use another term called “data model”. Continue reading ‘model vs model’ »

ab posteriori ad priori

A great advantage of Bayesian analysis, they say, is the ability to propagate the posterior. That is, if we derive a posterior probability distribution function for a parameter using one dataset, we can apply that as the prior when a new dataset comes along, and thereby improve our estimates of the parameter and shrink the error bars.

But how exactly does it work? I asked Tom Loredo about this, in the context of some strange behavior of sequential applications of BEHR that Ian Evans had noticed: specifically, sequential applications of BEHR, each using as its prior the posterior from the preceding dataset, seemed to depend on the order in which the datasets were considered. (That, as it happens, arose from approximating the posterior distribution before passing it on as the prior to the next stage, a feature that has since been corrected.) This is what he said:

Yes, this is a simple theorem. Suppose you have two data sets, D1 and D2, hypotheses H, and background info (model, etc.) I. Considering D2 to be the new piece of info, Bayes’s theorem is:

[1]

p(H|D1,D2) = p(H|D1) p(D2|H, D1)            ||  I
             -------------------
                    p(D2|D1)

where the “|| I” on the right is the “Skilling conditional” indicating that all the probabilities share an “I” on the right of the conditioning solidus (in fact, they also share a D1).

We can instead consider D1 to be the new piece of info; BT then reads:

[2]

p(H|D1,D2) = p(H|D2) p(D1|H, D2)            ||  I
             -------------------
                    p(D1|D2)

Now go back to [1], and use BT on the p(H|D1) factor:

p(H|D1,D2) = p(H) p(D1|H) p(D2|H, D1)            ||  I
             ------------------------
                    p(D1) p(D2|D1)

           = p(H, D1, D2)
             ------------      (by the product rule)
                p(D1,D2)

Do the same to [2]: use BT on the p(H|D2) factor:

p(H|D1,D2) = p(H) p(D2|H) p(D1|H, D2)            ||  I
             ------------------------
                    p(D2) p(D1|D2)

           = p(H, D1, D2)
             ------------      (by the product rule)
                p(D1,D2)

So the results from the two orderings are the same. In fact, in the Cox-Jaynes approach, the “axioms” of probability aren’t axioms, but get derived from desiderata that guarantee this kind of internal consistency of one’s calculations. So this is a very fundamental symmetry.

Note that you have to worry about possible dependence between the data (i.e., p(D2|H, D1) appears in [1], not just p(D2|H)). In practice, separate data are often independent (conditional on H), so p(D2|H, D1) = p(D2|H) (i.e., if you consider H as specified, then D1 tells you nothing about D2 that you don’t already know from H). This is the case, e.g., for basic iid normal data, or Poisson counts. But even in these cases dependences might arise, e.g., if there are nuisance parameters that are common for the two data sets (if you try to combine the info by multiplying *marginalized* posteriors, you may get into trouble; you may need to marginalize *after* multiplying if nuisance parameters are shared, or account for dependence some other way).
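Here is a minimal numerical check of the order-invariance (the conjugate Gamma-Poisson model is my choice for illustration, not from Tom’s email). Because the sequential posteriors here are carried forward exactly rather than approximated, the order dependence Ian noticed cannot arise:

```python
# Exact sequential Bayesian updating is order-invariant.  With a conjugate
# Gamma(alpha, beta) prior on a Poisson rate, the posterior after observing
# iid counts is Gamma(alpha + sum(counts), beta + n), so the bookkeeping
# is explicit:

def update(alpha, beta, counts):
    """Exact posterior hyperparameters after iid Poisson counts."""
    return alpha + sum(counts), beta + len(counts)

D1, D2 = [3, 7, 4], [5, 6]                  # two made-up datasets
prior = (1.0, 1.0)                          # Gamma(1, 1) prior on the rate

post_12 = update(*update(*prior, D1), D2)   # D1 first, its posterior as prior for D2
post_21 = update(*update(*prior, D2), D1)   # D2 first
post_all = update(*prior, D1 + D2)          # both datasets at once

print(post_12, post_21, post_all)           # identical: (26.0, 6.0) three times
```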

What if you had 3, 4, …, N observations? Does the order in which you apply BT affect the results?

No, as long as you use BT correctly and don’t ignore any dependences that might arise.

If not, is there a prescription for the Right Thing [TM] to do?

Always obey the laws of probability theory! 9-)

P Values: What They Are and How to Use Them

After following the recent discussion among CHASC members, the following paper came to mind:
P Values: What They Are and How to Use Them by Luc Demortier.
Continue reading ‘P Values: What They Are and How to Use Them’ »

When you observed zero counts, you didn’t not observe any counts

Dong-Woo, who has been playing with BEHR, noticed that the confidence bounds quoted on the source intensities seem to be unchanged when the source counts are zero, regardless of what the background counts are set to. That is, p(s|NS,NB) is invariant when NS=0, for any value of NB. This seems a bit odd, because one naively expects that as NB increases, it should become more and more likely that s is close to 0. Continue reading ‘When you observed zero counts, you didn’t not observe any counts’ »
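One way to see why: with zero observed counts in the source region, the likelihood factorizes as exp(-s)·exp(-b)·Poisson(NB; r·b), so integrating out the background intensity b contributes only a constant in s and the posterior on s cannot depend on NB. A toy grid computation (my construction, with made-up grids and a flat prior, not BEHR’s actual algorithm) shows the invariance:

```python
import numpy as np
from scipy.stats import poisson

# Toy version of the setup: source-region counts N ~ Poisson(s + b),
# background-region counts NB ~ Poisson(r * b), on a grid with flat priors.
s = np.linspace(0.0, 10.0, 400)[:, None]    # source intensity (column)
b = np.linspace(0.01, 10.0, 400)[None, :]   # background intensity (row)
r = 1.0                                     # background/source area ratio (made up)

def posterior_s(N, NB):
    joint = poisson.pmf(N, s + b) * poisson.pmf(NB, r * b)
    marg = joint.sum(axis=1)                # integrate out the background b
    return marg / marg.sum()

# With N = 0, the b-integral is a constant in s, so p(s | N=0, NB) is the
# same curve for every NB:
for NB in (0, 10, 100):
    p = posterior_s(0, NB)
    print(f"NB = {NB:3d}: posterior mean of s = {(p * s.ravel()).sum():.3f}")
```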

[ArXiv] NGC 6397 Deep ACS Imaging, Aug. 29, 2007

From arxiv/astro-ph:0708.4030v1
Deep ACS Imaging in the Globular Cluster NGC 6397: The Cluster Color Magnitude Diagram and Luminosity Function by H.B. Richer et al.

This paper presents an observational study of the globular cluster NGC 6397 that is enhanced and more informative than previous observations, in the sense that 1) a truncation in the white dwarf cooling sequence occurs at magnitude 28, 2) the cluster main sequence seems to terminate approximately at the hydrogen-burning limit predicted by two independent stellar evolution models, and 3) luminosity functions (LFs) and mass functions (MFs) are well defined. Nothing statistical, but the idea of defining color magnitude diagrams (CMDs) and LFs described in the paper, together with the improved measurements (ACS imaging) of stars in NGC 6397, will assist in developing suitable statistics for CMD and LF fitting problems.
Continue reading ‘[ArXiv] NGC 6397 Deep ACS Imaging, Aug. 29, 2007’ »

[ArXiv] A Lecture Note, June 17, 2007

From arxiv/astro-ph:0706.1988,
Lectures on Astronomy, Astrophysics, and Cosmology looks helpful to statisticians who would like to learn astronomy, astrophysics, and cosmology. The lecture note starts by introducing the fundamentals of astronomy, UNITS!!!, and its history. It also explains astronomical measures such as distances and their units, luminosity, and temperature; the HR diagram (astronomers’ summary diagram); stellar evolution; and relevant topics in cosmology. At least a third of the article will be useful for grasping a rough idea of astronomy as a scientific subject beyond colorful pictures. Statisticians who are keen on cosmology are recommended to read further.

This is not a high-energy lecture note; statisticians interested in high-energy astrophysics are therefore encouraged to visit Astro Jargon for Statisticians and CHASC.