The AstroStat Slog

Archive for the ‘Uncertainty’ Category.

[Book] The Elements of Statistical Learning, 2nd Ed.

Jul 22nd, 2010| 09:25 am | Posted by hlee

This was written more than a year ago, and I forgot to post it.
Continue reading ‘[Book] The Elements of Statistical Learning, 2nd Ed.’ »

Tags: book, Brieman, cigar, Clinton, data mining, Friedman, Hastie, KDD, light curve, machine learning, SCMA, shaking hands, SN, statistical learning, Supernova, Tibshirani
Category: Algorithms, Cross-Cultural, High-Energy, Jargon, Methods, Quotes, Stat, Uncertainty | Comment

An Instructive Challenge

Jun 15th, 2010| 02:38 pm | Posted by vlk

This question came to the CfA Public Affairs office, and I am sharing it with y’all because I think the solution is instructive.

A student had to figure out the name of a stellar object as part of an assignment. He was given the following information about it:

apparent [V] magnitude = 5.76
B-V = 0.02
E(B-V) = 0.00
parallax = 0.0478 arcsec
radial velocity = -18 km/s
redshift = 0 km/s

He looked in all the stellar databases but was unable to locate it, so he asked the CfA for help.

Just to help you out, here are a couple of places where you can find comprehensive online catalogs:

See if you can find it!
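
As a starting point, the listed parallax and apparent magnitude can be turned into a distance and an absolute magnitude before searching any catalog. The sketch below is only an illustration: the function names are hypothetical, and the conversion from E(B-V) to visual extinction uses R_V = 3.1, a commonly quoted value that is an assumption here and not something given in the assignment.

```python
import math


def distance_pc(parallax_arcsec):
    """Distance in parsecs from a parallax in arcseconds (d = 1 / p)."""
    return 1.0 / parallax_arcsec


def absolute_magnitude(apparent_mag, parallax_arcsec, ebv=0.0, r_v=3.1):
    """Absolute magnitude from the distance modulus.

    M = m + 5 + 5 log10(p) - A_V, with A_V = R_V * E(B-V).
    R_V = 3.1 is an assumed, commonly used value.
    """
    a_v = r_v * ebv
    return apparent_mag + 5.0 + 5.0 * math.log10(parallax_arcsec) - a_v


# Values quoted in the assignment
d = distance_pc(0.0478)
M_V = absolute_magnitude(5.76, 0.0478, ebv=0.00)
print(f"distance = {d:.2f} pc, M_V = {M_V:.2f}")
```

Comparing the resulting absolute magnitude with what the B-V color would suggest for a normal star is one way to judge whether the listed numbers hang together.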

Continue reading ‘An Instructive Challenge’ »

Tags: astro catalogs, Challenge, data, question
Category: Astro, Jargon, Objects, Stars, Uncertainty | Comment

A short note on Probability for astronomers

Dec 27th, 2009| 10:13 pm | Posted by hlee

I am often irked whenever I see a function normalized over a feasible parameter space and then used as a probability density function (pdf) for further statistical inference. To be a suitable pdf, the function has to be normalized over a measurable space, not over a feasible space. Such practice often yields biased best fits (biased estimators) and improper error bars. On the other hand, validating a measurable space under physics seems complicated. To be precise, we are often lost in translation. Continue reading ‘A short note on Probability for astronomers’ »

Tags: axiom, curriculum, education, google university, hope, measurable, probability
Category: Algorithms, arXiv, Cross-Cultural, Jargon, Methods, Quotes, Stat, Uncertainty | Comment

From Quantile Probability and Statistical Data Modeling

Nov 21st, 2009| 05:06 am | Posted by hlee

by Emanuel Parzen in Statistical Science 2004, Vol 19(4), pp.652-662 JSTOR

I teach that statistics (done the quantile way) can be simultaneously frequentist and Bayesian, confidence intervals and credible intervals, parametric and nonparametric, continuous and discrete data. My first step in data modeling is identification of parametric models; if they do not fit, we provide nonparametric models for fitting and simulating the data. The practice of statistics, and the modeling (mining) of data, can be elegant and provide intellectual and sensual pleasure. Fitting distributions to data is an important industry in which statisticians are not yet vendors. We believe that unifications of statistical methods can enable us to advertise, “What is your question? Statisticians have answers!”

I couldn’t help liking this paragraph because of its bitter-sweetness. I hope you appreciate it as much as I did.

Tags: modeling, Parzen, quantile
Category: arXiv, Bayesian, Fitting, Frequentist, Jargon, Methods, Stat, Uncertainty | Comment

The chance that A has nukes is p%

Oct 23rd, 2009| 12:26 pm | Posted by hlee

I watched a movie in which one of the characters said, “country A has nukes with 80% chance” (perhaps not 80%, but it was a high percentage). One of the statements in that episode is that people will not eat lettuce even if only a 1% chance of E. coli, or even lower, is reported. Therefore, with such a high chance of having nukes, it is right to send troops to A. This episode immediately made me think of astronomers’ null hypothesis probability and their ways of drawing conclusions from chi-square goodness-of-fit tests, likelihood ratio tests, or F-tests.

First of all, I’d like to ask how you would estimate the chance of a country having nukes. What does this 80% imply here? But before getting to that question, I’d like to discuss computing the chance of E. coli infection first. Continue reading ‘The chance that A has nukes is p%’ »

Tags: chances, chi-square statistic, composite likelihood, delta chi-square, F-test, fiducial likelihood, likelihood, LRT, p-value, posterior, prior
Category: Bayesian, Cross-Cultural, Fitting, Frequentist, Misc, Quotes, Uncertainty | Comment

Scatter plots and ANCOVA

Oct 15th, 2009| 06:46 pm | Posted by hlee

Astronomers rely on scatter plots to illustrate correlations and trends among many pairs of variables more than any other scientists[1]. Pages of scatter plots with regression lines are common, in which the slope of the regression line and the error bars serve as indicators of the degree of correlation. Sometimes, too many such scatter plots make me think that, overall, the resources for drawing nice scatter plots and the paper on which those plots are printed are wasted. Why not just compute correlation coefficients and their errors, and publish the processed data used for computing the correlations, not the full data, so that others can verify the computed results for the sake of validation? A couple of scatter plots are fine, but when I see dozens of them, I lose my focus. This is another cultural difference. Continue reading ‘Scatter plots and ANCOVA’ »

[1] This is not a definitive statement but a personal impression from reading articles in various fields in addition to astronomy. My reading in other fields suggests that many rely on correlation statistics and less on scatter plots with straight lines drawn through the data sets to impose relationships within variable pairs.
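
To make the suggestion of reporting a correlation coefficient and its error concrete, here is a minimal sketch. It computes the Pearson correlation coefficient and an approximate interval from the Fisher z-transform, which assumes roughly bivariate normal data. The function names and the toy data are hypothetical, and the Fisher approximation is a choice made for this illustration, not a method taken from the post.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_interval(r, n, z_crit=1.96):
    """Approximate interval for r via the Fisher z-transform.

    Uses se(z) = 1 / sqrt(n - 3), so it needs n > 3 and is only an
    approximation, reasonable for roughly bivariate normal data.
    """
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Toy data, invented for illustration only
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
r = pearson_r(x, y)
lo, hi = fisher_interval(r, len(x))
print(f"r = {r:.3f}, approximate 95% interval = ({lo:.3f}, {hi:.3f})")
```

With only five points the interval is very wide, which a fitted line on a scatter plot can easily hide.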

Tags: ANCOVA, ANOVA, approximation, correlation, Gaussianity, graphics, MADS, modeling, nonparametric, parallel coordinates, PCA, quality, quantity, regression, scatter plots
Category: arXiv, Cross-Cultural, Fitting, Jargon, Methods, Stat, Uncertainty | Comment

[MADS] ARCH

Sep 4th, 2009| 01:30 pm | Posted by hlee

ARCH (autoregressive conditional heteroscedasticity) is a statistical model that takes the variance of the current error term to be a function of the sizes of the previous time periods’ error terms (typically their squares). I heard that this model made Prof. Engle a Nobel prize recipient. Continue reading ‘[MADS] ARCH’ »
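
The definition above can be sketched with the simplest case, ARCH(1), where the conditional variance is sigma_t^2 = alpha0 + alpha1 * eps_{t-1}^2 and eps_t = sigma_t * z_t with standard normal z_t. The function name, the parameter values, and the seed below are assumptions chosen for illustration only.

```python
import math
import random


def simulate_arch1(n, alpha0, alpha1, seed=0):
    """Simulate n error terms from an ARCH(1) process.

    sigma_t^2 = alpha0 + alpha1 * eps_{t-1}^2,  eps_t = sigma_t * z_t,
    where z_t is standard normal. Requires alpha0 > 0 and 0 <= alpha1 < 1.
    """
    rng = random.Random(seed)
    eps_prev = 0.0  # arbitrary starting value; its effect fades quickly
    series = []
    for _ in range(n):
        sigma2 = alpha0 + alpha1 * eps_prev ** 2  # conditional variance
        eps_prev = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        series.append(eps_prev)
    return series


# Illustrative parameters (assumed, not from the post)
alpha0, alpha1 = 0.7, 0.3
series = simulate_arch1(50_000, alpha0, alpha1, seed=42)

# The error terms have zero mean, so the mean square estimates the variance
sample_var = sum(e * e for e in series) / len(series)
theory_var = alpha0 / (1.0 - alpha1)  # unconditional variance of ARCH(1)
print(f"sample variance = {sample_var:.3f}, theory = {theory_var:.3f}")
```

Although the unconditional variance is constant, large errors tend to be followed by large errors, which is the clustering behavior the model is meant to capture.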

Tags: ARCH, econometrics, Engle, GARCH, heteroscedasticity, MADS
Category: Cross-Cultural, Methods, Stat, Timing, Uncertainty | Comment

different views

Jul 12th, 2009| 07:33 pm | Posted by hlee

An email was forwarded with questions related to the data sets found in “Be an INTEGRAL astronomer”. Among the sets, the following scatter plot is based on the Crab data.

Continue reading ‘different views’ »

Tags: ANOVA, block design, crab nebula, F-test, gaussinity, integral, light curve
Category: Astro, Cross-Cultural, gamma-ray, High-Energy, Jargon, Objects, Uncertainty | Comment

how to trace?

Jun 11th, 2009| 03:52 pm | Posted by hlee

I was at the SUSY 09 public lecture given by a Nobel laureate, Frank Wilczek, known for QCD (quantum chromodynamics). As far as I know, SUSY is the abbreviation of SUperSYmmetry in particle physics. Finding such antimatter(? I’m afraid I read “Angels and Demons” too quickly) will explain the unification of the electromagnetic, weak, and strong forces, and even gravitation, according to the speaker’s graph. I’ll not go into the details of particle physics and the standard model. The reason is too obvious. Instead, I’d like to show this image from wikipedia and to discuss my related questions.
[Image: particle_trace]
Continue reading ‘how to trace?’ »

Tags: cliche, collion, identifiability, identification, irony, LHC, Power, reconstruction, source detection, subparticle, supersymmetry, SUSY, TRACE, type I error, Type II error, uncertainty principle, unification, youtube
Category: Cross-Cultural, Data Processing, High-Energy, Misc, Quotes, Uncertainty | Comment

Curious Cases of the Null Hypothesis Probability

Jun 2nd, 2009| 03:03 am | Posted by hlee

Although I have traced astronomers’ casual usage of the null hypothesis probability to their habit of reporting outputs from the data analysis packages of their choice, there are still some curious cases of the null hypothesis probability that I couldn’t solve. They are quite mysterious to me. Sometimes too much creativity harms the original intention. Here are some examples. Continue reading ‘Curious Cases of the Null Hypothesis Probability’ »

Tags: cases, chi-sq, curious, degree of freedom, dof, F-test, goodness-of-fit test, Model Selection, null hypothesis probability, p-value, reduced chi-sq
Category: arXiv, Astro, Cross-Cultural, Fitting, Methods, Uncertainty | 3 Comments

[MADS] Law of Total Variance

May 28th, 2009| 11:54 pm | Posted by hlee

Despite my attempts at full-text search, this simple law did not show up in ADS. As discussed in systematic errors, astronomers, like physicists, present their error components as two additive terms: statistical error + systematic error. To explain such a decomposition and to make error analysis statistically rigorous, the law of total variance (LTV) seems indispensable. Continue reading ‘[MADS] Law of Total Variance’ »
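
The law of total variance states that Var(Y) = E[Var(Y|X)] + Var(E[Y|X]). The sketch below checks the identity numerically on a small grouped data set, where the identity holds exactly for the empirical distribution. The group labels and values are invented for illustration. Reading the within-group term as "statistical" and the between-group term as "systematic" is one loose interpretation assumed here, not a claim about how any particular analysis defines its errors.

```python
from statistics import fmean, pvariance

# Hypothetical measurements of one quantity under three setups (X = setup)
groups = {
    "setup_a": [9.8, 10.1, 10.0, 10.3],
    "setup_b": [10.9, 11.2, 11.1],
    "setup_c": [9.1, 9.4, 9.2, 9.5, 9.3],
}

all_values = [y for ys in groups.values() for y in ys]
n = len(all_values)
grand_mean = fmean(all_values)

# Left-hand side: total (population) variance of Y
total_var = pvariance(all_values)

# E[Var(Y|X)]: group variances weighted by group probability
within = sum(len(ys) / n * pvariance(ys) for ys in groups.values())

# Var(E[Y|X]): weighted spread of the group means around the grand mean
between = sum(len(ys) / n * (fmean(ys) - grand_mean) ** 2
              for ys in groups.values())

print(f"total = {total_var:.4f}, within + between = {within + between:.4f}")
```

The two terms add in variance, not in standard deviation, which is worth keeping in mind when statistical and systematic errors are quoted as separate error bars.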

Tags: additive, bias, law of total variance, MADS, mathematical statistics, mean integrated square error, mean square error, MISE, mse, probability theory, variance
Category: Algorithms, Data Processing, Jargon, News, Stat, Uncertainty | Comment

Robust Statistics

May 18th, 2009| 12:18 pm | Posted by hlee

My understanding of “robustness” from my education in statistics and the one I get from communicating with astronomers hardly share common ground. Can anyone help me build a robust bridge over this abyss? Continue reading ‘Robust Statistics’ »

Tags: break point, Huber, nonparametric, robust, Rousseeuw, Tukey
Category: Bayesian, Frequentist, Jargon, MCMC, Methods, Quotes, Stat, Uncertainty | Comment

a century ago

May 7th, 2009| 02:22 pm | Posted by hlee

Almost 100 years ago, A.S. Eddington stated in his book Stellar Movements (1914) that

…in calculating the mean error of a series of observations it is preferable to use the simple mean residual irrespective of sign rather than the mean square residual

Such an eminent astronomer already preferred least absolute deviation over chi-square, if I match the simple mean residual and the mean square residual to the relevant methodologies, in that order. Continue reading ‘a century ago’ »
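
The contrast between the two criteria can be sketched for the simplest problem, estimating a single location value. Minimizing the mean absolute residual leads to the median, while minimizing the mean square residual leads to the mean. The data, including the single outlier, and the brute-force grid search are invented for this illustration and are not from Eddington's book.

```python
from statistics import mean, median

# Hypothetical observations with one gross outlier
data = [1.0, 2.0, 3.0, 4.0, 100.0]


def mean_abs_residual(center, values):
    """Mean residual irrespective of sign (least absolute deviation)."""
    return sum(abs(v - center) for v in values) / len(values)


def mean_sq_residual(center, values):
    """Mean square residual (least squares, as in chi-square fitting)."""
    return sum((v - center) ** 2 for v in values) / len(values)


# Brute-force search over candidate centers 0.0, 0.1, ..., 100.0
grid = [i / 10 for i in range(0, 1001)]
best_abs = min(grid, key=lambda c: mean_abs_residual(c, data))
best_sq = min(grid, key=lambda c: mean_sq_residual(c, data))

print(f"least absolute deviation -> {best_abs} (median = {median(data)})")
print(f"least squares            -> {best_sq} (mean = {mean(data)})")
```

The outlier drags the least-squares estimate far from the bulk of the data while the least-absolute-deviation estimate stays put, which is the usual argument for the absolute criterion when a few observations are suspect.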

Tags: chi-square minimization, Eddington, inference, LAD, Laplace, mse, PyMC, Python, R.A.Fisher, utility function
Category: Astro, Cross-Cultural, Quotes, Stat, Uncertainty | Comment

[Book] The Physicists

Apr 22nd, 2009| 02:02 pm | Posted by hlee

I was reading Lehmann’s memoir on the friends and colleagues who greatly influenced his career. I’m happy to know that meeting Landau, Courant, and Evans led him to become a statistician; otherwise, we, including astronomers, would have had very different textbooks, and statistical thinking would have been different. On the other hand, I was surprised to learn that he chose statistics over physics because of his experience at Cambridge (UK). I had thought that becoming a physicist was preferred over becoming a statistician during the first half of the 20th century. At least I felt that way, probably because general science books on physics and physics-related historic events were so widely available that I came to think physicists are cooler than other types of scientists. Continue reading ‘[Book] The Physicists’ »

Tags: book, Durrenmatt, E. L. Lehmann, Heisenberg, physicists, statistician, uncertainty principle
Category: Cross-Cultural, Misc, Physics, Quotes, Stat, Uncertainty | Comment

[MADS] plug-in estimator

Apr 20th, 2009| 09:34 pm | Posted by hlee

I asked a couple of astronomers if they had heard the term plug-in estimator, and none of them had. Continue reading ‘[MADS] plug-in estimator’ »

Tags: biased, breakdown point, chi-square, confidence interval, coverage, delta chi-square, estimator, LAD, mean, median, plug-in, rmse
Category: Bad AstroStat, Cross-Cultural, Data Processing, Jargon, Uncertainty | 2 Comments