The AstroStat Slog

Archive for the ‘Data Processing’ Category.

[MADS] Adaptive filter

Jun 4th, 2009| 04:36 pm | Posted by hlee

Please do not confuse adaptive filters (hereafter, AF) with adaptive optics (hereafter, AO). I have no expertise in either field, but I have enough experience to tell you the difference. Simply put, AF is comparable to software, whereas AO is comparable to hardware: it is built into telescopes to collect sharp data and to minimize time-varying atmospheric blurring. When you search for adaptive filter in ADS, you are more likely to come across adaptive optics and notch filters. Continue reading ‘[MADS] Adaptive filter’ »

Tags: adaptive filter, adaptive optics, MADS
Category: Algorithms, arXiv, Data Processing | Comment

[MADS] Law of Total Variance

May 28th, 2009| 11:54 pm | Posted by hlee

This simple law did not show up in ADS, despite my attempt at a full-text search. As discussed in systematic errors, astronomers, like physicists, report their errors as two additive terms: statistical error + systematic error. To explain this decomposition and to make error analysis statistically rigorous, the law of total variance (LTV) seems indispensable. Continue reading ‘[MADS] Law of Total Variance’ »
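
For reference, the law can be written in its standard textbook form. The symbols are assumptions for illustration and are not taken from the post: Y is the measured quantity and X is whatever is being conditioned on (for example, an uncertain calibration).

```latex
\operatorname{Var}(Y) = \mathbb{E}\left[\operatorname{Var}(Y \mid X)\right] + \operatorname{Var}\left(\mathbb{E}[Y \mid X]\right)
```

Loosely, the first term is the average scatter at a fixed X and the second is the scatter that X itself introduces. Reading these two terms as "statistical" and "systematic" error is one possible interpretation, not part of the law.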

Tags: additive, bias, law of total variance, MADS, mathematical statistics, mean integrated square error, mean square error, MISE, mse, probability theory, variance
Category: Algorithms, Data Processing, Jargon, News, Stat, Uncertainty | Comment

space weather

May 21st, 2009| 05:55 pm | Posted by hlee

Among the billions of objects in our Galaxy beyond the Earth, our Sun draws the most attention from astronomers. These astronomers go by the name of solar physicists, and they enjoy the most abundant data, including 400 years of sunspot counts. Their joy comes not only from the fascinating, active, and unpredictable character of the Sun but also from its influence on our daily lives. Because of the latter, studying the conditions on the Sun is sometimes called space weather forecasting. Continue reading ‘space weather’ »

Tags: classifier, forecast, logistic regression, machine learning, predictor, response, space weather, Sun, sunspot, SVM, test data, training data, weather
Category: arXiv, Astro, Cross-Cultural, Data Processing, Imaging, Jargon, Stars, Stat | Comment

[ArXiv] Sparse Poisson Intensity Reconstruction Algorithms

May 7th, 2009| 11:14 am | Posted by hlee

One of yesterday’s [ArXiv] papers has a title that might draw a lot of attention from astronomers. Furthermore, it’s a short paper.
[arxiv:math.CO:0905.0483] by Harmany, Marcia, and Willett.
Continue reading ‘[ArXiv] Sparse Poisson Intensity Reconstruction Algorithms’ »

Tags: compressed sensing, decomposition, EM algorithm, intensity, MPLE, multiscale, penalty, Poisson, Poisson Intensity, Sparcity, wavelet
Category: Algorithms, arXiv, Astro, Cross-Cultural, Data Processing, High-Energy, Imaging, Jargon | Comment

[MADS] plug-in estimator

Apr 20th, 2009| 09:34 pm | Posted by hlee

I asked a couple of astronomers whether they had heard the term plug-in estimator, and none of them gave me a positive answer. Continue reading ‘[MADS] plug-in estimator’ »

Tags: biased, breakdown point, chi-square, confidence interval, coverage, delta chi-square, estimator, LAD, mean, median, plug-in, rmse
Category: Bad AstroStat, Cross-Cultural, Data Processing, Jargon, Uncertainty | 2 Comments

[MADS] Chernoff face

Apr 2nd, 2009| 12:00 pm | Posted by hlee

I cannot remember when I first met the Chernoff face, but it hooked me instantly. I always hoped to come across multivariate data from astronomy to which this charming EDA method could be applied. Then that eagerness somehow faded without my realizing it. Tragically, this was mainly due to my absent-mindedness. Continue reading ‘[MADS] Chernoff face’ »

Tags: calibration, Capella, Chandra, Chernoff face, EDA, line ratios, MADS, XAtlas
Category: Algorithms, arXiv, Astro, Cross-Cultural, Data Processing, Jargon, Methods, Misc, News, Quotes, Spectral, Stars, X-ray | 2 Comments

Use and Misuse of Chi-square

Mar 31st, 2009| 03:43 pm | Posted by hlee

Before using any adaptation of the chi-square statistic, please spend a minute or two pondering whether your strategy with chi-square falls into one of these categories.

1. Lack of independence among the single events or measures
2. Small theoretical frequencies
3. Neglect of frequencies of non-occurrence
4. Failure to equalize \sum O_i (the sum of the observed frequencies) and \sum M_i (the sum of the theoretical frequencies)
5. Indeterminate theoretical frequencies
6. Incorrect or questionable categorizing
7. Use of non-frequency data
8. Incorrect determination of the number of degrees of freedom
9. Incorrect computations (including a failure to weight by N when proportions instead of frequencies are used in the calculations)

From “Chapter 10: On the Use and Misuse of Chi-square” by K. L. Delucchi in A Handbook for Data Analysis in the Behavioral Sciences (1993). Delucchi attributed these nine principal sources of error to Lewis and Burke (1949), “The Use and Misuse of the Chi-square,” published in Psychological Bulletin. Continue reading ‘Use and Misuse of Chi-square’ »
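
A few of the items in the list can be checked mechanically before the statistic is trusted. The sketch below is only an illustration, not code from Delucchi or from the post: the function name `pearson_chi_square`, the threshold `min_expected=5.0` (a common rule of thumb), and the tolerance are all assumptions. It computes Pearson's statistic and flags items 2, 4, and 7.

```python
def pearson_chi_square(observed, expected, min_expected=5.0, tol=1e-9):
    """Return (statistic, warnings) for Pearson's chi-square on frequency counts.

    Hypothetical helper for illustration; min_expected=5 is a rule of thumb,
    not a value taken from the post.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(m <= 0 for m in expected):
        raise ValueError("expected frequencies must be positive")

    warnings = []
    # Item 7: the statistic is defined for frequencies (counts), not proportions.
    if any(o < 0 or o != int(o) for o in observed):
        warnings.append("observed values are not non-negative integer counts")
    # Item 4: the observed and theoretical totals should agree.
    if abs(sum(observed) - sum(expected)) > tol * max(1.0, abs(sum(observed))):
        warnings.append("sum of observed differs from sum of expected")
    # Item 2: small theoretical frequencies make the approximation poor.
    if any(m < min_expected for m in expected):
        warnings.append("some expected frequencies are below %g" % min_expected)

    statistic = sum((o - m) ** 2 / m for o, m in zip(observed, expected))
    return statistic, warnings


# Usage: three bins with equal theoretical frequencies.
stat, notes = pearson_chi_square([18, 22, 20], [20.0, 20.0, 20.0])
print(stat, notes)
```

The remaining items (independence, categorization, degrees of freedom) depend on how the data were collected and modeled, so they cannot be caught by a check like this.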

Tags: chi-square, chi-square statistic, degrees-of-freedom, misuse, use
Category: arXiv, Bad AstroStat, Cross-Cultural, Data Processing, Stat | 1 Comment

[Book] Elements of Information Theory

Mar 11th, 2009| 01:04 pm | Posted by hlee

by T. Cover and J. Thomas; website: http://www.elementsofinformationtheory.com/

Once, perhaps more than once, I mentioned this book in a post together with the most celebrated paper by Shannon (see the posting). I have also recommended the book in answer to offline inquiries. It has always been on my list of favorite books for teaching. So, I’m not shy about recommending this book to astronomers with modern, objective perspectives and practicality. Before heaping on more praise, I must say that those admiring words do not imply that I understand every line and problem in the book. Like many fields, information theory has grown fast since the monumental debut paper by Shannon (1948), much as astronomers’ observation techniques have. Without the contents of this book, most of which came after Shannon (1948), the internet, wireless communication, compression, etc. could not have been conceived. Since the notion of “entropy,” the core of information theory, is familiar to astronomers (physicists), the book should be better received by them, and easier for them to read, than by statisticians. Continue reading ‘[Book] Elements of Information Theory’ »
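
For readers who know entropy from physics, a minimal sketch of the discrete Shannon entropy may help tie the two notions together. The function name `shannon_entropy` and its `base` parameter are illustrative assumptions, not notation from the book or the post.

```python
import math


def shannon_entropy(probs, base=2.0):
    """Entropy H = -sum_i p_i * log(p_i) of a discrete distribution.

    Hypothetical helper for illustration; base=2 gives the result in bits.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, rel_tol=0.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Terms with p = 0 are skipped, following the convention 0 * log 0 = 0.
    return -sum(p * math.log(p, base) for p in probs if p > 0)


# Usage: a fair coin carries one bit; a certain outcome carries none.
fair_coin = shannon_entropy([0.5, 0.5])
certain = shannon_entropy([1.0, 0.0])
print(fair_coin, certain)
```

Passing `base=math.e` gives the entropy in nats, which is closer to the form physicists usually write.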

Tags: bandwidth, book, Cover, data mining, education, Entropy, Information theory, Kolmogorov complexity, Shannon, Thomas
Category: Algorithms, arXiv, Cross-Cultural, Data Processing, Jargon, Quotes | Comment

systematic errors

Mar 6th, 2009| 03:42 pm | Posted by hlee

Ah ha~ I once asked, “what is systematic error?” (see [Q] systematic error). Thanks to L. Lyons’ work discussed in [ArXiv] Particle Physics, I found this paper, titled Systematic Errors, which describes the concept of systematic errors and the related statistical inference in the field of particle physics. Happily, it shares many similarities with high-energy astrophysics. Continue reading ‘systematic errors’ »

Tags: coverage, Heinrich, likelihood, Lyons, nuisance parameter, objective priors, p-value, particle physics, statistical error, subjective priors, systematic error
Category: Algorithms, arXiv, Bayesian, Cross-Cultural, Data Processing, Frequentist, Jargon, Misc, News, Physics, Stat, Uncertainty | Comment

An excerpt from …

Feb 26th, 2009| 04:07 pm | Posted by hlee

I’ve been complaining by asking how one can do machine learning on solar images without a training set (see my comment at the big picture). On the other hand, I’m also aware of a challenge in astronomy: data (images) cannot be transformed freely and fed into standard machine learning algorithms. Tailoring data pipelining, cleaning, and processing to currently existing vision algorithms may not be achievable. The hope of automating the detection/identification of interesting features (e.g. flares and loops) and the forecasting of events on the surface of the Sun is only a dream. Even though the image data stream is at the level of a tsunami, we might have to depend on human eyes to comb out interesting features on the Sun until a new paradigm arrives: automated feature identification algorithms based on a single image, i.e. without a training set. The good news is that human eyes have done a superb job! Continue reading ‘An excerpt from …’ »

Tags: brains, computer vision, human eyes, Kendall, machine learning, shape theory, Sun, tsunami
Category: arXiv, Astro, Cross-Cultural, Data Processing, Imaging, Quotes | Comment

[ArXiv] Particle Physics

Feb 20th, 2009| 07:48 pm | Posted by hlee

[stat.AP:0811.1663]
Open Statistical Issues in Particle Physics by Louis Lyons

My recollection from meeting Prof. L. Lyons is that he is very kind and a good listener. I was delighted to see his introductory article about particle physics and its statistical challenges in an [arxiv:stat] email subscription. Continue reading ‘[ArXiv] Particle Physics’ »

Tags: chi-square, chi-square minimization, coverage, hypothesis testing, L.Lyons, LHC, LRT, particle physics, posterior distribution
Category: arXiv, Bayesian, Cross-Cultural, Data Processing, Frequentist, High-Energy, Methods, Physics, Stat | Comment

accessing data, easier than before but…

Jan 20th, 2009| 01:59 pm | Posted by hlee

Someone emailed me asking for the globular cluster data sets I used in a proceedings paper, which was about how to determine multi-modality (multiple populations) using well-known and new information criteria without binning the luminosity functions. I spent quite some time trying to understand the data sets with suspicious numbers of globular cluster populations. On the other hand, obtaining the globular cluster data sets was easy thanks to data archives such as VizieR. I acquire most data sets that appear in charts/tables from VizieR. To understand the science behind those data sets, I check ADS. Well, actually it happens the other way around: I check the scientific background first to assess whether there is room for statistics, then search for available data sets. Continue reading ‘accessing data, easier than before but…’ »

Tags: archive, ascii, catalog, CDA, data analysis, data mining, database, Gator, globular cluster, inference, massive data, multimodality, multiple populations, NED, SDSS, statistical inference, statistician, streaming data, table, tabulated, visieR
Category: Algorithms, Astro, Cross-Cultural, Data Processing, Jargon, Meta, Nuggets, Objects | 3 Comments

Likelihood Ratio Technique

Jan 15th, 2009| 06:01 pm | Posted by hlee

I wonder what Fisher, Neyman, and Pearson would say if they saw “Technique” after “Likelihood Ratio” instead of “Test.” When a presenter said “Likelihood Ratio Technique” for source identification, I couldn’t resist checking it out, so as not to offend the founding fathers of the likelihood principle in statistics; to my ears, “Technique” sounded derogatory when attached to “Likelihood.” Above all, I thank the speaker, who kindly gave me the reference for this likelihood ratio technique. Continue reading ‘Likelihood Ratio Technique’ »

Tags: Fisher, likelihood principle, likelihood ratio technique, likelihood ratio test, Neyman, Pearson
Category: Algorithms, arXiv, Astro, Bayesian, Cross-Cultural, Data Processing, Fitting, Frequentist, Jargon, Methods, Objects, Stat, Uncertainty | Comment

It bothers me.

Nov 17th, 2008| 01:39 pm | Posted by hlee

The full description of “bayes” under sherpa/ciao^[1] is given at http://cxc.harvard.edu/ciao3.4/ahelp/bayes.html. Some sentences kept bothering me, and here is my account of the reasons, given outside of the quotes. Continue reading ‘It bothers me.’ »

[1] Note that the current sherpa is a beta under ciao 4.0, not ciao 3.4, and a description of “bayes” for the most recent sherpa is not available yet, which means this post will need updating once a new release is available.

Tags: bayes, ciao, ML, Sherpa
Category: Algorithms, Astro, Cross-Cultural, Data Processing, Fitting, High-Energy, Jargon, Languages, Methods, Spectral, Uncertainty, X-ray | 4 Comments

after “Thanks to Henrietta Leavitt”

Nov 6th, 2008| 11:22 pm | Posted by hlee

Personally, I had highly anticipated this symposium at CfA because I was fascinated by the contributions that the female computers (or astronomers) made here about a century ago, even though at that time women were considered not scientists but mere assistants for tedious jobs. Continue reading ‘after “Thanks to Henrietta Leavitt”’ »

Category: Astro, Bad AstroStat, Data Processing, Methods, Stars, Stat | Comment