Oct 15th, 2009| 06:46 pm | Posted by hlee

Astronomers rely on scatter plots to illustrate correlations and trends among many pairs of variables more than any other scientists^{[1]}. Papers often contain pages of scatter plots with regression lines, in which the slope of the regression line and the error bars serve as indicators of the degree of correlation. Sometimes, seeing too many such scatter plots makes me think that the resources spent drawing nice scatter plots, and the paper on which they are printed, are wasted. Why not just compute the correlation coefficients and their errors, and publish the processed data used to compute the correlations rather than the full data, so that others can verify the results? A couple of scatter plots are fine, but when I see dozens of them, I lose my focus. This is another cultural difference. Continue reading ‘Scatter plots and ANCOVA’ »
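
As a rough illustration of the "correlation coefficient and its error" idea above, here is a minimal sketch in plain Python. It computes Pearson's r and an approximate 95% interval using the Fisher z-transformation, which assumes the pairs are roughly bivariate normal. The function name `pearson_with_error` and the sample numbers are made up for illustration and do not come from the post.

```python
import math

def pearson_with_error(x, y):
    """Return (r, lo, hi): Pearson's r and an approximate 95% interval."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length samples with at least 4 points")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # Fisher transformation: z = atanh(r) is approximately normal with
    # standard error 1/sqrt(n - 3) (assumption: bivariate normal data).
    # Clip r slightly inside (-1, 1) so atanh stays finite.
    r_clipped = max(min(r, 1 - 1e-12), -1 + 1e-12)
    z = math.atanh(r_clipped)
    se = 1.0 / math.sqrt(n - 3)
    return r, math.tanh(z - 1.96 * se), math.tanh(z + 1.96 * se)

# Hypothetical measurements, for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2, 6.8, 8.1]
r, lo, hi = pearson_with_error(x, y)
print(f"r = {r:.3f}, approx. 95% interval = [{lo:.3f}, {hi:.3f}]")
```

A table of such triples for each pair of variables would take far less space than dozens of plots, although it cannot reveal nonlinearity or outliers the way a plot can.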

Tags: ANCOVA, ANOVA, approximation, correlation, Gaussianity, graphics, MADS, modeling, nonparametric, parallel coordinates, PCA, quality, quantity, regression, scatter plots
Category: arXiv, Cross-Cultural, Fitting, Jargon, Methods, Stat, Uncertainty | Comment

Sep 8th, 2009| 10:17 am | Posted by hlee

I happened to observe a surge of principal component analysis (PCA) and independent component analysis (ICA) applications in astronomy. PCA and ICA are used for separating mixed components under certain assumptions. For PCA, the decomposition rests on the assumption that the original sources are orthogonal (uncorrelated) and that the mixed observations are approximated by a multivariate normal distribution. For ICA, the assumption is that the sources are independent and non-Gaussian (at most one source component may be Gaussian, though). Such assumptions make it possible to define dissimilarity measures, and the algorithms work toward maximizing them. Continue reading ‘[ArXiv] component separation methods’ »
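
To make the PCA half of this concrete, the following is a minimal NumPy sketch, not taken from any of the papers discussed. It mixes two independent, non-Gaussian toy sources with a hypothetical mixing matrix `A`, then shows that projecting onto the eigenvectors of the sample covariance yields uncorrelated components. All names and numbers are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent, non-Gaussian (uniform) toy sources; hypothetical data.
sources = rng.uniform(-1.0, 1.0, size=(2000, 2))

# Hypothetical mixing matrix: each observation is a linear mix of sources.
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])
observed = sources @ A.T

# PCA: center the data, then eigendecompose the sample covariance matrix.
centered = observed - observed.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)      # eigh: for symmetric matrices

# Sort components by decreasing variance.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Project onto the principal axes; the resulting scores are uncorrelated.
scores = centered @ eigvecs
score_cov = np.cov(scores, rowvar=False)

print("covariance of mixed observations:\n", cov)
print("covariance of PCA scores:\n", score_cov)
```

Note that the scores are only uncorrelated, not necessarily independent, so PCA generally does not recover the original uniform sources here. Closing that gap is what ICA's independence and non-Gaussianity assumptions are for; an ICA implementation is not shown.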

Mar 9th, 2009| 05:18 pm | Posted by hlee

The Mahalanobis distance bears the name of its inventor, Prasanta Chandra Mahalanobis. Unlike the Euclidean distance, a household name, this distance is rarely called by its name; instead, many pseudonyms and variations exist, adapted to a broad range of scientific disciplines and applications. I therefore believe that the Mahalanobis distance, under different names, is frequently applied in exploring and analyzing astronomical data. Continue reading ‘[MADS] Mahalanobis distance’ »
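
For readers who have not met it under its own name, the standard definition is d(x) = sqrt((x - μ)ᵀ Σ⁻¹ (x - μ)), where μ is the mean and Σ the covariance matrix. The sketch below is a minimal NumPy version; the function name `mahalanobis` and the example numbers are illustrative, not from the post.

```python
import numpy as np

def mahalanobis(x, mean, cov):
    """Mahalanobis distance of point x from a distribution (mean, cov)."""
    diff = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    # Solve cov @ w = diff instead of forming the inverse explicitly.
    w = np.linalg.solve(np.asarray(cov, dtype=float), diff)
    return float(np.sqrt(diff @ w))

# With an identity covariance it reduces to the Euclidean distance.
print(mahalanobis([3.0, 4.0], [0.0, 0.0], np.eye(2)))

# With unequal variances, offsets along the high-variance axis count less.
cov = np.array([[4.0, 0.0],
                [0.0, 1.0]])
print(mahalanobis([2.0, 0.0], [0.0, 0.0], cov))
print(mahalanobis([0.0, 2.0], [0.0, 0.0], cov))
```

The second pair of calls shows the point of the measure: the same Euclidean offset of 2 is a smaller deviation along the axis with the larger variance.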

Jun 16th, 2008| 10:47 am | Posted by hlee

As Prof. Speed said, PCA is prevalent in astronomy, particularly this week. Furthermore, a paper explicitly discusses R, a popular statistics package. Continue reading ‘[ArXiv] 2nd week, June 2008’ »

Tags: Bayesian evidence, Binning, broken power law, cosmology, K-S test, LF, lhs, likelihood, PCA, power spectrum, R, SFH, Sun, Tully-Fisher
Category: arXiv, MCMC | Comment

May 5th, 2008| 03:08 am | Posted by hlee

Ever since I first learned about Hubble’s tuning fork^{[1]}, I have wanted to classify galaxies (semi-supervised learning seems more suitable) based on their features (colors and spectra), instead of relying on labor-intensive classification by eye. Ironically, at that time I didn’t know that there were fields, machine learning in computer science and statistics, that do such studies. Upon switching to statistics, with the hope of understanding the statistical packages implemented in IRAF and IDL and of better learning the contents of Numerical Recipes and Bevington’s book, I found that ignorance was not the enemy, but the accessibility of data was. Continue reading ‘[ArXiv] 5th week, Apr. 2008’ »

Tags: ANN, automation, Classification, correlation function, denoising, FFT, gravitational wave, lensing, LISA, machine learning, missing data, mock data, morphology, PCA, power spectrum, robust, SDSS, spectrum, sunspots, wavelet, zoo
Category: arXiv, Galaxies, Imaging, MCMC, Physics, Spectral | Comment

Apr 18th, 2008| 01:38 pm | Posted by hlee

Prof. Speed writes columns for IMS Bulletin and the April 2008 issue has Terence’s Stuff: PCA (p.9). Here are quotes with minor paraphrasing:

Although a quintessentially statistical notion, my impression is that PCA has always been more popular with non-statisticians. Of course we love to prove its optimality properties in our courses, and at one time the distribution theory of sample covariance matrices was heavily studied.

…but who could not feel suspicious when observing the explosive growth in the use of PCA in the biological and physical sciences and engineering, not to mention economics?…it became the analysis tool of choice of the hordes of former physicists, chemists and mathematicians who unwittingly found themselves having to be statisticians in the computer age.

My initial theory for its popularity was simply that they were in love with the prefix eigen-, and felt that anything involving it acquired the cachet of quantum mechanics, where, you will recall, everything important has that prefix.

He gave the following eigen-’s: eigengenes, eigenarrays, eigenexpression, eigenproteins, eigenprofiles, eigenpathways, eigenSNPs, eigenimages, eigenfaces, eigenpatterns, eigenresult, and even eigenGoogle.

How many miracles must one witness before becoming a convert?…Well, I’ve seen my three miracles of exploratory data analysis, examples where I found I had a problem, and could do something about it using PCA, so now I’m a believer.

Needless to say, astronomers explore data with PCA and use eigenvalues and eigenvectors to transform raw data into more interpretable forms.

Mar 30th, 2008| 07:51 pm | Posted by hlee

The average number of astro-ph preprints has decreased, and so have my hours of reading abstracts… cool!!! By the way, there is a paper about the solar cycle, PCA, ICA, and the Lomb-Scargle periodogram. Continue reading ‘[ArXiv]4th week, Mar. 2008’ »

Nov 2nd, 2007| 05:59 pm | Posted by hlee

To be exact, the title of this posting should contain *5th week, Oct*, which seems to be the week of EGRET. In addition to astro-ph papers, I include a few statistics papers that, although not directly related to astrostatistics, may be useful for astronomical data analysis. Continue reading ‘[ArXiv] 1st week, Nov. 2007’ »

Tags: bootstrap, EGRET, Fisher information, Laplace transform, maximum likelihood, PCA, PDF, Poisson, Ratio, Uncertainty, variance
Category: arXiv | 1 Comment

Jul 11th, 2007| 11:50 am | Posted by vlk

Hyunsook and I have preliminary findings (work done with the help of the X-Atlas group) on the efficacy of using spectral proxies to classify low-mass coronal sources, put up as a poster at the XGratings workshop. The workshop has a “poster haiku” session, where one may summarize a poster in a single transparency and speak on it for a couple of minutes. I cannot count syllables, so I wrote a limerick instead: Continue reading ‘Summarizing Coronal Spectra’ »

Tags: 2007, dendrograms, limerick, PCA, workshop, X-Atlas, XAtlas, XGratings
Category: Astro, News, Quotes, Spectral, Stars, X-ray | Comment

Jul 2nd, 2007| 06:07 pm | Posted by hlee

From arXiv/astro-ph:0706.4484

Spectroscopic Surveys: Present, by Yip, C., gives an overview of recent spectroscopic sky surveys and spectral analysis techniques in the context of Virtual Observatories (VO). Besides the number of spectroscopic redshift measurements growing like Moore’s law, the surveys tend to go deeper and aim for completeness. Elliptical galaxy formation has mainly been studied because ellipticals are more abundant than spirals, and the galactic bimodality in color-color or color-magnitude diagrams is described as the result of gas-rich (blue) mergers forming the red sequence. Principal component analysis has incorporated ratios of emission line strengths for classifying Type-II AGN and star-forming galaxies. Lyα identifies high-z quasars, and other spectral patterns over z reveal the history of the early universe and the characteristics of quasars. The recent discovery of 10 satellites of the Milky Way is also mentioned.

Continue reading ‘[ArXiv] Spectroscopic Survey, June 29, 2007’ »

Tags: bimodality, chi-square minimization, Classification, CMD, Estimation, machine learning, massive data, model based, PCA, spectral analysis, spectroscopic, survey, VO
Category: arXiv, Astro, Bayesian, Data Processing, Fitting, Frequentist, Methods, Spectral | Comment