Oct 22nd, 2009| 07:08 pm | Posted by hlee

[arXiv:stat.ME:0910.2585]

Variable Selection and Updating In Model-Based Discriminant Analysis for High Dimensional Data with Food Authenticity Applications

by *Murphy, Dean, and Raftery*

Classifying or clustering spectra (or applying semi-supervised learning to them) is a very challenging problem, from collecting data that are ready for statistical analysis to reducing the dimensionality without sacrificing the complex information in each spectrum. The challenge lies not only in estimating spiky (non-differentiable) curves via statistically well-defined estimating-equation procedures, but also in transforming the data so that they satisfy the regularity conditions assumed in statistics.

Continue reading ‘[ArXiv] classifying spectra’ »

Tags: BIC, Classification, clustering, cross-validation, curse of dimensionality, discriminant analysis, graphical model, mclust, model based, semi-supervised learning, statistical learning, variable selection

Category: Algorithms, arXiv, Cross-Cultural, Data Processing, Jargon, Methods, Spectral, Stat | Comment

Oct 6th, 2009| 08:30 pm | Posted by hlee

Tags: Classification, clustering, factor analysis, Hubble, multivariate analysis, principal component analysis, SING, Spitzer, tuning fork

Category: Algorithms, Astro, Cross-Cultural, Data Processing, Galaxies, Jargon, Methods, Objects, Stars, Stat | Comment

Jul 29th, 2009| 01:02 am | Posted by hlee

Speaking of XAtlas from my previous post, I tried another visualization tool, **Parallel Coordinates**, on these Capella observations and on two stars with multiple observations (AR Lac and IM Peg). As discussed in [MADS] Chernoff face, a full description of the catalog can be found on the XAtlas website. I chose these stars because, among low-mass stars, IM Peg (HD 21648, 8 times) and AR Lac (6 times, although at different phases) are the most frequently observed after Capella (of which I showed 16 observations). I was curious which variation is dominant: within a star (statistical variation) or between stars (Capella, IM Peg, AR Lac). What would they look like in the parameter space of high-resolution grating spectroscopy from Chandra?

Continue reading ‘[MADS] Parallel Coordinates’ »
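
As a rough illustration of the within-versus-between question, the sketch below splits the total sum of squares of a single measured quantity into a within-star part and a between-star part. The function name `variance_decomposition` and all numbers are made up for illustration; they are not XAtlas values, and a real comparison would involve several spectral parameters at once.

```python
def variance_decomposition(groups):
    """Split the total sum of squares into within-group and between-group parts.

    `groups` maps a group name (here, a star) to a list of measurements.
    """
    all_values = [x for values in groups.values() for x in values]
    grand_mean = sum(all_values) / len(all_values)

    within = 0.0
    between = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        # Scatter of repeated observations around their own star's mean.
        within += sum((x - group_mean) ** 2 for x in values)
        # Scatter of the star's mean around the grand mean, weighted by count.
        between += len(values) * (group_mean - grand_mean) ** 2
    return within, between


# Hypothetical measurements of one parameter per observation (not real data).
groups = {
    "Capella": [1.0, 1.1, 0.9, 1.0],
    "IM Peg": [2.0, 2.3, 1.8],
    "AR Lac": [1.5, 1.4, 1.7],
}

within, between = variance_decomposition(groups)
print(f"within = {within:.3f}, between = {between:.3f}")
```

If the between-star part is much larger than the within-star part, the stars separate clearly in that parameter; if not, repeated observations of one star scatter about as much as different stars do.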

Tags: Classification, clustering, display, EDA, eye catcher, GGobi, Inselberg, parallel coordinates, visualization

Category: Algorithms, arXiv, Cross-Cultural, Data Processing, High-Energy, Jargon, Methods, Spectral, X-ray | 3 Comments

Sep 18th, 2008| 07:48 pm | Posted by hlee

Another conclusion I have drawn from reading preprints listed in arxiv/astro-ph is that astronomers tend to confuse **classification and clustering** and to mix up their methodologies. They tend to think that any algorithm from classification or clustering analysis will serve their purpose, since both kinds of algorithm look like a **black box** regardless. By a black box I mean something like a neural network, which is one of the classification algorithms.

Continue reading ‘Classification and Clustering’ »
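
To make the distinction concrete, here is a minimal sketch in plain Python. It is not taken from the post: the nearest-centroid classifier and the basic k-means loop are stand-ins I chose for illustration, and the data, function names, and the naive initialisation are all assumptions. The point is only that classification learns from given labels, while clustering receives no labels and returns arbitrary group indices.

```python
from math import dist


def centroid(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def fit_nearest_centroid(points, labels):
    """Classification: labels are given, so learn one centroid per known class."""
    by_class = {}
    for p, y in zip(points, labels):
        by_class.setdefault(y, []).append(p)
    return {y: centroid(ps) for y, ps in by_class.items()}


def classify(model, p):
    """Assign p to the known class whose centroid is closest."""
    return min(model, key=lambda y: dist(model[y], p))


def kmeans(points, k, n_iter=20):
    """Clustering: no labels are used; indices 0..k-1 carry no class meaning."""
    centers = list(points[:k])  # naive deterministic start (an assumption)
    assignment = []
    for _ in range(n_iter):
        assignment = [
            min(range(k), key=lambda j: dist(centers[j], p)) for p in points
        ]
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = centroid(members)
    return assignment, centers


# Toy two-feature data (hypothetical values, e.g. two colors per object).
points = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (2.0, 2.1), (2.2, 1.9), (1.9, 2.0)]
labels = ["A", "A", "A", "B", "B", "B"]

# Supervised: a new object receives one of the known class names.
model = fit_nearest_centroid(points, labels)
predicted = classify(model, (1.8, 2.2))

# Unsupervised: the same points are grouped without ever seeing `labels`.
assignment, centers = kmeans(points, k=2)
print(predicted, assignment)
```

The classifier can only answer with a class it was taught, whereas the clustering output still has to be interpreted afterwards; neither replaces the other.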

Tags: black box, book, catalog, Classification, clustering, haste, outliers, R, Robert Serfling, semi-supervised learning, survey

Category: Algorithms, arXiv, Astro, Bad AstroStat, Cross-Cultural, Data Processing, Frequentist, Jargon, Methods, Stat | Comment

Jun 19th, 2008| 11:42 pm | Posted by hlee

Two attendees, whom I had met before the AAS, asked whether I could suggest clustering methods relevant to their projects. In the end, we spent quite some time clarifying the term **clustering.** Continue reading ‘my first AAS. IV. clustering’ »

May 19th, 2008| 10:42 am | Posted by hlee

There’s no particular opening remark this week, except that I am very curious about the jackknife tests in [astro-ph:0805.1994]. Including this paper, a few deserve separate discussions from a statistical point of view, which I will post later. Continue reading ‘[ArXiv] 2nd week, May 2008’ »

Tags: bimodality, bootstrap, calibration uncertainty, CF, Classification, CMB, dip, exoplanet, Fisher matrix, flare, GL, jackknife, KS test, marked point, maximum likelihood, MLE, poisson point process, spatial data, XLF

Category: arXiv, Frequentist, Uncertainty, X-ray | Comment

May 11th, 2008| 10:42 pm | Posted by hlee

I think I have to review spatial statistics in astronomy, focusing on tessellation (void structure), point processes (extending the two- (three-) point correlation function), and marked point processes (the spatial distribution of hardness ratios of distant X-ray sources, or of different types of galaxies, where the marks are not only morphological differences but also quantities such as absolute magnitudes and the existence of particular features). When? Someday…

In addition to Bayesian methodologies, studies characterizing the empirical spatial distributions of voids and galaxies appear frequently, as in this week’s astro-ph, and I believe they can be enriched further with ideas from stochastic geometry and spatial statistics. Click for what appeared on arXiv this week. Continue reading ‘[ArXiv] 1st week, May 2008’ »

Tags: Classification, covariance, FARIMA, Fisher information, GL, GRB, Levy, light curve, limb darkening, ML, Pareto distribution, quasars, solar flare, standard candle, tessellation, time series, VO, void

Category: arXiv, MCMC, Uncertainty | 1 Comment

May 5th, 2008| 03:08 am | Posted by hlee

Ever since I first learned about Hubble’s tuning fork, I have wanted to classify galaxies (semi-supervised learning seems more suitable) based on their features (colors and spectra), instead of relying on labor-intensive classification by eye. Ironically, at that time I did not know that machine learning, a field of computer science, and statistics carry out such studies. After switching to statistics in the hope of understanding the statistical packages implemented in IRAF and IDL, and of better learning the contents of Numerical Recipes and Bevington’s book, I found that ignorance was not the enemy; the accessibility of data was. Continue reading ‘[ArXiv] 5th week, Apr. 2008’ »

Tags: ANN, automation, Classification, correlation function, denoising, FFT, gravitational wave, lensing, LISA, machine learning, missing data, mock data, morphology, PCA, power spectrum, robust, SDSS, spectrum, sunspots, wavelet, zoo

Category: arXiv, Galaxies, Imaging, MCMC, Physics, Spectral | Comment

Apr 11th, 2008| 02:21 am | Posted by hlee

Markov chain Monte Carlo has become the most frequent and stable statistical application in astronomy. It would be useful to collect tutorials from both professions. Continue reading ‘[ArXiv] 2nd week, Apr. 2008’ »

Tags: Classification, GRB, Hubble constant, K-S test, kurtosis, mask, maximum likelihood, SDSS, skewness, Solar Oscillation, Vicent Martinez

Category: arXiv, Bayesian, MCMC, Methods, Stat | 3 Comments

Mar 14th, 2008| 03:44 pm | Posted by hlee

Warning! The list is long this week, but diverse. Some papers are of obvious interest to CHASC. Continue reading ‘[ArXiv] 2nd week, Mar. 2008’ »

Tags: ANN, autocorrelation, Classification, cross-correlation, Estimation, Fisher information, lensing, LF, Model Selection, Pareto, signal processing, tessellation

Category: arXiv, MCMC | Comment

Feb 24th, 2008| 09:56 pm | Posted by hlee

It seems that I omit papers deserving attention from time to time. If you find one, please leave a message. Even better, leave a summary that can become a separate post. Continue reading ‘[ArXiv] 3rd week, Feb. 2008’ »

Jan 11th, 2008| 03:44 pm | Posted by hlee

It is notable that an astronomy paper has **AIC, BIC**, and **Bayesian evidence** in its title. The topic of the paper is, unsurprisingly, cosmology, like other astronomy papers that discuss these (statistical) information criteria. (I have found only a couple of papers on model selection applied to astronomical data analysis that do not involve the CMB. Note that I exclude the Bayes factor for model selection purposes.)

To find the paper or other interesting ones, click Continue reading ‘[ArXiv] 2nd week, Jan. 2007’ »
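
For readers who have not met these criteria, a small sketch of the standard definitions may help: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters, n the number of data points, and L the maximized likelihood. The two fits below are hypothetical numbers, not results from the paper, and the function names are my own.

```python
from math import log


def aic(log_likelihood, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * log(n) - 2 * log_likelihood


# Hypothetical fits: (maximized log-likelihood, number of free parameters).
n_obs = 100
fits = {"simple": (-120.0, 2), "extended": (-118.5, 4)}

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
print(scores, best_by_aic, best_by_bic)
```

BIC penalizes each extra parameter by ln n instead of 2, so for n larger than about 7 it favors simpler models more strongly than AIC does.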

Tags: AIC, Bayesian evidence, BIC, catalog, Classification, CMB, confidence interval, consistency, correlation, GRB, information criterion, Model Selection, SDSS, test, WMAP

Category: arXiv | Comment

Dec 21st, 2007| 02:40 pm | Posted by hlee

The paper about the Banff challenge [0712.2708] and the statistics tutorial for cosmologists [0712.3028] are my personal recommendations from this week’s [arXiv] list. In particular, I’d like to quote from Licia Verde’s [astro-ph:0712.3028],

In general, Cosmologists are Bayesians and High Energy Physicists are Frequentists.

I thought it was the opposite. By the way, if you crave more papers, click Continue reading ‘[ArXiv] 3rd week, Dec. 2007’ »

Oct 6th, 2007| 12:45 pm | Posted by hlee

This week, instead of only filtering AstroStatistics-related papers from arxiv, I chose additional arxiv/astro-ph papers related to CHASC folks’ astrophysical projects. Some of the papers you see this week do not have sophisticated statistical analysis but contain data from specific satellites and possibly relevant information for CHASC projects. Because of CHASC’s long history (we are celebrating its 10th birthday this year) and my being a newbie to CHASC, I may not pick up all papers related to the projects of current, former, and future CHASC members and dedicated slog readers. To help me create a satisfying post every week, your input is welcome for improving my adaptive filter. For this week’s list, click the following.

Continue reading ‘[ArXiv] 1st week, Oct. 2007’ »

Tags: Binning, CHASC, Classification, EGRET, GLAST, Globular Clusters, IMF, LMC, NGC 346, Supernova Spectrum, Upper Limits

Category: arXiv | Comment

Oct 5th, 2007| 04:47 pm | Posted by hlee

Not knowing much about Java and Java applets in software development and web publishing, I cannot comment on which is more efficient. Nevertheless, I thought that PHP would do a similar job in a simpler fashion, and the following may provide some ideas and solutions for publicizing statistical methods based on Bayesian inference through websites.

Continue reading ‘Implement Bayesian inference using PHP’ »

Tags: Bayesian Inference, Classification, Conditional Probability, Estimation, IBM, JAVA, Open Source, PHP

Category: Algorithms, Bayesian, Cross-Cultural, Data Processing, Languages | Comment