Jan 20th, 2009| 01:59 pm | Posted by hlee

Someone emailed me asking for the globular cluster data sets I used in a proceedings paper, which was about how to determine multimodality (multiple populations) using well-known and new information criteria without binning the luminosity functions. I spent quite some time trying to understand the data sets with suspicious numbers of globular cluster populations. On the other hand, obtaining globular cluster data sets was easy because of available data archives such as VizieR. Most of the data sets shown in charts and tables I acquired from VizieR. To understand the science behind those data sets, I check ADS. Well, actually it happens the other way around: I check the scientific background first to assess whether there is room for statistics, then search for available data sets. Continue reading ‘accessing data, easier than before but…’ »
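As a rough illustration of choosing the number of populations from unbinned data with information criteria, here is a minimal sketch. It is not the method of the proceedings paper: it assumes one-dimensional Gaussian mixture components fitted by plain EM, and the function names (`fit_mixture`, `information_criteria`) and the simulated sample are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(xs, weights, means, sigmas):
    """Log-likelihood of the unbinned data under a Gaussian mixture."""
    total = 0.0
    for x in xs:
        density = sum(w * normal_pdf(x, m, s)
                      for w, m, s in zip(weights, means, sigmas))
        total += math.log(max(density, 1e-300))  # guard against log(0)
    return total


def fit_mixture(data, k, n_iter=200):
    """Fit a k-component 1-D Gaussian mixture by EM; return the log-likelihood."""
    xs = sorted(data)
    n = len(xs)
    # Assumed initialisation: k equal-sized chunks of the sorted data.
    chunks = [xs[j * n // k:(j + 1) * n // k] for j in range(k)]
    weights = [1.0 / k] * k
    means = [sum(c) / len(c) for c in chunks]
    sigmas = [max(math.sqrt(sum((x - m) ** 2 for x in c) / len(c)), 1e-3)
              for c, m in zip(chunks, means)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, s)
                 for w, m, s in zip(weights, means, sigmas)]
            norm = sum(p) or 1e-300
            resp.append([pj / norm for pj in p])
        # M-step: update weights, means and widths.
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            sigmas[j] = max(math.sqrt(var), 1e-3)  # floor avoids collapse
    return mixture_loglik(xs, weights, means, sigmas)


def information_criteria(data, max_k=3):
    """Return {k: {'loglik', 'aic', 'bic'}} for k = 1..max_k components."""
    n = len(data)
    out = {}
    for k in range(1, max_k + 1):
        loglik = fit_mixture(data, k)
        n_params = 3 * k - 1  # k means, k widths, k - 1 free weights
        out[k] = {
            "loglik": loglik,
            "aic": 2 * n_params - 2 * loglik,
            "bic": n_params * math.log(n) - 2 * loglik,
        }
    return out


if __name__ == "__main__":
    random.seed(1)
    # Simulated stand-in for a bimodal sample, not real cluster data.
    sample = ([random.gauss(0.0, 1.0) for _ in range(150)]
              + [random.gauss(6.0, 1.0) for _ in range(150)])
    for k, crit in information_criteria(sample).items():
        print(k, round(crit["aic"], 1), round(crit["bic"], 1))
```

The number of components with the smallest AIC or BIC would be the preferred one under that criterion; no histogram bins are involved at any step.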

Tags: archive, ascii, catalog, CDA, data analysis, data mining, database, Gator, globular cluster, inference, massive data, multimodality, multiple populations, NED, SDSS, statistical inference, statistician, streaming data, table, tabulated, visieR

Category: Algorithms, Astro, Cross-Cultural, Data Processing, Jargon, Meta, Nuggets, Objects | 3 Comments

May 5th, 2008| 03:08 am | Posted by hlee

Ever since I first learned about Hubble’s tuning fork^{[1]}, I have wanted to classify galaxies (semi-supervised learning seems more suitable) based on their features (colors and spectra), instead of relying on labor-intensive classification by human eye. Ironically, at that time I didn’t know that there was a field of computer science called machine learning, or that statistics does such studies. After switching to statistics in the hope of understanding the statistical packages implemented in IRAF and IDL, and of better learning the contents of Numerical Recipes and Bevington’s book, I found that ignorance was not the enemy; the accessibility of data was. Continue reading ‘[ArXiv] 5th week, Apr. 2008’ »

Tags: ANN, automation, Classification, correlation function, denoising, FFT, gravitational wave, lensing, LISA, machine learning, missing data, mock data, morphology, PCA, power spectrum, robust, SDSS, spectrum, sunspots, wavelet, zoo

Category: arXiv, Galaxies, Imaging, MCMC, Physics, Spectral | Comment

Apr 27th, 2008| 11:29 am | Posted by hlee

The last paper in the list discusses MCMC for time series analysis, applied to sunspot data. There are six additional papers about statistics and data analysis from the week. Continue reading ‘[ArXiv] 4th week, Apr. 2008’ »

Tags: clusters, CMB, GALEX, gravitaional waves, lensing, LF, LMC, machine learning, maximum likelihood, priors, probability, SDSS, stellar populations, sunspot, time series

Category: arXiv, MCMC | Comment

Apr 20th, 2008| 09:05 pm | Posted by hlee

The dichotomy of outliers: detecting outliers to be discarded or to be investigated; statistics that are robust enough not to be influenced by outliers or sensitive enough to flag an anomaly in the data distribution. Although it is not related to this dichotomy, one paper about outliers made me dwell on what outliers are. This week’s topics are diverse. Continue reading ‘[ArXiv] 3rd week, Apr. 2008’ »

Tags: background, bootstrap, calibration errors, Cash statistics, clusters, CMB, corona, edge detection, FFT, gravitational lens, maximum likelihood, multiscale, neural network, outlier, SDSS, sunspot, systematic errors, topology, WMAP, XMM-Newton

Category: arXiv, High-Energy, MCMC | Comment

Apr 11th, 2008| 02:21 am | Posted by hlee

Markov chain Monte Carlo has become the most frequent and stable statistical application in astronomy. It would be useful to collect tutorials from both professions. Continue reading ‘[ArXiv] 2nd week, Apr. 2008’ »

Tags: Classification, GRB, Hubble constant, K-S test, kurtosis, mask, maximum likelihood, SDSS, skewness, Solar Oscillation, Vicent Martinez

Category: arXiv, Bayesian, MCMC, Methods, Stat | 3 Comments

Jan 11th, 2008| 03:44 pm | Posted by hlee

It is notable that there’s an astronomy paper that contains **AIC, BIC**, and **Bayesian evidence** in its title. The topic of the paper, unsurprisingly, is cosmology, like other astronomy papers that discuss these (statistical) information criteria. (I have found only a couple of papers on model selection applied to astronomical data analysis that do not involve the CMB. Note that this count excludes the use of Bayes factors for model selection.)

To find the paper or other interesting ones, click Continue reading ‘[ArXiv] 2nd week, Jan. 2007’ »

Tags: AIC, Bayesian evidence, BIC, catalog, Classification, CMB, confidence interval, consistency, correlation, GRB, information criterion, Model Selection, SDSS, test, WMAP

Category: arXiv | Comment

Dec 31st, 2007| 02:06 pm | Posted by hlee

This will be the last [ArXiv] of this year (for some of you, the previous year). Continue reading ‘The last [ArXiv] of 2007’ »

Dec 3rd, 2007| 08:58 pm | Posted by hlee

Astronomers are hard-working people, day and night, weekends and weekdays, 24/7, etc. It was my vacation that delayed this week’s posting, not astronomers or statisticians. Continue reading ‘[ArXiv] 5th week, Nov. 2007’ »

Aug 30th, 2007| 09:36 pm | Posted by hlee

From arxiv/astro-ph:0708.3758v1

**Numerical Color-Magnitude Diagram Analysis of SDSS Data and Application to the New Milky Way Satellites** by J. T. A. de Jong et al.

The authors applied MATCH (Dolphin, 2002^{[1]}; note that the year is corrected) to M13, M15, M92, NGC2419, NGC6229, and Pal14 (well-known globular clusters), and to BooI, BooII, CVnI, CVnII, Com, Her, LeoIV, LeoT, Segu1, UMaI, UMaII and Wil1 (newly discovered Milky Way satellites) from the Sloan Digital Sky Survey (SDSS), in order to fit the color-magnitude diagrams (CMDs) of these stellar clusters and find the properties of these satellites.

Continue reading ‘[ArXiv] Numerical CMD analysis, Aug. 28th, 2007’ »

Tags: CMD, globular cluster, isochrone, Padova, satellite, SDSS, synthetic

Category: arXiv, Astro, CHASC, Fitting, Quotes, Stars | Comment

Jul 25th, 2007| 01:46 pm | Posted by hlee

From arxiv/astro-ph:0707.3413

**The Sixth Data Release of the Sloan Digital Sky Survey** by … many people …

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at http://www.sdss.org/dr6. Additionally, the Catalog Archive Server (CAS) and its SQL interface to the catalog would be useful to statisticians searching for data. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.
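To give a flavor of how a simple SQL command narrows both the size and the spatial coverage, here is a small sketch that only builds a query string; it does not contact any server. The helper name `build_sdss_query` is hypothetical, and the table and column names (`PhotoObj`, `objID`, `ra`, `dec`, `u`, `g`, `r`, `i`, `z`) and the `SELECT TOP` syntax are assumptions that should be checked against the CAS schema documentation for the release in use.

```python
def build_sdss_query(ra_min, ra_max, dec_min, dec_max, r_max, limit=1000):
    """Return a SQL string selecting objects in a small box on the sky.

    Coordinates are in degrees; r_max is a faint limit on the r magnitude.
    Table and column names are assumptions, not taken from the post.
    """
    if not (ra_min < ra_max and dec_min < dec_max):
        raise ValueError("box limits must satisfy min < max")
    return (
        f"SELECT TOP {int(limit)} objID, ra, dec, u, g, r, i, z\n"
        "FROM PhotoObj\n"
        # Spatial cut: restricts the coverage to a rectangle in (ra, dec).
        f"WHERE ra BETWEEN {ra_min} AND {ra_max}\n"
        f"  AND dec BETWEEN {dec_min} AND {dec_max}\n"
        # Magnitude cut: reduces the number of rows returned.
        f"  AND r < {r_max}"
    )


if __name__ == "__main__":
    # Illustrative numbers only.
    print(build_sdss_query(250.0, 250.8, 36.2, 36.7, r_max=20.0, limit=100))
```

The resulting string could then be pasted into the SQL search form; the row limit is a cheap safeguard against pulling millions of objects by accident.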

Continue reading ‘[ArXiv] SDSS DR6, July 23, 2007’ »

Tags: catalog, convex hull peeling, density estimation, DR6, massive data, multivariate analysis, nonparametric, SDSS, SQL, voronoi tessellation

Category: Algorithms, arXiv, Astro, Data Processing, Misc, Optical | 1 Comment

Jun 25th, 2007| 01:27 pm | Posted by hlee

One of the papers from arxiv/astro-ph, astro-ph/0706.2704, discusses kernel regression and model selection for determining photometric redshifts. The paper presents the authors’ studies on choosing the kernel bandwidth via 10-fold cross-validation, choosing appropriate models from various combinations of input parameters by estimating the root mean square error and AIC, and comparing their kernel regression with other regression and classification methods using root mean square errors taken from a literature survey. They conclude that kernel regression is flexible, particularly for data at high z.
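The bandwidth selection step can be sketched as follows. This is not the paper’s code: it assumes a one-dimensional Nadaraya-Watson estimator with a Gaussian kernel, whereas photometric redshift estimation uses several colors as inputs, and the function names and the candidate bandwidth grid are hypothetical.

```python
import math
import random


def nw_predict(x0, xs, ys, h):
    """Nadaraya-Watson estimate at x0 with a Gaussian kernel of bandwidth h."""
    weights = [math.exp(-0.5 * ((x0 - x) / h) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        # Every kernel weight underflowed; fall back to the training mean.
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total


def cv_rmse(xs, ys, h, n_folds=10):
    """Root mean square prediction error under n_folds-fold cross-validation.

    Folds are assigned by index modulo n_folds, so the data are assumed
    to be in random order already.
    """
    n = len(xs)
    sq_err = 0.0
    for fold in range(n_folds):
        held_out = set(range(fold, n, n_folds))
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        for i in held_out:
            sq_err += (ys[i] - nw_predict(xs[i], train_x, train_y, h)) ** 2
    return math.sqrt(sq_err / n)


def choose_bandwidth(xs, ys, grid, n_folds=10):
    """Return the bandwidth in grid with the smallest cross-validated RMSE."""
    return min(grid, key=lambda h: cv_rmse(xs, ys, h, n_folds))


if __name__ == "__main__":
    random.seed(2)
    # Simulated smooth relation with noise, standing in for real photometry.
    xs = [random.uniform(0.0, 1.0) for _ in range(200)]
    ys = [math.sin(2.0 * math.pi * x) + random.gauss(0.0, 0.1) for x in xs]
    grid = [0.01, 0.05, 0.2, 1.0]
    best = choose_bandwidth(xs, ys, grid)
    print(best, cv_rmse(xs, ys, best))
```

The same cross-validated root mean square error can also be used to compare models built from different combinations of inputs, which is the role it plays alongside AIC in the paper.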

Continue reading ‘[ArXiv] Kernel Regression, June 20, 2007’ »

Tags: AIC, BIC, Classification, cross-validation, kernel, photometric redshifts, regression, SDSS

Category: arXiv, Frequentist, Galaxies, Stat | Comment