Archive for the ‘Methods’ Category.

Sep 9th, 2011| 01:05 pm | Posted by vlk

We organized a Special Session on *Time Series in High Energy Astrophysics: Techniques Applicable to Multi-Dimensional Analysis* on Sep 7, 2011, at the AAS-HEAD conference at Newport, RI. The talks presented at the session are archived at http://hea-www.harvard.edu/AstroStat/#head2011

A tremendous amount of information is contained within the temporal variations of various measurable quantities, such as the energy distributions of the incident photons, the overall intensity of the source, and the spatial coherence of the variations. While the detection and interpretation of periodic variations is well studied, the same cannot be said for non-periodic behavior in a multi-dimensional domain. Methods to deal with such problems are still primitive, and any attempts at sophisticated analyses are carried out on a case-by-case basis. Some of the issues we seek to focus on are methods to deal with:

* Stochastic variability

* Chaotic Quasi-periodic variability

* Irregular data gaps/unevenly sampled data

* Multi-dimensional analysis

* Transient classification

Our goal is to present some basic questions that require sophisticated temporal analysis in order for progress to be made. We plan to bring together astronomers and statisticians working in many different subfields, so that an exchange of ideas can motivate the development of sophisticated and generally applicable algorithms for astronomical time series data. We will review the problems and issues with current methodology from an algorithmic and statistical perspective and then look for improvements or for new methods and techniques.

Tags: 2011, AAS, HEAD, September, special session, Timing Analysis

Category: Astro, CHASC, Data Processing, Methods, News, Optical, Stat, Timing | Comment
Jul 22nd, 2010| 09:25 am | Posted by hlee

Tags: book, Brieman, cigar, Clinton, data mining, Friedman, Hastie, KDD, light curve, machine learning, SCMA, shaking hands, SN, statistical learning, Supernova, Tibshirani

Category: Algorithms, Cross-Cultural, High-Energy, Jargon, Methods, Quotes, Stat, Uncertainty | Comment
Jun 21st, 2010| 12:25 pm | Posted by chasc

**mini-Workshop on Computational Astro-statistics: Challenges and Methods for Massive Astronomical Data**

*Aug 24-25, 2010*

Phillips Auditorium, CfA,

60 Garden St., Cambridge, MA 02138

URL: http://hea-www.harvard.edu/AstroStat/CAS2010

Continue reading ‘mini-Workshop on Computational AstroStatistics [announcement]’ »

Tags: 2010, Announcement, astronomical data, astrostatistics, Aug 2010, August, Computational, workshop

Category: Astro, CHASC, gamma-ray, Methods, News, Optical, Stat, X-ray | Comment
Dec 27th, 2009| 10:13 pm | Posted by hlee

I am often irked when I see a function normalized over a feasible parameter space and then used as a probability density function (pdf) for further statistical inference. To be a suitable pdf, the normalization has to be done over a measurable space, not over a feasible space. Such practice often yields biased best fits (biased estimators) and improper error bars. On the other hand, validating a measurable space under physics seems complicated. To be precise, we often get lost in translation. Continue reading ‘A short note on Probability for astronomers’ »

Tags: axiom, curriculum, education, google university, hope, measurable, probability

Category: Algorithms, arXiv, Cross-Cultural, Jargon, Methods, Quotes, Stat, Uncertainty | Comment
Dec 20th, 2009| 07:27 pm | Posted by hlee

Tags: assumptions, bulletin, IMS, maximum likelihood, MLE, T. Speed

Category: arXiv, Cross-Cultural, Fitting, Frequentist, Methods, Stat | 1 Comment
Nov 21st, 2009| 05:06 am | Posted by hlee

by Emanuel Parzen in *Statistical Science* 2004, Vol 19(4), pp. 652-662 JSTOR

I teach that statistics (done the quantile way) can be simultaneously frequentist and Bayesian, confidence intervals and credible intervals, parametric and nonparametric, continuous and discrete data. My first step in data modeling is identification of parametric models; if they do not fit, we provide nonparametric models for fitting and simulating the data. The practice of statistics, and the modeling (mining) of data, can be elegant and provide intellectual and sensual pleasure. Fitting distributions to data is an important industry in which statisticians are not yet vendors. We believe that unifications of statistical methods can enable us to advertise, “What is your question? Statisticians have answers!”

I couldn’t help liking this paragraph because of its bitter-sweetness. I hope you appreciate it as much as I did.

Nov 13th, 2009| 04:46 pm | Posted by hlee

I was told to stay away from Python and I’ve sincerely obeyed the order. However, several months back, upon hearing about import inference, I collected the following items, and I hate to see them become obsolete. At that time, collecting these modules and going through them could have helped me complete the first step toward the quest Learning Python (the first posting of this slog). Continue reading ‘some python modules’ »

Tags: APLpy, AstroPy, IDLsave, import inference, libraries, modules, package, Pyfits, PyMC, PyRAF, PYSTAT, Python, PyWavelets

Category: Algorithms, Astro, Cross-Cultural, Data Processing, Jargon, Languages, Methods, News, Stat | 2 Comments
Oct 28th, 2009| 09:29 am | Posted by hlee

As a part of exploring the spatial distribution of particles/objects, without approximating it via a Poisson process or Gaussian process (parametric) and without imposing hypotheses such as homogeneity, isotropy, or uniformity, various **nonparametric** methods drew my attention for data exploration and preliminary analysis. Among the various nonparametric methods, the one that I fell in love with is tessellation (state space approaches are excluded here). In terms of computational speed, I believe tessellation is faster than kernel density estimation for estimating level sets of multivariate data. Furthermore, constructing polygons from a tessellation is conceptually and intuitively simple. However, coding and improving the algorithms is beyond statistical research (check books titled or keyworded in part by **computational geometry**). The good news is that, for computing and getting results, there is freely available software, along with packages and modules in various forms. Continue reading ‘[ArXiv] Voronoi Tessellations’ »

Tags: data compression, delanay tessellation, density estimation, image processing, nonparametric, spatial statistics, van de Weygaert, van Lieshout, voronoi tessellation

Category: Algorithms, arXiv, Galaxies, Methods | Comment
Oct 22nd, 2009| 07:08 pm | Posted by hlee

[arXiv:stat.ME:0910.2585]

Variable Selection and Updating In Model-Based Discriminant Analysis for High Dimensional Data with Food Authenticity Applications

by *Murphy, Dean, and Raftery*

Classifying or clustering spectra (or applying semi-supervised learning to them) is a very challenging problem, from collecting statistical-analysis-ready data to reducing the dimensionality without sacrificing the complex information in each spectrum. The challenge is not only how to estimate spiky (non-differentiable) curves via statistically well-defined estimating-equation procedures, but also how to transform the data so that they match the regularity conditions assumed in statistics.

Continue reading ‘[ArXiv] classifying spectra’ »

Tags: BIC, Classification, clustering, cross-validation, curse of dimensionality, discriminant analysis, graphical model, mclust, model based, semi-supervised learning, statistical learning, variable selection

Category: Algorithms, arXiv, Cross-Cultural, Data Processing, Jargon, Methods, Spectral, Stat | Comment
Oct 15th, 2009| 06:46 pm | Posted by hlee

Astronomers rely on scatter plots to illustrate correlations and trends among many pairs of variables more than any other scientists^{[1]}. Pages of scatter plots with regression lines are common, with the slope of the regression line and the error bars serving as indicators of the degree of correlation. Sometimes, seeing too many such scatter plots makes me think that, overall, the resources spent on drawing nice scatter plots, and the pages on which those plots are printed, are wasted. Why not just compute the correlation coefficients and their errors, and publish the processed data used for computing the correlations, not the full data, so that others can verify the computed results for the sake of validation? A couple of scatter plots are fine, but when I see dozens of them, I lose my focus. This is another cultural difference. Continue reading ‘Scatter plots and ANCOVA’ »
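
As a minimal sketch of the suggestion above (reporting a correlation coefficient together with its error), the snippet below computes the Pearson coefficient and an approximate 95% confidence interval from the Fisher z-transform. The function names `pearson_r` and `fisher_interval` are illustrative and do not come from the post, and the interval assumes roughly bivariate normal data, which may not hold for a given data set.

```python
import math


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 4:
        raise ValueError("need two equal-length samples with at least 4 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def fisher_interval(r, n, z_crit=1.959964):
    """Approximate confidence interval for r via the Fisher z-transform.

    Assumption: the data are roughly bivariate normal, so that atanh(r)
    is approximately normal with standard error 1 / sqrt(n - 3).
    The default z_crit corresponds to a two-sided 95% interval.
    """
    if not -1.0 < r < 1.0:
        raise ValueError("the transform is undefined for |r| >= 1")
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


if __name__ == "__main__":
    # Toy numbers for illustration only; not taken from any catalog.
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.0, 4.0, 5.0, 4.0, 5.0]
    r = pearson_r(x, y)
    low, high = fisher_interval(r, len(x))
    print(f"r = {r:.3f}, approximate 95% interval = ({low:.3f}, {high:.3f})")
```

With only a handful of points the interval is very wide, which is itself a useful number to report alongside (or instead of) yet another scatter plot.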

Tags: ANCOVA, ANOVA, approximation, correlation, Gaussianity, graphics, MADS, modeling, nonparametric, parallel coordinates, PCA, quality, quantity, regression, scatter plots

Category: arXiv, Cross-Cultural, Fitting, Jargon, Methods, Stat, Uncertainty | Comment
Oct 6th, 2009| 08:30 pm | Posted by hlee

Tags: Classification, clustering, factor analysis, Hubble, multivariate analysis, principle component analysis, SING, Spitzer, tuning fork

Category: Algorithms, Astro, Cross-Cultural, Data Processing, Galaxies, Jargon, Methods, Objects, Stars, Stat | Comment
Oct 6th, 2009| 01:49 pm | Posted by hlee

When it comes to applying statistics for measuring goodness-of-fit, the Pearson χ^{2} test is the dominant player in the race and the Kolmogorov-Smirnov test statistic trails far behind. Although they seem almost invisible in this race, there are many other non-parametric statistics for testing goodness-of-fit and for comparing a sampling distribution to a reference distribution, all legitimate race participants trained by many statisticians. Listing their names is probably useful to some astronomers when they find that the underlying assumptions of the χ^{2} test do not match their data. Perhaps some astronomers want to try nonparametric test statistics other than the K-S test. I’ve seen other test statistics in astronomical journals from time to time. Depending on the data and their statistical properties, one test statistic could work better than another; therefore, it’s worthwhile to keep in mind that there are other tests beyond the χ^{2} goodness-of-fit test statistic. Continue reading ‘Goodness-of-fit tests’ »
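
For readers who have not computed one by hand, here is a minimal sketch of the one-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical distribution function of a sample and a reference CDF. The names `ks_statistic` and `normal_cdf` are illustrative, the standard normal reference is an assumption chosen only for the example, and converting the statistic into a p-value is deliberately left out.

```python
import math


def ks_statistic(sample, cdf):
    """One-sample K-S statistic: sup |F_n(x) - F(x)| over the sample points.

    `cdf` is the reference cumulative distribution function F, given as a
    callable. The empirical CDF F_n jumps by 1/n at each sorted point, so
    the gap is checked just before and just after every jump.
    """
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of a normal distribution, written with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


if __name__ == "__main__":
    # Toy sample for illustration only.
    sample = [-1.2, -0.4, 0.1, 0.3, 0.9, 1.7]
    print("D =", ks_statistic(sample, normal_cdf))
```

Unlike the χ^{2} statistic, this needs no binning of the data, but it does assume a fully specified continuous reference distribution; if the reference parameters are fitted from the same sample, the usual K-S critical values no longer apply.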

Sep 11th, 2009| 03:40 pm | Posted by hlee

A number of practical Bayesian data analysis books are available these days. Here, I’d like to introduce two that were published relatively recently. I like the fact that they are technical rather than theoretical. They have practical examples closely related to astronomical data. They include R code so that one can try the algorithms on the fly instead of cramming probability theory. Continue reading ‘[Books] Bayesian Computations’ »

Tags: book, BUGS, CMB, examples, HMM, identifiability, image processing, LLN, mixture, MRF, R

Category: Bayesian, Fitting, Languages, MC, MCMC, Methods, Stat | 1 Comment
Sep 4th, 2009| 01:30 pm | Posted by hlee

ARCH (**autoregressive conditional heteroscedasticity**) is a statistical model that considers *the variance of the current error term to be a function of the variances of the previous time periods’ error terms*. I heard that this model made Prof. Engle a Nobel prize recipient. Continue reading ‘[MADS] ARCH’ »
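
To make the definition concrete, here is a minimal simulation sketch of the simplest case, ARCH(1). The excerpt gives no equation, so the form used here is the standard textbook one and should be read as an assumption: the conditional variance is `alpha0 + alpha1 * eps_prev**2`, that is, a function of the previous squared error term. The function name `simulate_arch1` and the parameter values are illustrative only.

```python
import math
import random


def simulate_arch1(n, alpha0=0.2, alpha1=0.5, seed=0):
    """Simulate n error terms from an ARCH(1) process.

    Assumed model (standard ARCH(1), not spelled out in the post):
        var_t = alpha0 + alpha1 * eps_{t-1}**2
        eps_t = sqrt(var_t) * z_t,   z_t ~ N(0, 1)
    The series is started with eps_0 = 0, a simplification.
    Returns the error terms and their conditional variances.
    """
    if alpha0 <= 0.0 or not 0.0 <= alpha1 < 1.0:
        raise ValueError("need alpha0 > 0 and 0 <= alpha1 < 1")
    rng = random.Random(seed)
    eps_prev = 0.0
    errors, variances = [], []
    for _ in range(n):
        var_t = alpha0 + alpha1 * eps_prev ** 2
        eps_t = math.sqrt(var_t) * rng.gauss(0.0, 1.0)
        errors.append(eps_t)
        variances.append(var_t)
        eps_prev = eps_t
    return errors, variances


if __name__ == "__main__":
    errors, variances = simulate_arch1(1000)
    # Large shocks raise the next conditional variance (volatility clustering).
    print("largest conditional variance:", max(variances))
    print("smallest conditional variance:", min(variances))
```

The conditional variance never drops below `alpha0` and spikes right after a large error term, which is the heteroscedastic behavior the name refers to.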

Sep 1st, 2009| 07:43 pm | Posted by hlee

[arxiv:0906.3662] **The Statistical Analysis of fMRI Data** by Martin A. Lindquist

Statistical Science, Vol. 23(4), pp. 439-464

This review paper offers information and guidance on statistical image analysis for fMRI data that can be extended to astronomical image data. I think that fMRI data pose challenges similar to those of astronomical images. As Lindquist said, collaboration helps to find shortcuts. I hope that introducing this paper helps further networking and collaboration between statisticians and astronomers.

**List of similarities** Continue reading ‘[ArXiv] Statistical Analysis of fMRI Data’ »

Tags: data aquisition, experimental design, fMRI, ICA, image analysis, image processing, localization, modeling, pipeline, preprocessing, similarities, Spatial, temporal, time series, voxel

Category: arXiv, Cross-Cultural, Data Processing, Imaging, Jargon, Methods, Stat | Comment