Last Updated: 2006apr26

CHASC

Topics in Astrostatistics

Statistics 310, Fall/Winter 2005-2006

Harvard University

www.courses.fas.harvard.edu/~stat310/

Instructor	Prof. Meng Xiao-Li
Schedule	Tuesdays 10 AM - 11:30 AM
Location	Science Center Rm 705

Presentations
Fall/Winter 2005-2006
David van Dyk (UC Irvine) 20 Sep 2005	Introduction
Michael Ratner (CfA) & Liu Jing Chen (Harvard U) 04 Oct 2005	On locating IM Peg for Gravity Probe B
Jin Jia Shun (Purdue U) 11 Oct 2005	Higher Criticism Statistic: Theory and Applications in Cosmology and Astronomy [.pdf] Abstract [.pdf] Of interest: Higher Criticism for Detecting Sparse Heterogeneous Mixtures -- Donoho, D., & Jin, J., Ann. Statist., Vol 32, 3, 962-994 Cosmological non-Gaussian Signature Detection: Comparing Performance of Different Statistical Tests -- Jin et al. 2005, astro-ph/0503374 -- [.pdf]
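For readers unfamiliar with the statistic named in this talk, here is a minimal sketch of the Higher Criticism statistic in the form usually attributed to Donoho & Jin: the largest standardized gap between the sorted p-values and their expected uniform positions. The function name `higher_criticism` and the `alpha0` cutoff default are illustrative assumptions, not taken from the talk.

```python
import math

def higher_criticism(p_values, alpha0=0.5):
    """Max over the smallest alpha0*n p-values of
    sqrt(n) * (i/n - p_(i)) / sqrt(p_(i) * (1 - p_(i)))."""
    n = len(p_values)
    p_sorted = sorted(p_values)
    k_max = max(1, int(alpha0 * n))  # assumed cutoff; only the smallest p-values are scanned
    best = -math.inf
    for i in range(1, k_max + 1):
        p = p_sorted[i - 1]
        if p <= 0.0 or p >= 1.0:
            continue  # standardization is undefined at the endpoints
        hc_i = math.sqrt(n) * (i / n - p) / math.sqrt(p * (1.0 - p))
        best = max(best, hc_i)
    return best
```

A handful of very small p-values among otherwise uniform ones drives the statistic up sharply, which is the sparse-mixture setting the cited paper addresses.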
Park Tae Young (Harvard U) 25 Oct 2005	Fitting Narrow Emission Lines in X-ray Spectra [.pdf] Abstract: Spectral emission lines are local features that represent extra emissions of photons in a narrow band of energy. In a statistical model, it is often appropriate to model the emission lines with a narrow Gaussian function or a delta function. In this article, we show how to identify the location of the narrow line profiles using a model-based Bayesian statistical perspective. Such Bayesian methods are ideally suited to handling the complexity of high-resolution high-energy spectral data such as that obtained with the Chandra X-ray Observatory. van Dyk et al (2001) show how Bayesian methods can account for these complexities of the data generation mechanism as well as the Poisson nature of photon count data. The multimodal nature of the likelihood function poses difficulties for these methods, however, when the location and width of a spectral line are simultaneously fitted or when delta functions are used to model spectral lines. These difficulties necessitate more sophisticated, state-of-the-art statistical computation. We thus develop such methods and illustrate how to detect narrow spectral lines in X-ray spectra using Chandra data sets for the energy spectrum of the high redshift quasar PG 1634+706.
Chandra Calibration Workshop 01 Nov 2005 1:30pm-4:30pm	Special Session on Incorporating Calibration Uncertainties into Data Analysis http://cxc.harvard.edu/ccw/
Andreas Zezas (SAO) 29 Nov 2005	X-ray data analysis techniques Presentation [.ppt]
Hong Jae Sub (SAO) 7 Feb 2006 12 Noon - 1 PM	New spectral classification technique for faint X-ray sources: Quantile Analysis [.ppt] Abstract: We describe a new spectral classification technique called quantile analysis for X-ray sources with limited statistics. The quantile analysis is superior to the conventional approaches such as X-ray hardness ratios or X-ray color analysis. The median is considered to be an improved substitute for the conventional X-ray hardness ratio and the quantile-based phase diagram is more evenly sensitive over various spectral shapes than the conventional color-color diagrams. We demonstrate the new technique by simulations using Chandra ACIS detector response function and the analysis results from the deep observations at the galactic center. Links: astro-ph/0406463 QCCD code ChaMPlane
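As a rough illustration of the quantile idea in this abstract, the sketch below computes normalized energy quantiles Q_x = (E_x% - E_lo) / (E_up - E_lo) from a list of photon energies and maps them to one point of a quantile-based diagram. The axis choice (log10(Q50 / (1 - Q50)) against 3 * Q25 / Q75) is my recollection of the form used in astro-ph/0406463 and should be checked against the paper. The simple linear interpolation is a simplification and is not the QCCD code. All function names are hypothetical.

```python
import math

def energy_quantile(sorted_energies, frac):
    """Energy below which a fraction `frac` of photons fall (linear interpolation)."""
    pos = frac * (len(sorted_energies) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(sorted_energies) - 1)
    weight = pos - lo
    return (1.0 - weight) * sorted_energies[lo] + weight * sorted_energies[hi]

def quantile_diagram_point(energies, e_lo, e_up):
    """Return (x, y) for one source; e_lo and e_up bound the energy band (assumed inputs)."""
    srt = sorted(energies)

    def q(frac):
        # Normalized quantile in [0, 1] over the band [e_lo, e_up]
        return (energy_quantile(srt, frac) - e_lo) / (e_up - e_lo)

    q25, q50, q75 = q(0.25), q(0.50), q(0.75)
    # Undefined if q50 is exactly 0 or 1, or q75 is 0 (all photons at a band edge)
    x = math.log10(q50 / (1.0 - q50))
    y = 3.0 * q25 / q75
    return x, y
```

For photons spread evenly over the band, the median sits at mid-band, so the point lands at x = 0, y = 1.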
Aneta Siemiginowska (SAO) & Vinay Kashyap (SAO) 8 Feb 2006 12:30 PM - 1:30 PM HEAD Lunch Talk	X-ray Astrostatistics: Bayesian Methods in Data Analysis Abstract: We will describe the California-Harvard AstroStatistics Collaboration, CHASC. We will provide an introduction to Bayesian methods in the context of some basic X-ray astrophysics problems, such as determining the source strength in the presence of background, and hardness ratios in the regime of (very) low counts. We will also discuss posterior predictive p-values (PPP), which are the preferred alternatives to the often-abused F-tests used for model comparisons. AS's slides: [.ppt] ; [.pdf] VK's slides: [.ppt] ; [.pdf]
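The "source strength in the presence of background" problem mentioned in this abstract can be sketched as follows, under strong simplifying assumptions that are mine rather than the speakers': the background rate b is known exactly, the observed counts are Poisson with mean s + b, and the prior on s is flat on s >= 0. The posterior is then proportional to (s + b)^N exp(-(s + b)) and can be evaluated on a grid. Function names and the grid range are hypothetical.

```python
import math

def source_posterior(n_counts, background, n_grid=2001):
    """Grid posterior of source intensity s >= 0 for N ~ Poisson(s + b),
    with a flat prior on s and a known background b > 0 (assumptions)."""
    s_max = n_counts + 10.0 * math.sqrt(n_counts + 1.0) + 10.0  # generous upper limit
    ds = s_max / (n_grid - 1)
    grid = [i * ds for i in range(n_grid)]
    # Work in logs to avoid overflow for large counts
    log_post = [n_counts * math.log(s + background) - (s + background) for s in grid]
    peak = max(log_post)
    dens = [math.exp(lp - peak) for lp in log_post]
    norm = sum(dens) * ds  # simple Riemann-sum normalization
    return grid, [d / norm for d in dens]

def summarize(grid, dens):
    """Posterior mode and mean from the gridded density."""
    ds = grid[1] - grid[0]
    mode = grid[max(range(len(grid)), key=dens.__getitem__)]
    mean = sum(s * d for s, d in zip(grid, dens)) * ds
    return mode, mean
```

With 10 counts and b = 3 the mode is near N - b = 7. With 1 count and b = 3 the posterior peaks at s = 0 but remains a proper distribution, unlike the naive estimate N - b = -2.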
Meng Xiao-Li (Harvard U) 25 Apr 2006 11am-Noon	A Brief Tutorial of Markov Chain Monte Carlo: A Workhorse for Modern Scientific Computation Abstract: The Markov chain Monte Carlo (MCMC) methods, originating in computational physics about half a century ago, have seen an enormous range of applications in recent statistical literature, due to their ability to simulate from very complex distributions such as the ones needed in realistic statistical models. This talk provides an introductory tutorial of the two most frequently used MCMC algorithms: the Gibbs sampler and the Metropolis-Hastings algorithm. Using simple yet non-trivial examples, we show, step by step, how to implement these two algorithms. The examples involve a family of bivariate distributions whose full conditional distributions are all normal but whose joint densities are not only non-normal, but also bimodal. Presentation: [.ppt] ; [.pdf] Movies: symmetric, Gibbs [.avi] asymmetric, Gibbs, bad implementation [.avi] asymmetric, Gibbs, better implementation [.avi]
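A minimal Gibbs sampler sketch for one distribution of the kind this abstract describes. The target density, p(x, y) proportional to exp(-(x^2 y^2 + x^2 + y^2 - 8x - 8y) / 2), is my choice of a standard textbook example and is not confirmed to be the one used in the talk. Its full conditionals are normal, x | y ~ N(4 / (1 + y^2), 1 / (1 + y^2)) and symmetrically for y | x, while the joint density has two modes, near (3.7, 0.3) and (0.3, 3.7).

```python
import math
import random

def gibbs_bimodal(n_samples, start=(0.0, 0.0), seed=0):
    """Gibbs sampler alternating draws from the two normal full conditionals
    of the assumed bimodal target."""
    rng = random.Random(seed)
    x, y = start
    samples = []
    for _ in range(n_samples):
        # x | y is normal with variance 1/(1+y^2) and mean 4/(1+y^2)
        var_x = 1.0 / (1.0 + y * y)
        x = rng.gauss(4.0 * var_x, math.sqrt(var_x))
        # y | x has the same form with the roles swapped
        var_y = 1.0 / (1.0 + x * x)
        y = rng.gauss(4.0 * var_y, math.sqrt(var_y))
        samples.append((x, y))
    return samples
```

A long enough run should visit both modes. How quickly the chain moves between them depends on the target, which is presumably the kind of behavior the "bad" and "better" implementation movies contrast.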
Hyunsook Lee (Penn State) 7 Sep 2006	A Convex Hull Peeling Depth Approach to Nonparametric Massive Multivariate Data Analysis with Applications Abstract: We explore the convex hull peeling process to develop empirical tools for statistical inference on massive multivariate data. The convex hull and its peeling process have intuitive appeal for robust location estimation. We define the convex hull peeling depth, which makes it possible to order multivariate data. This ordering process provides ways to obtain multivariate quantiles, including the median. Based on the generalized quantile process, we define a convex hull peeling central region, a convex hull level set, and a volume functional, which lead us to one-dimensional mappings that describe the shapes of multivariate distributions along data depth. We define empirical skewness and kurtosis measures based on the convex hull peeling process. In addition to these empirical descriptive statistics, we find a few methodologies to separate multivariate outliers in massive data sets. Those outlier detection algorithms are (1) estimating multivariate quantiles up to the level $\alpha$, (2) detecting changes in a measure sequence of convex hull level sets, and (3) constructing a balloon to exclude outliers. The convex hull peeling depth is a robust estimator, so the existence of outliers does not affect the properties of inner convex hull level sets. Overall, we illustrate all these characteristics and algorithms of the convex hull peeling process through bivariate synthetic data sets. We show that these empirical procedures are applicable to real massive data sets, using quasars and galaxies from the Sloan Digital Sky Survey. Presentation [.pdf]
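The peeling process in this abstract can be sketched for bivariate data as follows: compute the convex hull, assign its vertices depth 1, remove them, and repeat on what remains. This is only an illustration of the general idea. The hull routine is the standard monotone-chain algorithm, the function names are hypothetical, and the depth definition in the talk may differ in detail, for example in how points on hull edges are treated.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Monotone-chain convex hull; returns hull vertices only."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def peeling_depth(points):
    """Map each point (an (x, y) tuple) to the layer at which it is peeled off."""
    remaining = set(points)
    depth = {}
    layer = 1
    while remaining:
        hull = convex_hull(list(remaining))
        for p in hull:
            depth[p] = layer
        # Points lying on a hull edge but not at a vertex stay for the next layer
        remaining -= set(hull)
        layer += 1
    return depth
```

The deepest layer gives a candidate multivariate median, and points in the first few layers are natural outlier candidates.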