AstroStat Talks 2012-2013
Last Updated: 2013may07


Topics in Astrostatistics

Statistics 310, Harvard University
Statistics 281, University of California, Irvine

AY 2012-2013

Instructors Prof. Meng Xiao Li (HU)
  Prof. David van Dyk (ICL)
  Prof. Yu Yaming (UCI)
Schedule Tuesdays Noon - 2:30PM ET
Location SciCen 706

Nathan Stein (Harvard U)
4 Sep 2012
Combining Computer Models to Account for Mass Loss in Stellar Evolution
Abstract: I will present a technique for inferring the so-called initial-final mass relation (IFMR), the mapping between the initial mass of a Sun-like star and its final mass as a white dwarf. Our model incorporates several separate computer models for various phases of stellar evolution. We bridge these computer models with a parameterized IFMR in order to embed them in a Bayesian statistical model. In contrast to traditional techniques for inferring the IFMR, which tend to be quite ad hoc, we can estimate the uncertainty in our fit and ensure that our model components are internally coherent. We analyze data from three star clusters: NGC 2477, the Hyades, and M35. The results from NGC 2477 and M35 suggest different conclusions about the IFMR in the mid- to high-mass range, raising questions for further astronomical work. We also compare the results from two different computer models for the primary hydrogen-burning stage of stellar evolution. Through simulations, we show that misspecification at this stage of modeling can sometimes have a severe effect on inferred white dwarf masses. Encouragingly, our inferences on observed data are not particularly sensitive to the choice of computer model for this stage of stellar evolution.
Presentation Slides [.pdf]
Aneta Siemiginowska (CfA)
18 Sep 2012
Bayesian Methods in High Energy Astrophysics
High energy astrophysics data from space-based X-ray and gamma-ray missions such as Chandra or Fermi follow the Poisson distribution. In most situations the modeling has to account for instrumental effects characterized by a probability of detecting photons of a given energy at a particular detector channel, or a particular location on the detector. In addition, systematic uncertainties in the characterization of the instruments have to be taken into account as they often can exceed the statistical uncertainties in the analysis of bright sources. Our group (ICHASC) has developed Bayesian methods for many cases that are highly relevant to the analysis of high-energy data. I will present and discuss some of our methods and their applications.
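The forward model described in this abstract can be sketched as follows. This is only an illustrative toy, not the group's software: the two-bin flux, effective area, redistribution matrix, and exposure values are invented for the example, and the function names are hypothetical.

```python
import math

def expected_counts(source_flux, effective_area, redistribution, exposure):
    """Expected counts per detector channel.

    source_flux[e]       : photons / cm^2 / s in energy bin e (toy source model)
    effective_area[e]    : cm^2 in energy bin e
    redistribution[c][e] : probability that a photon in energy bin e
                           is recorded in detector channel c
    exposure             : seconds
    """
    n_energies = len(source_flux)
    return [
        exposure * sum(row[e] * effective_area[e] * source_flux[e]
                       for e in range(n_energies))
        for row in redistribution
    ]

def poisson_log_likelihood(observed, expected):
    """Log-likelihood of independent Poisson counts in each channel."""
    return sum(n * math.log(lam) - lam - math.lgamma(n + 1)
               for n, lam in zip(observed, expected))

# Toy numbers, assumed purely for illustration.
flux = [0.02, 0.01]
area = [100.0, 200.0]
rmf = [[0.9, 0.1],
       [0.1, 0.8]]   # second column sums to 0.9: some photons go undetected
exposure = 10.0
observed = [20, 18]

expected = expected_counts(flux, area, rmf, exposure)
loglik_true = poisson_log_likelihood(observed, expected)
loglik_doubled = poisson_log_likelihood(
    observed, expected_counts([2 * f for f in flux], area, rmf, exposure))
print(expected, loglik_true, loglik_doubled)
```

Systematic calibration uncertainty would enter this sketch as uncertainty in `area` and `rmf`, which are treated as known here.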
This is a dry run for the SAMSI talk.
Presentation Slides [.ppt]
Crab time lapse movie [.mpg]
Dan Cervone (Harvard)
2 Oct 2012
Real-Time Light Curve Classification
Identifying and classifying variable light sources is a very active area of research in astronomy and astrophysics, but limited resources demand procedures that work well on relatively small numbers of observations. In this talk, we propose a framework for scheduling future observations in order to maximize classification information so that we can make better decisions under material constraints. We describe the intuition and implementation of our algorithm, and show preliminary results indicating an increase in the probability of a correct classification as a function of the number of observations made, based on simulated light curves from nine different variable source types.
Presentation slides [.pdf]
Rebekah Dawson (CfA)
16 Oct 2012
at 1 pm EDT
Applications of Bayesian statistics to understanding the origin of "hot Jupiters"
One of the biggest surprises in the field of extra-solar planets was the discovery of "hot Jupiters," a mysterious class of planets with masses similar to Jupiter but orbiting closer to their stars than Mercury orbits the Sun. A Bayesian approach allows us to characterize the orbits of hot Jupiters and their possible progenitors, to distinguish between planetary signals and false positives, and to assess which of several models for the origins of hot Jupiters is most likely. Today I'll present results from my thesis work, as well as work by my collaborators, with an emphasis on the statistical methods we have employed.
Presentation slides [.pdf]
27 Nov 2012
at 1 pm EST
BYOQ: Bring your own questions
An Open Q&A session where statisticians can clear up any residual niggling doubts about any aspect of astronomical data analysis, data gathering techniques, instrumentation, data archives, astronomical objects and their spatial, spectral, and temporal behavior, etc.
Pavlos Protopapas (CfA)
29 Jan 2013
at 1:15pm EST
Automatic classification of astronomical variables in catalogs with missing data
I will present an automatic classification method for astronomical catalogs with missing data, a common issue that arises when combining multiple catalogs. We used Bayesian networks, a type of probabilistic graphical model, which perform inference to predict the missing values given the observed data and the dependency relationships between the variables. We used an iterative process in which we performed expectation maximization to estimate the missing values using the currently learned network, and then learned the structure of the network from the imputed data. We tested our model by creating a classifier for variable stars from four astronomical catalogs (MACHO, SAGE, 2MASS, and UBVI) and compared the results with those obtained with classifiers learned from a subset of those catalogs. We found that using the catalogs with missing data improved the classification performance by 15% in efficiency, and by 8% compared to traditional missing data approaches, while the computational cost remained the same.
Presentation [.pdf]
Kathy Reeves (CfA)
19 Feb 2013
X-ray and Extreme Ultraviolet Observations of GOES C8 Solar Flare Events
Katharine K. Reeves, Trevor A. Bowen, Paola Testa
We present an analysis of soft X-ray (SXR) and extreme-ultraviolet (EUV) imaging and spectral observations of solar flares with an approximate C8 GOES class. Our constraint on peak GOES SXR flux allows for the investigation of correlations between various flare parameters. We show that the duration of a flare's decay phase is proportional to the duration of its rise phase. Additionally, we show significant correlations between the radiation emitted in the rise and decay phases of a flare: the total radiated energy of a given flare is proportional to the energy radiated during the rise phase alone. This partitioning of radiated energy between the rise and decay phases is observed at both SXR and EUV wavelengths. Though observations from the EUV Variability Experiment (EVE) show significant variation in the behavior of individual EUV spectral lines during different C8 events, we show that the broadband EUV emission is well constrained. Furthermore, using GOES and AIA data, we determine several thermal parameters of these events: temperature, volume, density, and emission measure. Analysis of these parameters demonstrates that the longer duration solar flares are cooler events with larger volumes capable of emitting vast amounts of radiation. The shortest C8 flares are typically the hottest events, smaller in physical size, and have lower associated total energies. These relationships are directly comparable with several sample scaling laws and flare loop models.
Presentation Slides [.pdf]
Movies: 2011-12-27 ; 2012-08-32 ; EVEspectrum-4 [.mov]
Xu Jin (UC Irvine)
26 Feb 2013
Fully Bayesian Analysis of Calibration Uncertainty In High Energy Spectral Analysis
Systematic instrumental uncertainties in astronomical analyses have generally been ignored due to the lack of a robust, principled method, though the importance of incorporating instrumental calibration uncertainty is widely recognized by users and instrument builders. Ignoring calibration uncertainty can bias the estimates of source model parameters and lead to underestimated variances. In this talk, we focus on incorporating the uncertainty in effective area curves into source model fitting. Principal component analysis is explored as a way to efficiently represent the effective area curve and energy redistribution matrix, bringing significant advantages in computing and sampling. We then present the application of a Bayesian approach to incorporating calibration uncertainty into the spectral analysis of high-energy data, and discuss three different sampling schemes in detail. We compare the results from these three schemes using Chandra data. Here, we concentrate on the FullBayes Model, which has the inherent advantage that the data themselves provide information not only about the source parameters but also about the calibration uncertainty. We verify that implementing the FullBayes Model can yield more accurate and efficient estimates of the source parameters.
Presentation slides [.pdf]
Jeff Scargle (NASA/Ames)
5 Mar 2013
Phillips Auditorium
CfA, 60 Garden St.
Adventures in Modern Time Series Analysis: From the Sun to the Crab Nebula and Beyond.
With observations of long, precise, and finely sampled time series, the Age of Digital Astronomy is uncovering and elucidating energetic dynamical processes throughout the Universe. Fulfilling these opportunities requires effective data analysis techniques that can rapidly and automatically implement advanced concepts. With various colleagues I have developed tools ranging from simple but optimal histograms to time and frequency domain analysis for arbitrary data modes and time sampling. Examples to be shown include 3+ cycles of solar chromospheric variability, gamma-ray activity in the Crab Nebula, active galactic nuclei, and gamma-ray bursts.
Presentation slides: [.pdf] ; [.ppt] ; [.pptx]
Min Shandong (UC Irvine)
2 Apr 2013
Bayes Factors
There is an important class of model selection problems in astrophysics where the standard asymptotics of the likelihood ratio test do not apply. This project will study in detail the use of the Bayes Factor for emission line detection in spectral analysis. We develop a method to quantify the typically strong prior dependency of the Bayes Factor, compare the results with those obtained with posterior predictive p-values and the traditional likelihood ratio test in a simulation study, and give suggestions about how to set up the prior in the context of the line detection problem. We will also talk about the efficiency and accuracy of the available methods to calculate Bayes Factors and propose a new method based on parallel MCMC.
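The prior dependence mentioned in this abstract can be seen in a much simpler setting than line detection. The sketch below is a textbook toy and not the speaker's model: it assumes normal data with known sigma, a point null (mean = 0), and a N(0, tau^2) prior on the mean under the alternative, for which the Bayes Factor has a closed form. All names and numbers are illustrative.

```python
import math

def normal_pdf(x, variance):
    """Density at x of a zero-mean normal with the given variance."""
    return math.exp(-x * x / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)

def bayes_factor_01(sample_mean, n, sigma, tau):
    """Bayes Factor in favour of the null (mean = 0) over the alternative
    (mean ~ N(0, tau^2)), given the mean of n draws from N(mean, sigma^2).

    Under the null the sample mean is N(0, sigma^2 / n); under the
    alternative its marginal distribution is N(0, sigma^2 / n + tau^2).
    """
    s2 = sigma ** 2 / n
    return normal_pdf(sample_mean, s2) / normal_pdf(sample_mean, s2 + tau ** 2)

# Same data (a 2.5-sigma sample mean), three different prior widths.
bf_by_tau = {tau: bayes_factor_01(0.5, 25, 1.0, tau) for tau in (1.0, 10.0, 100.0)}
for tau, bf in bf_by_tau.items():
    print(f"tau = {tau:6.1f}  BF01 = {bf:.3f}")
```

With identical data, the narrow prior (tau = 1) favours the alternative while the wide priors (tau = 10 and tau = 100) favour the null, which is the kind of sensitivity the talk sets out to quantify.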
Presentation slides [.pdf]
8 Apr 2013
7:30pm-9:00pm PDT
Monterey, CA
131. Astrostatistics in High Energy Astrophysics - Session in memory of Alanna Connors
Lazhi Wang & David Jones (Harvard)
16 Apr 2013
Separating Overlapping Astronomical Sources
In astronomical observations, it is often the case that sources are situated close enough together that they cannot be fully resolved instrumentally, and it is of interest to infer the number of individual sources, their locations, and their respective intensities. The resolution of the detector is characterized by the point spread function (PSF), which describes the spatial distribution of observed photons from a point source. Convolving a number of sources with the PSF results in a finite mixture model. We further incorporate spectral models, background contamination, and a latent Poisson process for the number and positions of the sources. We fit the resulting multilevel model with RJMCMC (Richardson and Green 1997) to separate the sources and obtain posterior distributions for the number of sources and their individual parameters. This overall approach has the benefit of being able to incorporate further complexities such as non-uniform background and asymmetric PSFs. Overfitting problems are avoided because knowledge of the PSF means the spread of the mixture components is determined, making the inference relatively insensitive to the choice of prior on the number of sources.
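The finite mixture structure can be illustrated with a minimal sketch. It assumes a circular Gaussian PSF of known width and ignores background, spectra, and the RJMCMC step, so it shows only the likelihood for a fixed number of sources. The function names and all numbers are hypothetical.

```python
import math
import random

def gaussian_psf(dx, dy, sigma):
    """Circular Gaussian PSF (an assumption; real PSFs are more complex)."""
    return math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

def mixture_log_likelihood(photons, positions, intensities, sigma):
    """Log-likelihood of photon positions under a mixture of PSFs,
    one component per source, weighted by relative intensity."""
    total = sum(intensities)
    weights = [i / total for i in intensities]
    loglik = 0.0
    for x, y in photons:
        density = sum(w * gaussian_psf(x - sx, y - sy, sigma)
                      for w, (sx, sy) in zip(weights, positions))
        loglik += math.log(density)
    return loglik

def simulate_photons(n, positions, intensities, sigma, rng):
    """Draw n photons: pick a source by intensity, then scatter by the PSF."""
    photons = []
    for _ in range(n):
        sx, sy = rng.choices(positions, weights=intensities)[0]
        photons.append((rng.gauss(sx, sigma), rng.gauss(sy, sigma)))
    return photons

rng = random.Random(1)
true_positions = [(0.0, 0.0), (1.5, 0.0)]   # separated by three PSF widths
true_intensities = [60.0, 40.0]
psf_sigma = 0.5
photons = simulate_photons(500, true_positions, true_intensities, psf_sigma, rng)

ll_two = mixture_log_likelihood(photons, true_positions, true_intensities, psf_sigma)
ll_one = mixture_log_likelihood(photons, [(0.75, 0.0)], [100.0], psf_sigma)
print(ll_two, ll_one)
```

Because the PSF width is fixed rather than fitted, a single component cannot broaden to absorb both sources, which is why the two-source configuration attains the higher log-likelihood here.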
Presentation slides [.pdf]
Dark Sources Detection
The goal of source detection is often to obtain the luminosity function, which specifies the relative number of sources at each luminosity for a population. Of particular interest in my project is the existence of dark sources in the population. In this talk, I will first briefly review the problem and the Bayesian model, in which a zero-inflated gamma distribution is used to model the intensity of sources. Secondly, I will discuss the weakly informative prior we use for the hyper-parameters in the model and show the frequency coverage of the probability intervals in two different simulation settings. Thirdly, model comparison using posterior predictive p-values will be discussed in detail to identify the existence of dark sources in the population. Finally, a generalized model for dealing with overlapping sources will be introduced.
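As a rough illustration of the zero-inflated gamma distribution mentioned above, the sketch below draws source intensities that are exactly zero ("dark") with some probability and gamma-distributed otherwise. The parameter values and function name are assumptions made for the example, not those of the talk.

```python
import random

def sample_zero_inflated_gamma(n, p_dark, shape, scale, rng):
    """Draw n intensities: zero with probability p_dark,
    otherwise Gamma(shape, scale)."""
    return [0.0 if rng.random() < p_dark else rng.gammavariate(shape, scale)
            for _ in range(n)]

rng = random.Random(0)
intensities = sample_zero_inflated_gamma(10000, 0.3, 2.0, 5.0, rng)

dark_fraction = sum(1 for v in intensities if v == 0.0) / len(intensities)
bright = [v for v in intensities if v > 0.0]
bright_mean = sum(bright) / len(bright)   # should be near shape * scale = 10
print(dark_fraction, bright_mean)
```

In a real analysis the intensities are not observed directly; only Poisson counts are, so a dark source and a faint one can look alike, which is what makes model comparison necessary.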
Presentation slides [.pdf]
Jack Steiner (CfA)
23 Apr 2013
Accretion Lags and X-ray Heating
Accreting black hole binaries show a strong correlation between X-ray and optical variability. In two systems, we find that X-rays lag behind the optical with a characteristic delay of weeks. This behavior is most readily attributed to viscous delay as inflowing gas traverses the disk from outer to inner annuli. By applying a model composed primarily of a slowly-varying alpha disk to describe the luminosity fluctuations, we are able to successfully map between the X-ray and the optical. Using this model, we measure alpha and explore the possibility of its dependence on luminosity. Additionally, we discover a strong dependence of X-ray heating upon the geometry of the binary system. Meanwhile, by removing the optical variance tied to the X-rays, this technique may be useful in recovering dynamical information from outbursting black holes, a feature of particular importance for those exceptionally X-ray bright systems which defy traditional methods.
Presentation slides [.pptx]
May 8-10
venue var.
ICHASC/C-BAS Internal Workshop at Cambridge, MA
Wednesday May 8 (Room M-340, 3rd floor, 160 Concord)
9-Noon: New projects, overlapping sources, and pre-existing catalogs
1:30-3:45pm: Calibration, joint spectro-temporal analysis
4:15-5:45pm: Bayes Factors, Statistical Computation
Thursday May 9 (Room M-340, 3rd floor, 160 Concord)
9-Noon: writing groups
12:30-1:15pm: Hinode Mission Control Tour
2:30-5:45pm: real time classification, solar spatial segmentation, solar activity, sunspot classification
Friday May 10 (Room 705, Science Center)
10am-1pm: logN-logS, adaptive smoothing, other
Paul Baines slides [.pdf]
afternoon: writing, wrap-up
Brandon Kelly (UCSB)
9 July 2013
Noon EDT
Phillips Auditorium
CfA, 60 Garden
Characterizing Lightcurves of AGN and other Stochastic Variables in the Era of Time-Domain Astronomy
Current and future time-domain surveys will provide, for the first time, well-sampled multi-wavelength lightcurves for large numbers of variable objects, such as AGN and variable stars. Such lightcurves will provide a valuable resource for studying the physics of such objects and their demographics, leading to many new discoveries. However, these data sets present methodological and computational challenges, including irregular sampling of the lightcurves, contamination by measurement errors, and the need to handle massive amounts of data. In this talk I will discuss the use of continuous time stochastic processes for quantifying the variability in AGN and other variable sources, and compare with more traditional approaches. Although I will focus on the methodological and computational aspects of these models, I will present some results from applying these statistical time series models to AGN optical and X-ray lightcurves, briefly illustrating how they can be used to provide astrophysical insight. I will conclude with a discussion of potential future directions and applications in the era of LSST.
Presentation slides [.pdf]
Katy McKeough (CMU/CfA-REU)
13 Aug 2013
1pm EDT
Library Space
CfA, 60 Garden
Effect of Cosmic Microwave Background on X-ray Radiation of High Redshift Jets
Slides [.pdf]

Fall/Winter 2004-2005
Siemiginowska, A. / Connors, A. / Kashyap, V. / Zezas, A. / Devor, J. / Drake, J. / Kolaczyk, E. / Izem, R. / Kang, H. / Yu, Y. / van Dyk, D.
Fall/Winter 2005-2006
van Dyk, D. / Ratner, M. / Jin, J. / Park, T. / CCW / Zezas, A. / Hong, J. / Siemiginowska, A. & Kashyap, V. / Meng, X.-L.
Fall/Winter 2006-2007
Lee, H. / Connors, A. / Protopapas, P. / McDowell, J., / Izem, R. / Blondin, S. / Lee, H. / Zezas, A., & Lee, H. / Liu, J.C. / van Dyk, D. / Rice, J.
Fall/Winter 2007-2008
Connors, A., & Protopapas, P. / Steiner, J. / Baines, P. / Zezas, A. / Aldcroft, T.
Fall/Winter 2008-2009
H. Lee / A. Connors, B. Kelly, & P. Protopapas / P. Baines / A. Blocker / J. Hong / H. Chernoff / Z. Li / L. Zhu (Feb) / A. Connors (Pt.1) / A. Connors (Pt.2) / L. Zhu (Mar) / E. Kolaczyk / V. Liublinska / N. Stein
Fall/Winter 2009-2010
A.Connors / B.Kelly / N.Stein, P.Baines / D.Stenning / J. Xu / A.Blocker / P.Baines, Y.Yu / V.Liublinska, J.Xu, J.Liu / Meng X.L., et al. / A. Blocker, et al. / A. Siemiginowska / D. Richard / A. Blocker / Xie X. / Xu J. / V. Liublinska / L. Jing
AcadYr 2010-2011
Astrostat Haiku / P. Protopapas / A. Zezas & V. Kashyap / A. Siemiginowska / K. Mandel / N. Stein / A. Mahabal / Hong J.S. / D. Stenning / A. Diaferio / Xu J. / B. Kelly / P. Baines & I. Udaltsova / M. Weber
AcadYr 2011-2012
A. Blocker / Astro for Stat / B. Kelly / R. D'Abrusco / E. Turner / Xu J. / T. Loredo / A. Blocker / P. Baines / A. Zezas et al. / Min S. & Xu J. / O. Papaspiliopoulos / Wang L. / T. Laskar
AcadYr 2012-2013
N. Stein / A. Siemiginowska / D. Cervone / R. Dawson / P. Protopapas / K. Reeves / Xu J. / J. Scargle / Min S. / Wang L. & D. Jones / J. Steiner / B. Kelly / K. McKeough