AstroStat Talks 2015-2016
Last Updated: 2017jan25


Topics in Astrostatistics

Statistics 310, Harvard University
Statistics 281, University of California, Irvine

AY 2015-2016


Instructor Prof. Meng Xiao Li (HU)
  Prof. David van Dyk (ICL)
  Prof. Yu Yaming (UCI)
Schedule Tuesdays 1:07PM - 2:30PM ET
Location SciCen 706

Joe DePasquale (CfA)
8 Sep 2015
1pm EDT
SciCen 706
The Art & Science of Image Processing
Abstract: The Chandra Communications & Public Engagement group at the Center for Astrophysics plays a central role in representing NASA's Chandra X-ray Observatory to the public. Bi-weekly Chandra press releases provide high-quality peer-reviewed science content for non-experts, and the astronomical imagery produced to support such releases can be considered the public face of Chandra. Each public image is unique and a considerable effort is put into creating imagery that is not only aesthetically pleasing, but also accurately and effectively communicates the science. After a brief overview of the Chandra program and the unique challenges in processing X-ray data, I will walk through the creation of an image for public release, providing key insights into the processing involved. With an eye towards the future, I'll briefly describe an ongoing project to merge the end result "pretty pictures" with science analysis.
Presentation slides: [.key] (v6+) [.pdf]
Hyungsuk Tak (Harvard)
15 Sep 2015
12:37pm EDT
SciCen 706
Microlensing corrections in TDC
Abstract: I briefly overview the time delay estimation problem and introduce a microlensing effect that arises when stars in the lensing galaxy introduce independent flickering noise into the quasar images. If the timescale of this effect is much larger than that of the quasar variability, the light curves can have different long-term trends. Because our curve-shifting assumption does not hold when the long-term trends differ, our model produces an (approximate) marginal posterior distribution of the time delay that has several modes near the margins of the entire range of the time delay. These modes near the margins tend to overwhelm the density near the true time delay, making it hard to estimate the true time delay. I suggest a way to estimate the time delay that reduces the effect of microlensing.
Slides [.pdf]
Xiao-Li Meng (Harvard)
06 Oct 2015
12:15pm EDT
SciCen 706
Seeking Effective Adjustments for Effective Areas
Discussion of the cross-calibration shrinkage method. Given I instruments and J sources, with measurements available for a variety of combinations thereof, the goal is to determine a regression correction to the underlying effective areas that brings them all in sync.
Slides [.pdf]
David Jones (Harvard)
13 Oct 2015
1:15pm EDT
SciCen 706
Designing Test Information and Test Information in Design
Abstract: Since telescope time is limited, astronomers wishing to classify lightcurves must carefully select future time points at which to observe lightcurves of interest in order to maximize the information that will be gained for classification. This work proposes a framework for constructing measures of test / classification / model selection information and explores how to use them in experimental design problems such as lightcurve classification. DeGroot (1962) developed a general framework for constructing Bayesian measures of the expected information that an experiment will provide for estimation, and our framework analogously constructs frequentist and Bayesian measures of information for hypothesis testing. In contrast to estimation information measures that are typically used in experimental design for surface estimation, test information measures are most useful in experimental design for model selection and classification problems. Indeed, our framework suggests a probability-based measure of test information, which in hypothesis test applications has more appealing properties than variance-based measures. We also extend a result of Nicolae et al. (2008) linking test and Fisher information, and propose a fundamental coherence requirement for test information measures.
Slides [.pdf]
Jane Huang (HU)
Peter Blanchard (HU)

27 Oct 2015
1:07pm EDT
SciCen 706
Searching for vibronic progressions in the diffuse interstellar bands [JH]
Abstract: The diffuse interstellar bands, observed at optical, infrared, and UV wavelengths, are a series of hundreds of absorption bands observed in stellar spectra due to molecules in intervening clouds. Though the first DIBs were discovered about a century ago, so far only one of the molecular carriers has been confirmed. Identifying more of these carriers would add to our understanding of how complex molecules form in the interstellar medium. To constrain the characteristics of candidate carriers, Duley and Kuzmin (2010) suggested searching for low-energy harmonic progressions among DIBs in order to identify bands that may arise from torsional motion of large molecules. I will discuss the use of agglomerative clustering methods to search for harmonic progressions and examine the likelihood of such progressions arising due to chance alignments.
Slides [.pptx]
The Impact of Positional Uncertainty on Gamma-Ray Burst Environment Studies [PB]
Abstract: While it is now established that long-duration gamma-ray bursts (LGRBs) are a rare outcome of the death of some massive stars, it remains unclear what special conditions are required for the production of an LGRB. Studies of the preferred locations of LGRBs within their host galaxies can shed light on this open question. I use ground-based detections of LGRB afterglows to locate the bursts within high-resolution images of their faint host galaxies obtained using the Hubble Space Telescope. I measure the distribution of LGRB offsets from the centers of their host galaxies, and compare the brightness of the burst position relative to the total host light distribution. The dominant limiting factor in these studies comes from the uncertainty on the position of the LGRB relative to its host galaxy. In this talk, I will discuss the impact of positional uncertainty on host galaxy identification, and offset and light distribution measurement. It is important to address this issue to avoid biases in the full sample distributions so that LGRB distributions can be understood and compared to those for other types of stellar explosions. After a careful consideration of uncertainties, I will present the results I obtain and their implications for the progenitors of LGRBs.
Slides [.pdf]
Yang Chen & Xufei Wang
10 Nov 2015
1:07pm EST
SciCen 706
Calibration Concordance
Follow-up to Xiao-Li Meng's talk on Oct 6
Part 1: Explanation of Multiplicative Model
Part 2: log-Normal Model (shrinkage estimator, variance estimator, additive model)
Part 3: Poisson Model (simplified version)
Part 4: Question for discussion
Presentation Slides [.pdf]
Hyungsuk Tak
24 Nov 2015
1:07pm EST
SciCen 706
Down-Up Metropolis-Hastings Algorithm for Multimodality
Abstract: I suggest a down-up Metropolis-Hastings (DUMH) algorithm that expedites a Markov chain's jumps between modes of a multi-modal distribution in a simple and fast manner. This algorithm is essentially a Metropolis-Hastings (MH) algorithm that generates a proposal via two steps, a downhill step and an uphill step. Given the current state, an intermediate proposal is generated by a Metropolis algorithm with a reciprocal ratio of the target densities in the acceptance probability, so that this Metropolis algorithm prefers a downward movement in density. For example, if the density of the intermediate proposal is smaller than that of the current state, then the intermediate proposal is accepted with probability one. Given the intermediate proposal, a final proposal is generated by another Metropolis algorithm with a typical acceptance probability that prefers an upward movement in density. This down-up movement in density increases the chance that the final proposal is at a different mode. The DUMH algorithm accepts the final proposal with an MH acceptance probability. Because this MH acceptance probability involves a ratio of intractable integrals, the DUMH algorithm uses an auxiliary variable to cancel out the intractable ratio. Simulation results show that the DUMH algorithm explores some high-dimensional and multi-modal distributions more effectively than a random-walk Metropolis algorithm.
Slides [.pdf]
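The down-up idea in the abstract above can be illustrated with a minimal one-dimensional sketch. Several details here are assumptions that the abstract does not state: each downhill or uphill step is repeated until a proposal is accepted, the auxiliary variable is drawn by one more downhill step from the final proposal, and the final acceptance ratio takes the form shown in the code. The function names, the Gaussian random-walk proposal, the bimodal test target, and the small constant eps are all hypothetical choices made for illustration, so this should be read as a sketch and checked against the slides rather than taken as the speaker's implementation.

```python
import math
import random


def bimodal_density(x):
    """Unnormalized equal-weight mixture of N(-3, 1) and N(3, 1) (illustrative target)."""
    return math.exp(-0.5 * (x + 3.0) ** 2) + math.exp(-0.5 * (x - 3.0) ** 2)


def forced_step(density, x, scale, rng, downhill, eps):
    """Repeat a random-walk Metropolis step from x until one proposal is accepted.

    downhill=True uses the reciprocal density ratio (prefers lower density);
    downhill=False uses the usual ratio (prefers higher density).
    Forcing acceptance is an assumption of this sketch.
    """
    px = density(x) + eps
    while True:
        y = rng.gauss(x, scale)
        py = density(y) + eps
        ratio = px / py if downhill else py / px
        if rng.random() < min(1.0, ratio):
            return y


def dumh_sample(density, x0, n_iter, scale=3.0, seed=0, eps=1e-100):
    """Down-up sampler sketch; eps only guards against division by zero."""
    rng = random.Random(seed)
    x = x0
    chain = []
    for _ in range(n_iter):
        x_down = forced_step(density, x, scale, rng, True, eps)        # downhill step
        x_star = forced_step(density, x_down, scale, rng, False, eps)  # uphill step
        z = forced_step(density, x_star, scale, rng, True, eps)        # auxiliary variable
        px = density(x) + eps
        ps = density(x_star) + eps
        pz = density(z) + eps
        # Assumed acceptance ratio in which the auxiliary draw z replaces
        # the intractable normalizing integrals of the forced steps.
        num = ps * min(1.0, px / pz)
        den = px * min(1.0, ps / pz)
        if rng.random() < min(1.0, num / den):
            x = x_star
        chain.append(x)
    return chain


if __name__ == "__main__":
    draws = dumh_sample(bimodal_density, x0=-3.0, n_iter=2000, seed=1)
    right = sum(1 for d in draws if d > 0) / len(draws)
    print("fraction of draws in the right-hand mode:", right)
```

Starting the chain in the left-hand mode and reporting the fraction of draws with positive sign gives a quick check of whether the sampler moves between the two modes.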
Kaisey Mandel (CfA)
26 Jan 2016
1:07pm EST
SciCen 706
Supernova Cosmology in the Near-Infrared with Hierarchical Bayesian Light Curve Models
Abstract: The Nobel Prize-winning discovery of the accelerating expansion of the Universe was made by astronomers using optical observations of the brightness time series of faraway exploding stars (Type Ia supernova light curves) to determine cosmological distances. Current and future optical supernova surveys aim to determine the physical nature of the mysterious dark energy driving the acceleration. However, these efforts are now limited by systematic, rather than statistical errors. The fortuitous properties of supernova light curves in the near-infrared (NIR) offer a powerful strategy for improving distance estimates and cosmological constraints. I have constructed a hierarchical Bayesian model for optical and NIR supernova light curves incorporating multiple random effects and uncertainties, including host galaxy dust, measurement error, and intrinsic supernova variations across time and wavelength, to determine precise and accurate supernova distances. I will describe ongoing efforts to trace the history of cosmic expansion by applying this statistical model to analyze new, large datasets of ground-based NIR observations of nearby supernovae, as well as recent Hubble Space Telescope NIR observations of cosmologically distant supernovae discovered by the Pan-STARRS and Dark Energy Survey supernova searches.
Presentation slides [.pdf]
Xiyun Jiao (Imperial)
2 Feb 2016
6:07pm GMT
10:07am PST/1:07pm EST
Next-generation Gibbs-type Samplers: Combining Strategies to Boost Efficiency
Abstract: Although the Data Augmentation (DA) algorithm and the Gibbs sampler are widely used tools for obtaining a sample from the posterior distribution under complex Bayesian models, they are sometimes notoriously slow to converge. Thus, a number of strategies have been proposed to improve their convergence properties. Among them, we focus on three methods: Marginal Data Augmentation (MDA), the Ancillarity-Sufficiency Interweaving Strategy (ASIS), and Partially Collapsed Gibbs (PCG) sampling. To further improve the convergence of Gibbs-type samplers, we propose strategies for combining MDA, ASIS, and PCG, sometimes in conjunction with Metropolis-Hastings type updates. We construct a general framework to combine these methods into coherent samplers and guarantee that the combined samplers maintain their target stationary distributions and can only improve the convergence properties of their parent Gibbs-type samplers. We demonstrate the implementation and efficiency of our framework by applying it to fit a factor analysis model and a cosmological hierarchical model, the latter being the motivation for our work.
Presentation slides [.pdf]
Wang Xufei and Chen Yang (Harvard)
16 Feb 2016
1:07pm EST
SciCen 706
Progress report of 'Calibration with Multiplicative Means but Additive Errors: A Log Normal Approach'
Abstract: Useful information to calibrate instruments used for astrophysical measurements is usually obtained by observing different sources with well-understood characteristics simultaneously with different detectors. This requires a careful modeling of the mean signals, the intrinsic source variations, and measurement errors. Because our data are typically large (>>30) photon counts, we propose an approximate log-normal model, with the advantage of permitting imperfection in the multiplicative mean model to be captured by the residual variance. The calibration takes an analytically tractable form of power shrinkage, with a half-variance adjustment to ensure an unbiased multiplicative mean model on the original scale.
In this presentation, we will talk about the current progress regarding the properties and model fittings (frequentist and Bayesian methods) of the log-normal regression model. We will also discuss the potential issues and pitfalls of fitting a Poisson regression model, which is more realistic but more restrictive. Simulation studies and real data fittings will be shown.
Presentation slides [.pdf]
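The half-variance adjustment mentioned in the abstract above can be motivated by a standard property of the log-normal distribution. The symbols Y, m, mu, and sigma below are illustrative and are not taken from the talk; this is a generic identity, not the authors' full calibration model.

```latex
\log Y \sim \mathcal{N}(\mu, \sigma^2)
\quad\Longrightarrow\quad
\mathbb{E}[Y] = \exp\!\left(\mu + \tfrac{1}{2}\sigma^2\right),
\qquad\text{so choosing}\quad
\mu = \log m - \tfrac{1}{2}\sigma^2
\quad\text{gives}\quad
\mathbb{E}[Y] = m .
```

In words, subtracting half the log-scale variance from the log-scale mean keeps the mean on the original scale equal to the intended multiplicative value m.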
1 Mar 2016
9:00am IST
29 Feb 7:30pm PST / 10:30pm EST

IACHEC Calibration Uncertainties Working Group Meeting
Preliminary agenda:
Vinay Kashyap, Introduction to pyBLoCXS
Slides [.pdf]
Konrad Dennerl, Modeling RMFs
Herman Marshall, Introduction to Calibration Concordance Project
Keith Arnaud, Updates to XSPEC
Xiao-Li Meng/Yang Chen/Xufei Wang, Calibration Concordance
Shijing Si (Imperial)
22 Mar 2016
6:07pm GMT
10:07am PST/1:07pm EST
Bayesian Hierarchical Models for Stellar Evolution
Abstract: In astrophysics, we often aim to estimate a parameter for each of a number of objects in a population. For example, we may want to estimate the age of each of a sample of halo white dwarfs (WDs). The standard strategy is to separately study each of the objects using case-by-case analyses, and then in a follow-up analysis, study the distribution of the fitted parameters across the population. In this research, we develop novel methods that allow us to take advantage of existing software designed for such case-by-case analyses to simultaneously fit parameters of both the individual objects and the parameters that quantify their distribution across the population. Our methods are based on Bayesian hierarchical modelling, which is known to produce parameter estimators for the individual objects that are on average closer to their true values than estimators based on case-by-case analyses. We verify this in the context of estimating ages of halo white dwarfs via a series of simulation studies. We apply our techniques to three astrophysical problems. The first is essentially a meta-analysis of the distance modulus to the Large Magellanic Cloud (LMC), combining some published results. The second is to obtain more precise estimates of WD ages by fitting their photometric data in a hierarchical model. The third is to study the initial-final mass relationship (IFMR) by combining three stellar clusters.
Presentation slides [.pdf]
Jeremy Drake (CfA)
19 Apr 2016
1:07pm EDT
SciCen 706
Monte Carlo Methods for Treating and Understanding Highly-Correlated Instrument Calibration Uncertainties
Abstract: Accounting for measurement uncertainty is a fundamental aspect of any credible scientific experiment. In astrophysical observations, there are two main sources of uncertainty: noise in the acquired data, and the uncertainty in the calibration of the instrument used to obtain them. Astronomers traditionally consider only the former, and calibration uncertainties are almost universally ignored. Modern X-ray observatories, such as Chandra and XMM-Newton, frequently acquire data for which calibration uncertainties are likely to be the dominant source of error. However, calibration uncertainties are riddled with complicated correlations that render them technically challenging both to understand and to employ in data analysis. Here, we describe Monte Carlo methods developed to include highly-correlated instrument performance uncertainties in typical astrophysical model parameter estimation studies. We will also describe how these methods can be used in combination with observations of cosmic X-ray sources by one or more observatories to refine the calibration uncertainties themselves.
Slides [.pdf]
Sara Algeri (Imperial)
10 May 2016
6:07pm BST
10:07am PDT/1:07pm EDT
Multiple hypothesis testing and testing one hypothesis multiple times: two sides of the same coin?
Abstract: In statistics, the problem of testing one hypothesis multiple times can be formulated in terms of hypothesis testing when a nuisance parameter is present only under the alternative. Each possible value of the nuisance parameter specifies a different alternative hypothesis, and a unique global p-value is provided to summarize the statistical evidence in support of (or against) the null hypothesis. From a physics perspective, this scenario occurs quite often in searches for new signals over an energy or mass spectrum, in both the nested and non-nested frameworks.
An alternative way to search for new emissions is to refer to the classical and widely known multiple hypothesis testing approach. Separate hypothesis tests are conducted at different locations, producing an ensemble of local p-values; the smallest is reported as evidence for the new resonance, once adequately adjusted to control the false detection rate (type I error rate).
While multiple hypothesis testing procedures are both easy and quick to implement, they may be overly conservative in terms of the false detection rate. On the other hand, methods for testing one hypothesis multiple times are robust with respect to false discovery rate and power, but may require considerable computational effort when dealing with complicated models.
The aim of this talk is to review both approaches and to evaluate their performance with respect to sample size, statistical power, false detection rate, and computational effort required. Finally, a simple graphical tool is provided to identify recurrent scenarios where a simple multiple hypothesis testing procedure can be used to provide valid inference with respect to stringent significance requirements, without encountering the usual problem of over-conservativeness.
Presentation slides [.pdf]
arXiv:1701.06820 [url]
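The multiple hypothesis testing route described in the abstract above, where the smallest local p-value is adjusted before being reported, can be sketched as follows. The abstract does not name a specific adjustment, so the Bonferroni and Sidak corrections used here are assumed stand-ins, and the function names are hypothetical. The Sidak form additionally assumes the local tests are independent.

```python
def bonferroni_global_p(local_p_values):
    """Bonferroni-adjusted smallest local p-value, capped at 1."""
    m = len(local_p_values)
    return min(1.0, m * min(local_p_values))


def sidak_global_p(local_p_values):
    """Sidak-adjusted smallest local p-value (assumes independent tests)."""
    m = len(local_p_values)
    return 1.0 - (1.0 - min(local_p_values)) ** m


if __name__ == "__main__":
    # Hypothetical local p-values from tests at four search locations.
    local = [0.001, 0.2, 0.5, 0.8]
    print("Bonferroni:", bonferroni_global_p(local))
    print("Sidak:     ", sidak_global_p(local))
```

The Bonferroni value is never smaller than the Sidak value, which is one simple way to see the conservativeness the abstract refers to.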
Nathan Stein (UPenn)
17 May 2016
1:07pm EDT
10:07am PDT/6:07pm BST
Valid Statistical Comparisons Without Valid MCMC Output
Abstract: We consider model comparison when test statistics are derived from the output of a Markov chain Monte Carlo algorithm. We investigate this comparison from a classical hypothesis testing perspective and find that the power of such tests can exhibit surprising behavior. First, power is not guaranteed to increase monotonically with the number of MCMC iterations, so stopping the MCMC runs early can lead to better performance. Second, power is not guaranteed to increase when the MCMC chains are coupled in an effort to reduce noise. We find that a promising direction for developing more powerful tests relies on the fact that valid Monte Carlo tests can be constructed from statistics that preserve exchangeability but not independence. This problem is motivated by the analysis of X-ray images of quasars, in which the goal is to determine evidence for X-ray jets by evaluating departures from a null model that includes a quasar, modeled as a single point source, but no jet. In this application, the test statistics are computed from the output of an MCMC algorithm that samples from the posterior distribution of a model that allows flexible, nonparametric departures from the quasar-only null model.
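The remark above that valid Monte Carlo tests need only exchangeability can be illustrated with the usual rank-based Monte Carlo p-value. This is a generic sketch, not the speaker's test: the function name and the example numbers are hypothetical, and the statistics are assumed to be larger when the data depart further from the null model.

```python
def monte_carlo_p_value(observed, null_stats):
    """Rank-based Monte Carlo p-value.

    If the observed statistic and the simulated null statistics are
    exchangeable under the null hypothesis, the rank of the observed value
    among all of them is uniform, so this p-value is valid even when the
    simulated statistics are not independent.
    """
    n_at_least = sum(1 for t in null_stats if t >= observed)
    return (1 + n_at_least) / (1 + len(null_stats))


if __name__ == "__main__":
    # Hypothetical statistics simulated under the quasar-only null model.
    simulated = [1.0, 2.0, 3.5, 0.5]
    print(monte_carlo_p_value(3.0, simulated))
```

With n simulated statistics, the smallest attainable p-value is 1 / (n + 1), so the number of simulations limits the significance that can be claimed.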
Chunzhe Zhang (UC Davis)
24 May 2016
10:07am PDT
1:07pm EDT/6:07pm BST
Model Selection for Galaxy Shapes
Abstract: This project is about a novel image model selection procedure for galaxy shapes. In the recent astronomy literature, model selection for galaxy images is based on Chi^2 statistics; however, different kinds of fitted galaxy shape models very often have very similar or even identical Chi^2 values. Our new procedure is based on BIC, but with a novel initial parameter estimation method for likelihood maximization. Extensive simulation studies of our procedure have shown promising results.
Presentation slides [.pdf]
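To see why BIC can separate models that Chi^2 alone cannot, consider a minimal sketch under an assumption the abstract does not state: independent Gaussian pixel noise with known variance, so that -2 times the maximized log-likelihood equals Chi^2 up to a constant shared by all models. The function names, model names, and numbers below are hypothetical, and the talk's initial parameter estimation method is not represented.

```python
import math


def bic_from_chi2(chi2, n_params, n_pixels):
    """BIC = k * ln(n) - 2 * ln(L), with -2 * ln(L) taken as chi2 + const.

    The additive constant is identical for every model fitted to the same
    image, so it is dropped.
    """
    return chi2 + n_params * math.log(n_pixels)


def select_model(candidates, n_pixels):
    """Return the name of the candidate with the smallest BIC.

    candidates maps a model name to a (chi2, n_params) pair.
    """
    return min(
        candidates,
        key=lambda name: bic_from_chi2(*candidates[name], n_pixels),
    )


if __name__ == "__main__":
    # Two hypothetical fits with identical chi2 but different complexity.
    fits = {"single_component": (1000.0, 7), "two_component": (1000.0, 12)}
    print(select_model(fits, n_pixels=4096))
```

When two fits have the same Chi^2, the penalty term k * ln(n) breaks the tie in favor of the model with fewer parameters.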
Jeff Andrews (CfA)
28 Jun 2016
1:07pm EDT
10:07am PDT/6:07pm BST
SciCen 706
Beyond Population Synthesis: Modeling X-ray binary populations
Abstract: Recent studies of high mass binaries have traditionally employed population synthesis codes to link evolutionary models with observed populations. However, for certain types of binaries, particularly those that are rare or short-lived, population synthesis is a poor technique; only a small fraction of simulated systems appear similar to those observed. For more complex problems, the computational expense can be prohibitive. Correlating high mass X-ray binaries (HMXBs) with regions of recent star formation in nearby galaxies is one such problem. I will demonstrate an alternative to traditional population synthesis based on a Markov chain Monte Carlo method which uses the spatially resolved star formation history as a prior on the HMXBs' birth location and age. We develop our model for HMXBs in the Small Magellanic Cloud to quantitatively constrain their evolutionary histories and map their likely birth locations. With adaptations, this method can be applied to populations of compact object binaries in other nearby galaxies.
Presentation slides [.pdf]
Saku Vrtilek (CfA)
12 Jul 2016
1:07pm EDT
10:07am PDT/6:07pm BST
SciCen 706
A Hertzsprung-Russell analog for X-ray binaries
Abstract: Color-color (CC) plots (which provide spectral information over different energy ranges) and color-intensity (CI) plots (which show brightness variations for a given color) are common and easily-obtained measurements that have long been used to classify accreting binary types. We have found that when CC and CI diagrams are combined in a single color-color-intensity (CCI) plot, various types of X-ray binaries separate into complex but geometrically distinct volumes.
CCI diagrams are in fact a three-dimensional analog for accreting binaries of the classic Hertzsprung-Russell diagram, which proved so fundamental to our understanding of single star evolution. We wish to extend the investigation of CCI diagrams in several ways: first, to optimize the choice of energy bands and test statistical methods to differentiate among types of accreting binaries and the states of individual objects; second, to apply the technique to study the behavior of accreting binaries in external galaxies; and finally, to relate the position in the CCI diagram to observable parameters (temperatures of thermal components, spectral indices of power law components, absorbing column and inclination) and to more model-dependent physical parameters (accretion rate, disk inner radius, emitting area of neutron star surface). Not only does our study address the decades-old question of how to easily and unambiguously separate black hole from neutron star systems, it will also provide important clues about how the different classes of accreting binaries are related to each other.
Presentation slides [.pdf]
Irina Udaltsova (UC Davis) & Vasileios Stampoulis (Imperial)
26 Jul 2016
10:07am PDT/1:07pm EDT
6:07pm BST
Davis/SciCen 706
Bayesian Modeling of logN-logS
[IU] The study of astrophysical source populations is often conducted using the cumulative distribution of the number of sources detected at a given sensitivity, represented as a "log N - log S" curve. Direct estimation of the log N - log S relationship is complicated by unobserved sources resulting from detector-induced biases, background contamination, and uncertainty in both the source flux and the number of sources. Knowledge of the probability of non-detection allows us to correct for the non-ignorable missing data mechanism and to build a hierarchical Bayesian model that leads to inference for the physical model parameters and the log N - log S distribution. We present a procedure for examining the Bayesian model goodness-of-fit and propose a model selection approach for the number of source populations.
[VS] We extend the logN-logS model by incorporating the spectral data from the observed sources in order to account for the uncertainty in the count-to-flux conversion factor gamma. We present the results of fitting the logN-logS model to the Chandra Deep Field South (CDFS) dataset for both the constant gamma case and the case with gamma uncertainty, and discuss how the posterior estimates are affected.
Irina Udaltsova slides [.pdf]
Vasileios Stampoulis slides [.pdf]
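For orientation, the naive empirical log N - log S curve that the abstracts above take as their starting point can be computed as in the sketch below. The function name and the example fluxes are hypothetical. This direct estimate ignores non-detections, background contamination, and flux uncertainty, which are exactly the effects the hierarchical Bayesian model is meant to correct.

```python
import bisect
import math


def log_n_log_s(fluxes):
    """Return (log10 S, log10 N(>=S)) pairs for each distinct detected flux S.

    N(>=S) is the number of detected sources with flux at least S.
    Fluxes must be positive.
    """
    ordered = sorted(fluxes)
    n = len(ordered)
    curve = []
    for s in sorted(set(ordered)):
        # Sources at or above s are those from the first index holding s onward.
        count = n - bisect.bisect_left(ordered, s)
        curve.append((math.log10(s), math.log10(count)))
    return curve


if __name__ == "__main__":
    # Hypothetical fluxes in arbitrary units.
    for log_s, log_n in log_n_log_s([1.0, 10.0, 10.0, 100.0]):
        print(round(log_s, 3), round(log_n, 3))
```

On a plot of these pairs, a single power-law population appears as a straight line, and a break in slope is one informal sign of more than one source population.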

Fall/Winter 2004-2005
Siemiginowska, A. / Connors, A. / Kashyap, V. / Zezas, A. / Devor, J. / Drake, J. / Kolaczyk, E. / Izem, R. / Kang, H. / Yu, Y. / van Dyk, D.
Fall/Winter 2005-2006
van Dyk, D. / Ratner, M. / Jin, J. / Park, T. / CCW / Zezas, A. / Hong, J. / Siemiginowska, A. & Kashyap, V. / Meng, X.-L.
Fall/Winter 2006-2007
Lee, H. / Connors, A. / Protopapas, P. / McDowell, J., / Izem, R. / Blondin, S. / Lee, H. / Zezas, A., & Lee, H. / Liu, J.C. / van Dyk, D. / Rice, J.
Fall/Winter 2007-2008
Connors, A., & Protopapas, P. / Steiner, J. / Baines, P. / Zezas, A. / Aldcroft, T.
Fall/Winter 2008-2009
H. Lee / A. Connors, B. Kelly, & P. Protopapas / P. Baines / A. Blocker / J. Hong / H. Chernoff / Z. Li / L. Zhu (Feb) / A. Connors (Pt.1) / A. Connors (Pt.2) / L. Zhu (Mar) / E. Kolaczyk / V. Liublinska / N. Stein
Fall/Winter 2009-2010
A.Connors / B.Kelly / N.Stein, P.Baines / D.Stenning / J. Xu / A.Blocker / P.Baines, Y.Yu / V.Liublinska, J.Xu, J.Liu / Meng X.L., et al. / A. Blocker, et al. / A. Siemiginowska / D. Richard / A. Blocker / Xie X. / Xu J. / V. Liublinska / L. Jing
AcadYr 2010-2011
Astrostat Haiku / P. Protopapas / A. Zezas & V. Kashyap / A. Siemiginowska / K. Mandel / N. Stein / A. Mahabal / Hong J.S. / D. Stenning / A. Diaferio / Xu J. / B. Kelly / P. Baines & I. Udaltsova / M. Weber
AcadYr 2011-2012
A. Blocker / Astro for Stat / B. Kelly / R. D'Abrusco / E. Turner / Xu J. / T. Loredo / A. Blocker / P. Baines / A. Zezas et al. / Min S. & Xu J. / O. Papaspiliopoulos / Wang L. / T. Laskar
AcadYr 2012-2013
N. Stein / A. Siemiginowska / D. Cervone / R. Dawson / P. Protopapas / K. Reeves / Xu J. / J. Scargle / Min S. / Wang L. & D. Jones / J. Steiner / B. Kelly / K. McKeough
AcadYr 2013-2014
Meng X.-L. / Meng X.-L., K. Mandel / A. Siemiginowska / S. Vrtilek & L. Bornn / Lazhi W. / D. Jones / R. Wong / Xu J. / van Dyk D. / Feigelson E. / Gopalan G. / Min S. / Smith R. / Zezas A. / van Dyk D. / Hyungsuk T. / Czerny, B. / Jones D. / Liu K. / Zezas A.
AcadYr 2014-2015
Vegetabile, B. & Aldcroft, T., / H. Jae Sub / Siemiginowska, A. & Kashyap, V. / Pankratius, V. / Tak, H. / Brenneman, L. / Johnson, J. / Lynch, R.C. / Fan, M.J. / Meng, X.-L. / Gopalan, G. / Jiao, X. / Si, S. / Udaltsova, I. & Zezas, A. / Wang, L. / Tak, H. / Eadie, G. / Czekala, I. / Stenning, D. / Stampoulis, V. / Aitkin, M. / Algeri, S. / Barnacka, A.
AcadYr 2015-2016
DePasquale, J. / Tak, H. / Meng, X.-L. / Jones, D. / Huang, J. / Blanchard, P. / Chen, Y. & Wang, X. / Tak, H. / Mandel, K. / Jiao, X. / Wang, X. & Chen, Y. / IACHEC WG / Si, S. / Drake, J. / Stampoulis, V. / Algeri, S. / Stein, N. / Chunzhe, Z. / Andrews, J. / Vrtilek, S. / Udaltsova, I. & Stampoulis, V. /