Last Updated: 20170418

AAS 230: Special Session

Topics in AstroStatistics

Monday, June 5, 2017

10:00am-11:30am // Salon 2


The field of AstroStatistics lies at the intersection of observational astronomy, statistics, and data science. A functional literacy in AstroStatistics is becoming a necessity for astronomers confronted with high-quality datasets from modern instruments. These datasets are complex and pose unprecedented data-analytic challenges; they can improve our understanding of the Universe, provided that they are carefully analyzed and their uncertainties are correctly accounted for. This requires descriptive, science-driven statistical models and methods that relate our best understanding of the underlying physical processes to observables. For a number of AAS meetings, we have organized lectures on basic methods, and these have proven highly popular. We plan to continue this series with three lectures by experts in the field, covering topics such as hypothesis testing and Machine Learning applied to Big Data problems.


Chair: Edward Robinson (University of Texas)

10:00 am: Yang Chen (University of Michigan)

The Bayesian Statistics behind Calibration Concordance

Abstract: Calibration data for instruments used for astrophysical measurements are usually obtained by observing different astronomical objects with well-understood characteristics simultaneously with different detectors. The problem of interest is how to adjust the effective areas of the detectors to achieve concordance among the sources observed by the several detectors. This calibration concordance problem can be addressed with a log-Normal approach to the multiplicative mean model given by physics. In this context, I will introduce the concepts of Bayesian hierarchical models, log-Normal regression models, and shrinkage estimators, and give intuitive interpretations of the model and the results. Model fitting is achieved by running a Markov chain Monte Carlo (MCMC) algorithm, the basics of which will also be covered.

Yang Chen is an Assistant Professor of Statistics at the University of Michigan.

10:30 am: Chad Schafer (Carnegie Mellon University)

The Potential of Deep Learning with Astronomical Data

Abstract: Modern astronomical surveys yield massive catalogs of noisy, high-dimensional objects, e.g., images, spectra, and light curves. Valuable information stored in individual objects can be lost when ad hoc feature extraction is used to build data sets amenable to established data analysis tools. Deep learning procedures provide a promising avenue for using data in their raw form, allowing both for estimates of greater accuracy and for novel discoveries with greater confidence. This talk will give an overview of deep learning and its potential in astronomical applications.

Chad Schafer is an Associate Professor in the Department of Statistics at CMU. He is also the co-Chair of the LSST Informatics and Statistics Science Collaboration.

11:00 am: Pavlos Protopapas (Harvard University)

Machine Learning applied to Timing Analysis

Abstract: The benefits of good predictive models in astronomy lie in early event prediction systems and effective resource allocation. Current methods for regular time series have not evolved to generalize to irregular time series. In this talk, I will describe two Recurrent Neural Network (RNN) methods, Long Short-Term Memory (LSTM) networks and Echo State Networks (ESNs), for predicting irregular time series. Feature engineering combined with non-linear modeling proved to be an effective predictor. For noisy time series, the prediction is improved by training the network on error realizations generated from the error estimates of astronomical light curves. In addition, we propose a new neural network architecture that removes correlation from the residuals in order to improve prediction and compensate for the noisy data. Finally, I show how to correctly set hyperparameters for a stable and performant solution. Tuning these hyperparameters by hand is an obstacle, which we circumvent by optimizing ESN hyperparameters using Bayesian optimization with Gaussian Process priors. This automates the tuning procedure, enabling users to employ the power of RNNs without needing an in-depth understanding of that procedure.

Pavlos Protopapas is the Scientific Program Director and Lecturer at the Institute for Applied Computational Science at Harvard.


Vinay Kashyap (vkashyap @ cfa . harvard . edu)
Aneta Siemiginowska (asiemiginowska @ cfa . harvard . edu)


2017-mar-02: set up page.
2017-mar-10: updated affiliations.
2017-apr-18: added Chair.


