A functional literacy in AstroStatistics is becoming a necessity for astronomers confronted with high-quality datasets from modern instruments. These complex datasets pose unprecedented analytic challenges. They can improve our understanding of the Universe, provided that they are analyzed carefully and uncertainties are accounted for correctly. This requires descriptive, science-driven statistical models and methods that relate our best understanding of the underlying physical processes to observables. The field of AstroStatistics lies at the intersection of observational Astronomy, Statistics, and data science. Our session aims to familiarize astronomers with newer techniques that are becoming available, with the goal of expanding their analysis toolkit. We therefore review basic methods, covering topics such as least-squares fitting, likelihoods, Machine Learning concepts that allow classification and clustering, and Bayesian analysis, in a series of three lectures by experts in the field.
The speakers also participated in informal discussions during the Topics in AstroStatistics Splinter Session later in the afternoon.
Chair: Aneta Siemiginowska (Harvard-Smithsonian Center for Astrophysics, Cambridge, MA)
The likelihood function is a necessary component of Bayesian statistics but not of frequentist statistics. The likelihood function can, however, serve as the foundation for an attractive variant of frequentist statistics sometimes called likelihood statistics. We will first discuss the definition and meaning of the likelihood function, giving some examples of its use and abuse, most notably in the so-called prosecutor's fallacy. Maximum likelihood estimation is the aspect of likelihood statistics familiar to most people. When data points are known to have Gaussian probability distributions, maximum likelihood parameter estimation leads directly to least-squares estimation. When the data points have non-Gaussian distributions, least-squares estimation is no longer appropriate. We will show how the maximum likelihood principle leads to logical alternatives to least-squares estimation for non-Gaussian distributions, taking the Poisson distribution as an example.
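As a rough illustration of the Poisson case (a minimal sketch, not taken from the lecture), consider estimating a single constant rate from a handful of counts. The counts below are made up, and the function names are hypothetical. The sketch compares the maximum likelihood estimate, which for a constant rate is simply the sample mean, with a least-squares fit that treats each count n as Gaussian with variance n. Minimizing that weighted sum of squares gives the harmonic mean of the counts, which is pulled toward low values.

```python
import math

def poisson_log_likelihood(rate, counts):
    """ln L(rate) for independent Poisson counts sharing one mean rate."""
    return sum(n * math.log(rate) - rate - math.lgamma(n + 1) for n in counts)

def ml_rate(counts):
    """Maximum of the Poisson likelihood for a constant rate: the sample mean."""
    return sum(counts) / len(counts)

def weighted_ls_rate(counts):
    """Minimizer of sum((n - rate)**2 / n), i.e. least squares with sigma**2 = n.

    Setting the derivative to zero gives the harmonic mean of the counts,
    so every count must be greater than zero.
    """
    return len(counts) / sum(1.0 / n for n in counts)

# Made-up counts per bin, for illustration only.
counts = [3, 1, 4, 2, 5, 1, 2, 3]

rate_ml = ml_rate(counts)
rate_ls = weighted_ls_rate(counts)

print("maximum likelihood rate:", rate_ml)
print("weighted least-squares rate:", rate_ls)
print("ln L at ML rate:", poisson_log_likelihood(rate_ml, counts))
print("ln L at LS rate:", poisson_log_likelihood(rate_ls, counts))
```

With low counts the two estimates differ noticeably, and the least-squares variant cannot handle empty bins at all, which is one reason the Poisson likelihood is the more natural starting point here.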
The likelihood ratio is the ratio of the likelihoods of, for example, two hypotheses or two parameters. Likelihood ratios can be treated much like un-normalized probability distributions, greatly extending the applicability and utility of likelihood statistics. Likelihood ratios are prone to the same complexities that afflict posterior probability distributions in Bayesian statistics. We will show how meaningful information can be extracted from likelihood ratios by the Laplace approximation, by marginalizing, or by Markov chain Monte Carlo sampling.
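The Laplace approximation mentioned above can be sketched for the same kind of toy problem: a constant Poisson rate estimated from made-up counts. This is an illustrative assumption, not the speaker's example, and the function names are hypothetical. The log likelihood ratio is taken relative to the maximum likelihood rate, and the approximation replaces it with a parabola whose curvature at the peak sets a Gaussian width. For this model the width is also known analytically, sqrt(rate_hat / N) for N bins, which gives a check on the numerical curvature.

```python
import math

def log_likelihood_ratio(rate, counts):
    """ln[L(rate) / L(rate_hat)] for a constant Poisson rate.

    rate_hat is the maximum likelihood rate (the sample mean); the
    factorial terms cancel in the ratio.
    """
    total = sum(counts)
    n_bins = len(counts)
    rate_hat = total / n_bins
    return total * math.log(rate / rate_hat) - n_bins * (rate - rate_hat)

def laplace_sigma(counts, step=1e-3):
    """Gaussian width from the curvature of the log likelihood ratio at its peak.

    The second derivative is estimated with a central finite difference.
    """
    rate_hat = sum(counts) / len(counts)
    curvature = (
        log_likelihood_ratio(rate_hat + step, counts)
        - 2.0 * log_likelihood_ratio(rate_hat, counts)
        + log_likelihood_ratio(rate_hat - step, counts)
    ) / step**2
    return math.sqrt(-1.0 / curvature)

# Made-up counts per bin, for illustration only.
counts = [3, 1, 4, 2, 5, 1, 2, 3]
rate_hat = sum(counts) / len(counts)

sigma_numeric = laplace_sigma(counts)
sigma_analytic = math.sqrt(rate_hat / len(counts))

# Likelihood ratio between two specific hypotheses, rate = 2 versus rate = 3.
ratio_2_vs_3 = math.exp(
    log_likelihood_ratio(2.0, counts) - log_likelihood_ratio(3.0, counts)
)

print("Laplace sigma (numeric):", sigma_numeric)
print("Laplace sigma (analytic):", sigma_analytic)
print("L(rate=2) / L(rate=3):", ratio_2_vs_3)
```

For a one-parameter problem the Laplace approximation is usually adequate near the peak. Marginalization or Markov chain Monte Carlo sampling becomes more relevant when the likelihood is skewed or has several parameters.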
This tutorial presentation will introduce some of the key ideas and techniques involved in applying Bayesian methods to problems in astrostatistics. The focus will be on the big picture: understanding the foundations (interpreting probability, Bayes's theorem, the law of total probability and marginalization), making connections to traditional methods (propagation of errors, least squares, chi-squared, maximum likelihood, Monte Carlo simulation), and highlighting problems where a Bayesian approach can be particularly powerful (Poisson processes, density estimation and curve fitting with measurement error). The "graphical" component of the title reflects an emphasis on pictorial representations of some of the math, but also on the use of graphical models (multilevel or hierarchical models) for analyzing complex data. Code for some examples from the talk will be available to participants, in Python and in the Stan probabilistic programming language.
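To make the foundations concrete, here is a minimal sketch of Bayes's theorem applied to a Poisson rate on a grid. It is not the code distributed with the talk. The counts, the Gamma prior with shape 2 and rate 1, and all function names are assumptions chosen for illustration. The posterior is the prior times the likelihood, normalized by summing over the grid, which is the law of total probability in discrete form. Because a Gamma prior is conjugate to the Poisson likelihood, the exact posterior mean is available as a check.

```python
import math

def grid_posterior(counts, prior_shape, prior_rate, grid):
    """Posterior probability per grid cell for a constant Poisson rate.

    Assumes a Gamma(prior_shape, prior_rate) prior. Terms that do not
    depend on the rate are dropped because they cancel on normalization.
    """
    total = sum(counts)
    n_bins = len(counts)
    log_post = [
        (prior_shape - 1.0 + total) * math.log(r) - (prior_rate + n_bins) * r
        for r in grid
    ]
    peak = max(log_post)  # subtract the peak to avoid underflow
    weights = [math.exp(lp - peak) for lp in log_post]
    norm = sum(weights)   # law of total probability on the grid
    return [w / norm for w in weights]

# Made-up counts per bin and hypothetical prior settings.
counts = [3, 1, 4, 2, 5, 1, 2, 3]
prior_shape, prior_rate = 2.0, 1.0

# Cell midpoints covering rates from 0 to 15.
step = 0.01
grid = [step * (i + 0.5) for i in range(1500)]

posterior = grid_posterior(counts, prior_shape, prior_rate, grid)
grid_mean = sum(r * p for r, p in zip(grid, posterior))

# Conjugate result: the posterior is Gamma(shape + sum(counts), rate + N).
exact_mean = (prior_shape + sum(counts)) / (prior_rate + len(counts))

print("posterior mean (grid):", grid_mean)
print("posterior mean (exact):", exact_mean)
```

A grid works only in one or two dimensions. For the multilevel models mentioned above, a sampler such as the one in Stan would be the usual tool.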
As astronomical datasets continue to increase in size and complexity, innovative statistical and machine learning tools are required to address the scientific questions of interest in a computationally efficient manner. I will introduce some tools that astronomers can employ for such problems with a focus on clustering and classification techniques. I will introduce standard methods, but also get into more recent developments that may be of use to the astronomical community.
Vinay Kashyap (vkashyap @ cfa . harvard . edu) Aneta Siemiginowska (asiemiginowska @ cfa . harvard . edu)