Last Updated: 20190204

AAS 233: Special Session 126

Machine Learning in Astronomical Data Analysis

Monday, 7 January 2019

2:00pm - 3:30pm PST

Room 607
Washington State Convention & Trade Center, Seattle WA


Machine learning is quickly becoming a popular method for analyzing astronomical data. Interest within the astronomical community in the powerful techniques now being developed is high, with every session, workshop, or seminar on the subject drawing an overflow audience.

We are therefore organizing an ML-oriented special session at AAS 233. The goal of this session is to focus attention on new ML applications specific to astronomical data. Under the principle that it is better to learn with concrete examples, we seek to provide a forum for reporting on new applications and enhancements to existing methodologies. Modern telescopes collect a large amount of data that is freely accessible to all scientists via archives. With big datasets come big opportunities. The SDSS, Kepler, and K2 datasets, the recently released Gaia DR2, and the forthcoming LSST in the optical; ALMA, MWA, and SKA in the radio; and SDO in the EUV are perfect illustrations of the power of data to unlock new science. This session is designed to help us prepare to take advantage of these opportunities by making astronomers aware of both the promise of ML and its limitations.

Beyond astronomy, ML has many applications in science and a wide range of other fields. The skills developed by astronomers as they investigate and implement ML techniques will also serve them in cross-disciplinary endeavours, and will be an excellent way for Astro grad students to enhance their skill sets for non-astronomy career paths.

Our session will start with a broad overview talk by Mario Juric (UWash), followed by a description of ML in practical use by James Davenport (UWash). These will be followed by three contributed talks, by Dan Patnaude (CfA), Brigitta Sipocz (DIRAC) and Marc Huertas-Company (LERMA).

This special session will be followed by a panel discussion on Astroinformatics and Astrostatistics in the age of Big Data.


Chair: V. Kashyap (CfA)

[2:00pm-2:27pm] Mario Juric (University of Washington)

Machine learning applications with LSST: From Data Processing to Knowledge Discovery

Abstract: The Large Synoptic Survey Telescope (LSST) will be the most comprehensive optical astronomy project ever undertaken. The LSST will take panoramic images of the entire visible sky twice each week for 10 years, building up the deepest, widest image of the Universe. The resulting hundreds of petabytes of imaging data for close to 40 billion objects will be used for scientific investigations ranging from the properties of near-Earth asteroids to characterizations of dark matter and dark energy. The volume, quality, and real-time aspects of the LSST survey present significant research opportunities. They will enable studies of entire populations of objects, detections of faint statistical signals, and real-time discovery and follow-up of rare phenomena. Yet at the same time, these characteristics make it a difficult dataset to process and examine using classical techniques. In this talk, I will discuss the challenges presented by the LSST data set and areas where machine learning techniques are expected to be helpful. These range from the generation of well-characterized alert streams to applications in data analysis and knowledge discovery. Present-day surveys such as the PTF, CRTS, and ZTF have already shown how machine learning can be an effective way to extract knowledge from astronomical data sets and streams. In the LSST era, we expect these techniques to continue to grow in importance.


[2:27pm-2:52pm] James Davenport (University of Washington)

A Typical User's Ground-Level Perspective on Machine Learning in Astronomy

Abstract: We have entered an era in observational astronomy in which sky surveys routinely release massive datasets. While this wealth of data is critical for determining rates of rare phenomena (e.g. transiting exoplanets or tidal disruption events), it also enables a new kind of data-driven astrophysics (e.g. "hidden" correlations in our data that point towards new or challenging understandings of physics). Machine learning is simply one tool available to us to discover these new trends or make predictions from our growing volume of data. However, machine learning alone cannot make astrophysical discoveries, and astronomers are still required to interpret astrophysical meaning from our data. Here I will discuss some uses of machine learning in analyzing data from the Kepler and Gaia missions, and attempt to highlight some of the opportunities and limitations in its use.

Presentation slides [.pdf]
YouTube video of talk [url]

[2:52pm-3:03pm] Dan Patnaude (Smithsonian Astrophysical Observatory)

Classifying Supernova Remnant Spectra with Machine Learning

Abstract: There is a clear connection between the evolutionary properties of a massive star and the properties of the resultant supernova and supernova remnant. Here we present new results where we have modeled 45,000 supernova remnants to ages of 5000 years, and synthesized spectra for both shocked circumstellar material and shocked ejecta at 10 epochs across the life of the remnant. We then used the 900,000 synthetic spectra to train and test a machine learning algorithm in classifying the spectra, in order to make concrete inferences about the progenitor evolution. We then applied these models to the population of Galactic and Magellanic Cloud core collapse remnants in order to understand the properties of their progenitor systems.

Presentation slides [.pdf]

[3:03pm-3:14pm] Brigitta Sipocz (DIRAC Institute, University of Washington)

astroML 2.0 - Machine Learning for Astrophysics

Abstract: We present the roadmap and updates for the second edition of astroML, a popular open source machine-learning library for astrophysics. astroML provides a publicly available repository for fast Python implementations of statistical routines for astronomy, as well as examples of astrophysical data analyses using techniques from statistics and machine learning. The new version further develops astroML into a general machine learning toolkit for the next generation of astrophysical surveys. New components to be included are algorithms for approximate Bayesian computation and hierarchical Bayes, and modifications to the regression and regularization code to account for uncertainties within the data. We will also incorporate an interface to deep learning algorithms. Our objective is to ensure that astroML scales well when working with large datasets and exploits multicore and multiprocessing hardware. Astronomical data provide a popular testbed for developing methods applicable throughout the physical and life sciences, and astroML has already been used widely beyond astronomy, in areas ranging from cancer research and analysis of the securities market to teaching data science in astronomy.

Presentation slides [.pdf]

[3:14pm-3:30pm] Marc Huertas-Company (LERMA, Observatoire de Paris)

Investigating Galaxy Evolution with Deep Learning

Abstract: Deep learning is rapidly becoming a standard tool in many scientific disciplines, including astronomy. I will review recent and on-going work on several applications of deep learning techniques to problems related to galaxy evolution. I will show examples of how different network configurations can be efficiently used to classify galaxies into different evolutionary stages, even when no apparent features are visible, as well as to detect and measure substructures within galaxies such as bulges and clumps. I will also discuss unsupervised approaches based on generative models to compare numerical simulations and observations and to detect anomalous objects. In my talk I will also try to show possible solutions to known limitations such as uncertainty estimation, small training sets, and the "black box problem".

Presentation slides [.pdf]


Rosanne Di Stefano (rdistefano @ cfa . harvard . edu)
Vinay Kashyap (vkashyap @ cfa . harvard . edu)
Aneta Siemiginowska (asiemiginowska @ cfa . harvard . edu)


2018-jul-05: started page.
2018-aug-06: Date and time set.
2018-aug-20: Time corrected.
2018-aug-23: changed URL, in anticipation of there also being a splinter session
2018-oct-26: added abstracts from all speakers
2018-nov-01: added times for all speakers
2018-nov-13: CfA has a new logo
2018-nov-15: Session number and venue are announced
2018-nov-27: Apparently Firefox does not support the <wbr> tag anymore
2019-jan-06: Reversed order of contrib talks
2019-jan-11: Added talk slides from Jim Davenport and Dan Patnaude
2019-jan-14: Added talk slides from Marc Huertas-Company and Brigitta Sipocz
2019-feb-04: Added !yt video link for Jim Davenport talk
