Last Updated: 20140530
AAS 224: Special Session
Topics in AstroStatistics
10:00 AM - 11:30 AM, June 2, 2014
St. George AB, Westin Copley Place, Boston, MA
Modern AstroStatistics has emerged as a new field in recent times, informed by a Bayesian foundation, utilizing powerful computational tools like MCMC, and applied to diverse analysis and inference problems in Astronomy.
The use of statistics to validate results and evaluate models is pervasive in modern astronomy, and advanced statistical techniques have been applied extensively across cosmology, high-energy astrophysics, solar physics, large surveys, etc.
Poisson techniques have proliferated, and the use of MCMC is common today.
The diversity of astronomical data has also had positive feedback on Statistics, and has influenced the development of new algorithms and insights.
Now, new methods of data collection, and the rapidly increasing amounts of data that are collected (the "Big Data" problem), pose new challenges of analysis and interpretation.
The goal of our Special Session is to review current practices, highlight modern techniques, and explore how the transition into the realm of Big Data Astronomy can be facilitated.
Analysis and interpretation of the terabyte and petabyte data streams from forthcoming surveys and missions is a major challenge to the practice of Astronomy, and possibly requires a significant restructuring of the types of problems that are addressed.
A secondary purpose of this Session is to build broad community awareness about the risks and rewards of principled analysis.
We will explore the pitfalls of using advanced and powerful techniques as black boxes, and will highlight the complexities that arise from implementations and broad usage among Astronomers.
The session will be complemented by an informal interactive discussion session hosted at the Chandra Booth as part of a program initiated by the SAO/Harvard-based CHASC AstroStatistics Collaboration.
CHASC will arrange for Statisticians to be available at specified times to answer questions and discuss AstroStatistical issues with Astronomers. 
Special Session (Mon, Jun 2, 10-11:30am):
-  Chair: Aneta Siemiginowska (CfA)
-  105.01. Towards Good Statistical Practices in Astronomical Studies
	
-  Eric Feigelson (Penn State)
	-  Abstract:
 Astronomers do not receive strong training in statistical
	 methodology and are therefore sometimes prone to analyze
	 data in ways that are discouraged by modern statisticians.
	 A number of such cases are reviewed involving the
	 Kolmogorov-Smirnov test, histograms and other binned
	 statistics, various issues with regression, model selection
	 with the likelihood ratio test, over-reliance on '3-sigma'
	 criteria, under-use of multivariate clustering algorithms,
	 and other issues.
-  105.02. Big Computing in Astronomy: Perspectives and Challenges
	
-  Victor Pankratius (MIT)
	-  Abstract:
 Hardware progress in recent years has led to astronomical
	instruments gathering large volumes of data.  In radio
	astronomy for instance, the current generation of antenna
	arrays produces data at Tbits per second, and forthcoming
	instruments will expand these rates much further.  As
	instruments are increasingly becoming software-based,
	astronomers will become more exposed to computer science.  This
	talk therefore outlines key challenges that arise at the
	intersection of computer science and astronomy and presents
	perspectives on how both communities can collaborate to
	overcome these challenges.  Major problems are emerging
	because data rates are increasing much faster than storage
	and transmission capacity, and because humans are
	cognitively overwhelmed when attempting to opportunistically
	scan through Big Data. As a consequence, the generation of
	scientific insight will become more dependent on automation
	and algorithmic instrument control. Intelligent data reduction
	will have to be considered across the entire acquisition
	pipeline. In this context, the presentation will outline
	the enabling role of machine learning and parallel computing.
-  Bio:
 Victor Pankratius is a computer scientist who joined MIT
	Haystack Observatory following his passion for astronomy.
	He is currently leading efforts to advance astronomy through
	cutting-edge computer science and parallel computing. Victor
	is also involved in projects such as ALMA Phasing to enhance
	the ALMA Observatory with Very Long Baseline Interferometry
	capabilities, the Event Horizon Telescope, as well as in
	the Radio Array of Portable Interferometric Detectors (RAPID)
	to create an analysis environment using parallel computing
	in the cloud. He has an extensive track record of research
	in parallel multicore systems and software engineering,
	with contributions to auto-tuning, debugging, and empirical
	experiments studying programmers. Victor has worked with
	major industry partners such as Intel, Sun Labs, and Oracle.
	He holds a distinguished doctorate and a Habilitation degree
	in Computer Science from the University of Karlsruhe. Contact
	him at pankrat@mit.edu, victorpankratius.com, or Twitter
	@vpankratius.
-  105.03. The Full Monte Carlo: A Live Performance with Stars
	
-  Xiao-Li Meng (Harvard)
	-  Abstract:
 Markov chain Monte Carlo (MCMC) is being applied increasingly
	often in modern Astrostatistics. It is indeed incredibly
	powerful, but also very dangerous. It is popular because
	of its apparent generality (from simple to highly complex
	problems) and simplicity (the availability of out-of-the-box
	recipes). It is dangerous because it always produces
	something but there is no surefire way to verify or even
	diagnose whether the "something" is remotely close to what
	the MCMC theory predicts or one hopes. Using very simple
	models (e.g., conditionally Gaussian), this talk starts
	with a tutorial of the two most popular MCMC algorithms,
	namely, the Gibbs Sampler and the Metropolis-Hastings
	Algorithm, and illustrates their good, bad, and ugly
	implementations via live demonstration. The talk ends with
	a story of how a recent advance, the Ancillary-Sufficient
	Interweaving Strategy (ASIS) (Yu and Meng, 2011,
	http://www.stat.harvard.edu/Faculty_Content/meng/jcgs.2011-article.pdf)
	reduces the danger. It was discovered almost by accident
	during a Ph.D. student's (Yaming Yu) struggle with fitting
	a Cox process model for detecting changes in source intensity
	of photon counts observed by the Chandra X-ray telescope
	from a (candidate) neutron/quark star.
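
	For readers unfamiliar with the two algorithms named in this
	abstract, the following is a minimal sketch and not the
	speaker's live demonstration. It assumes a toy target (a
	standard bivariate Gaussian with correlation rho) chosen
	here only because its conditionals are Gaussian; the
	function names, step size, and seeds are illustrative
	assumptions.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, seed=0):
    """Gibbs sampler for a standard bivariate normal with correlation rho.

    Each coordinate is drawn from its exact conditional:
    x | y ~ N(rho * y, 1 - rho^2), and symmetrically for y | x.
    """
    rng = random.Random(seed)
    cond_sd = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0
    draws = []
    for _ in range(n_samples):
        x = rng.gauss(rho * y, cond_sd)
        y = rng.gauss(rho * x, cond_sd)
        draws.append((x, y))
    return draws


def log_target(x, y, rho):
    """Unnormalized log density of the same bivariate normal."""
    return -(x * x - 2.0 * rho * x * y + y * y) / (2.0 * (1.0 - rho * rho))


def metropolis_hastings(rho, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.

    Returns the chain and the fraction of proposals accepted.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    current = log_target(x, y, rho)
    draws = []
    accepted = 0
    for _ in range(n_samples):
        x_new = x + rng.gauss(0.0, step)
        y_new = y + rng.gauss(0.0, step)
        proposed = log_target(x_new, y_new, rho)
        # Symmetric proposal, so the acceptance probability is
        # min(1, target(new) / target(old)); compare on the log scale.
        if math.log(1.0 - rng.random()) < proposed - current:
            x, y, current = x_new, y_new, proposed
            accepted += 1
        draws.append((x, y))
    return draws, accepted / n_samples


def sample_correlation(draws):
    """Pearson correlation of the two coordinates of a chain."""
    n = len(draws)
    mean_x = sum(x for x, _ in draws) / n
    mean_y = sum(y for _, y in draws) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in draws)
    sxx = sum((x - mean_x) ** 2 for x, _ in draws)
    syy = sum((y - mean_y) ** 2 for _, y in draws)
    return sxy / math.sqrt(sxx * syy)
```

	Comparing sample_correlation of each chain against the
	known rho is one crude check on a toy problem like this;
	it does not amount to a convergence diagnostic for
	realistic models, which is the danger the abstract warns
	about.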
Ask-A-Statistician (located at Chandra booth):
 | Mon, Jun 2 | Noon - 2 pm | Xiao-Li Meng |
 | Tue, Jun 3 | 1:30 pm - 5:30 pm | David Jones, Yang Chen |
 | Wed, Jun 4 | 11:30 am - 3:30 pm | Keli Liu, Yang Chen |
Vinay Kashyap (vkashyap @ cfa . harvard . edu)
Aneta Siemiginowska (asiemiginowska @ cfa . harvard . edu)
-  2014-may-30: set up page.
CHASC