A tremendous amount of information is contained within the temporal variations of various measurable quantities, such as the energy distributions of the incident photons, the overall intensity of the source, and the spatial coherence of the variations. While the detection and interpretation of periodic variations is well studied, the same cannot be said for non-periodic behavior in a multi-dimensional domain. Methods to deal with such problems are still primitive, and any attempts at sophisticated analyses are carried out on a case-by-case basis. Some of the issues we seek to focus on are:

* Stochastic variability

* Chaotic Quasi-periodic variability

* Irregular data gaps/unevenly sampled data

* Multi-dimensional analysis

* Transient classification

Our goal is to present some basic questions that require sophisticated temporal analysis in order for progress to be made. We plan to bring together astronomers and statisticians who are working in many different subfields so that an exchange of ideas can occur to motivate the development of sophisticated and generally applicable algorithms for astronomical time series data. We will review the problems and issues with current methodology from an algorithmic and statistical perspective and then look for improvements or for new methods and techniques.

The Future of Scientific Knowledge Discovery in Open Networked Environments

http://sites.nationalacademies.org/PGA/brdi/PGA_060422

New York Workshop on Computer, Earth, and Space Sciences 2011

http://www.giss.nasa.gov/meetings/cess2011/

Innovations in Data-Intensive Astronomy

http://www.nrao.edu/meetings/bigdata/

Astrostatistics and Data Mining in Large Astronomical Databases

http://www.iwinac.uned.es/Astrostatistics/

Statistical Challenges in Modern Astronomy V (including summer school & tutorials)

http://astrostatistics.psu.edu/su11scma5/

Very Wide Field Surveys in the Light of Astro2010

http://widefield2011.pha.jhu.edu/

Statistical Methods for Very Large Datasets

http://www.regonline.com/builder/site/Default.aspx?eventid=757633

23rd Scientific and Statistical Database Management Conference

http://ssdbm2011.ssdbm.org/

International Statistical Institute (ISI) World Congress

http://www.isi2011.ie/

NASA Conference on Intelligent Data Understanding

https://c3.ndc.nasa.gov/dashlink/projects/43/

- Summer School in Statistics for Astronomers VII (June 6-10, 2011)
- Pre-conference Tutorials (June 11-12, 2011)
- Statistical Challenges in Modern Astronomy V (June 13-17, 2011)

**Web site:** http://astrostatistics.psu.edu/su11scma5/

Registration is now open until May 6

(Summer School registration may close earlier if the enrollment limit is reached)

*Contributed papers for the SCMA V conference are welcome*

**Summer School in Statistics for Astronomers**: The seventh summer school is an intensive week covering basic statistical inference, several fields of applied statistics, and hands-on experience with the R computing environment. Topics include: exploratory data analysis, hypothesis testing, parameter estimation, regression, bootstrap resampling, model selection & goodness-of-fit, maximum likelihood and Bayesian methods, nonparametrics, spatial processes, and time series. Instructors are mostly faculty members in statistics.

**Pre-conference tutorials**: Instruction in four areas of astrostatistical interest, presented during the weekend between the Summer School and the SCMA V conference. Topics are: Bayesian computation and MCMC; data mining; R for astronomers; and wavelets for image analysis. Instructors are members of the SCMA V Scientific Organizing Committee.

**SCMA V conference**: Held every five years, SCMA conferences are the premier cross-disciplinary forum for research statisticians and astronomers to discuss methodological issues of mutual interest. Session topics include: statistical modeling in astronomy; Bayesian analysis across astronomy; Bayesian cosmology; data mining and informatics; sparsity; interpreting astrophysical simulations; time domain astronomy; spatial and image analysis; and future directions for astrostatistics. Invited lectures will be followed by cross-disciplinary commentaries. The conference welcomes contributed papers from statisticians and astronomers.

*Visit **http://astrostatistics.psu.edu/su11scma5/** for more information and registration.*

Contacts:

Eric Feigelson, Dept. of Astronomy & Astrophysics, Penn State, edf@astro.psu.edu

G. Jogesh Babu, Dept. of Statistics, Penn State, babu@stat.psu.edu

This will be one of the better years for the Perseids; the moon, which often interferes with the Perseids, will not be a problem this year. So I’m putting together something that’s never been done before: a spatial analysis of the Perseid meteor stream. We’ve had plenty of temporal analyses, but nobody has ever been able to get data over a wide area, because observations have always been localized to single observers. But what if we had hundreds or thousands of people all over North America and Europe observing Perseids and somebody collected and collated all their observations? This is crowd-sourcing applied to meteor astronomy. I’ve been working for some time on putting together just such a scheme. I’ve got a cute little Java applet that you can use on your laptop to record the times of fall of meteors you see, the spherical trig for analyzing the geometry (oh my aching head!) and a statistical scheme that I *think* will reveal the spatial patterns we’re most likely to see, IF such patterns exist. I’ve also got some web pages describing the whole shebang. They start here:

http://www.erasmatazz.com/page78/page128/PerseidProject/PerseidProject.html

I think I’ve gotten all the technical, scientific, and mathematical problems solved, but there remains the big one: publicizing it. It won’t work unless I get hundreds of observers. That’s where you come in. I’m asking three things of you:

1. Any advice, criticism, or commentary on the project as presented in the web pages.

2. Publicizing it. If we can get that ol’ Web Magic going, we could get thousands of observers and end up with something truly remarkable. So, would you be willing to blog about this project on your blog?

3. I would be especially interested in your comments on the statistical technique I propose to use in analyzing the data. It is sketched out on the website here: http://www.erasmatazz.com/page78/page128/PerseidProject/Statistics/Statistics.html

Given my primitive understanding of statistical analysis, I expect that your comments will be devastating, but if you’re willing to take the time to write them up, I’m certainly willing to grit my teeth and try hard to understand and implement them.

Thanks for any help you can find time to offer.

Chris Crawford

Phillips Auditorium, CfA,

60 Garden St., Cambridge, MA 02138

URL: http://hea-www.harvard.edu/AstroStat/CAS2010

The California-Boston-Smithsonian Astrostatistics Collaboration plans to host a mini-workshop on Computational Astro-statistics. With the advent of new missions like the Solar Dynamics Observatory (SDO), the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS), and the Large Synoptic Survey Telescope (LSST), astronomical data collection is fast outpacing our capacity to analyze the data. Astrostatistical effort has generally focused on principled analysis of individual observations, on one or a few sources at a time. But the new era of data-intensive observational astronomy forces us to consider combining multiple datasets and inferring parameters that are common to entire populations. Many astronomers really want to use every data point and even non-detections, but this becomes problematic for many statistical techniques.

The goal of the Workshop is to explore new problems in astronomical data analysis that arise from data complexity. Our focus is on problems that have generally been considered intractable due to insufficient computational power or inefficient algorithms, but are now becoming tractable. Examples of such problems include: accounting for uncertainties in instrument calibration; classification, regression, and density estimation for massive data sets that may be truncated and contaminated with measurement errors and outliers; and designing statistical emulators to efficiently approximate the output from complex astrophysical computer models and simulations, thus making statistical inference on them tractable. We aim to present some issues to the statisticians and clarify difficulties with the currently used methodologies, e.g. MCMC methods. The Workshop will consist of review talks on current statistical methods by statisticians, descriptions of data analysis issues by astronomers, and open discussions between astronomers and statisticians. We hope to define a path for the development of new algorithms that target specific issues, designed to help with applications to SDO, Pan-STARRS, LSST, and other survey data.

We hope you will be able to attend the workshop and present a brief talk on the scope of the data analysis problem that you confront in your project. The workshop will have presentations in the morning sessions, followed by a discussion session in the afternoons of both days.

(via /.)

First Announcement

Summer School in Statistics for Astronomers VI

June 7-12, 2010

with a supplement on Statistics and Computation for Astronomical Surveys

June 12-14, 2010

Registration Deadline: May 3, 2010 or when the enrollment limit is reached.

Penn State University

http://astrostatistics.psu.edu/su10/

The sixth annual Penn State Summer School in Statistics for Astronomers will be held at Penn State. The main part of the School is a 6-day course (June 7-12, 2010) in fundamental statistical inference designed to provide researchers and graduate students in the physical sciences with a strong conceptual foundation in modern statistics. We develop a repertoire of well-established techniques applicable to observational astronomy and physics. Classroom instruction is interspersed with hands-on analysis of astronomical data using the open-source R software package. The course is taught by a team of statistics and astronomy professors with opportunity for discussion of methodological issues. The program starts on Monday morning (June 7, 2010), and ends on Saturday June 12, 2010 at noon. The topics covered include:

* Exploratory data analysis

* Hypothesis testing and parameter estimation

* Regression

* Bootstrap resampling

* Model selection & goodness-of-fit

* Maximum likelihood methods & Bayes’ Theorem

* Non-parametric methods

* Monte Carlo methods

* Poisson processes

* Time series

The 2010 Summer School will be modeled on the last four Penn State Summer Schools and the two Indian Institute of Astrophysics-Penn State Summer Schools; see the 2005, 2006, 2007, 2008 and 2009 lecture notes for the Penn State Summer Schools.

This is immediately followed by a supplementary program (June 12-14, 2010) on Statistics and Computation for Astronomical Surveys. This program starts on Saturday June 12 immediately following the main school and ends on Monday June 14 at noon. Statistical topics covered will include:

* Number count distributions (“logN-logS”) and the fundamental equation of stellar statistics

* Selection effects: truncation and censoring (Lynden-Bell, Kaplan-Meier product limit estimators)

* Classical survey biases (Eddington, Malmquist, Lutz-Kelker)

* Population modeling with hierarchical models

* Statistical cross-matching between surveys

* Introduction to Virtual Observatory software tools for querying and analyzing survey data

Participants may register for one or both programs. There is limited financial support for the program on astronomical surveys; requests for support should be sent to Tom Loredo (loredo, at astro.cornell.edu) by May 3.

http://members.aas.org/JobReg/JobDetailPage.cfm?JobID=26225

A postdoctoral position is available at the University of California, Berkeley for an individual who can lead an effort in real-time classification of astronomical time-series data for the purpose of extraction of novel science. The project is sponsored by a new Cyber-enabled Discovery and Innovation (CDI) grant from the National Science Foundation (NSF; http://128.150.4.107/awardsearch/showAward.do?AwardNumber=0941742 ).

The main goal of this project is to produce a framework (including new theoretical/algorithmic constructs) for extracting novel science from large amounts of data in an environment where the computational needs vastly outweigh the available facilities, and intelligent (as well as dynamic) resource allocation is required. This work will draw from current research in statistics, database engineering, computational science, time-domain astronomy, and machine learning and is expected to lead to applications beyond astronomy. The collaboration has access to proprietary astronomical datasets. We hope to build a system eventually capable of ingesting, assimilating, and creating “new knowledge” from massive data streams expected from new projects, such as the Large Synoptic Survey Telescope. The collaboration also has access to large-scale computing facilities through the Center for Information Technology Research in the Interest of Society (CITRIS) at Berkeley, at Lawrence Berkeley National Laboratory (LBNL), and through cloud computing time donated by industry partners.

This work will be directed by Prof. Joshua Bloom in the Astronomy Department, but the position calls for strong interactions with other senior members of the collaboration in other departments (Martin Wainwright, EECS and Statistics; Noureddine El Karoui, Statistics; John Rice, Statistics; Massoud Nikravesh, CITRIS; Peter Nugent, LBNL; Horst Simon, LBNL). Experience and a demonstrated interest in working with graduate students across these disciplines are also encouraged.

A Ph.D. in Computer Science, Electrical Engineering, Statistics, Astronomy, or a closely related field is required. The strongest candidates will have demonstrated success in conducting original research in statistics and/or machine learning and should have a deep understanding of and/or interest in topics of time-domain astronomy. Work will commence no later than 1 August 2010. The appointment may start on an earlier date, if mutually convenient (funding is already available to start as early as Spring 2010). The initial appointment is for two years, with renewal expected if progress is satisfactory and funds continue to be available. The starting salary will be commensurate with experience, and competitive with other postdoctoral positions. Please e-mail a short research statement, resume, list of publications, and copies of two recent publications (preprints or reprints) so that they arrive by the 1 February 2010 deadline to Prof. Joshua Bloom, at the above address. To receive full consideration, applicants should arrange to have letters of reference from three individuals sent to Prof. Bloom by the 1 February 2010 due date (letters may also be emailed directly by the referees). Immigration status of non-citizens should be stated in the resume.

As you already know, there are quite a few websites dedicated to Python. Some of them speak only to astronomers. A tiny fraction are for statisticians, but I haven’t met any statistician who prefers Python exclusively; we take the gist of various languages. So I’ll leave general website aggregations, such as AstroPy (I think this website is extremely useful for astronomers), for you to add to your bookmarks under the “python” tab, regardless of your profession. Instead, I’ll discuss some Python libraries and modules that can be useful for those practicing astrostatistics and can make their work easier. I must say that I intentionally omitted a few modules because I was not sure whether they are public or how sensitive their copyright is. If you have modules that can be introduced publicly, let me know; I’ll be happy to add them. If my description of a module is improper and you want it taken down, also let me know.

Over the past few years, Python has become the most common and versatile scripting language in both communities, and I therefore believe it will accelerate many collaborations. Much of my time is spent figuring out how to read, manipulate, and handle raw data and images. Most of the tactics astronomers use are quite unfamiliar to me, and sometimes they make little sense to me (see my read.table() and data analysis system and its documentation). One scripting language, thanks to being open and free to all communities, promises to narrow the gap and enable fruitful, efficient collaborations: **Python**.

The first posting on this slog was about **Python**. I thought that kicking off with a computer language that is relatively new and open to many communities could motivate me and others toward more diverse, interdisciplinary work. After a few years, unfortunately, I haven’t achieved that goal. Yet I still think that the libraries and modules introduced below will be useful for your transition from other programming languages, or for writing your own pro bono wrappers to communicate better with others.

I’ll take numpy, scipy, and RPy for granted. For plotting, matplotlib seems to be the most common choice.

**Reading astronomical data** (click links to download libraries, modules, and tutorials)

- First, start with Using Python for Interactive Data Analysis (in PDF). It is quite a useful manual, particularly for IDL users, and it compares the pros and cons of Python and IDL.
- IDLsave: without IDL, a .save file becomes readable. This is a brilliant small module.
- PyRAF (I was really frustrated with IRAF and spent many sleepless nights on it. Apart from data reduction, I don’t remember much statistics in IRAF except simple statistics for Gaussian populations. I guess PyRAF does a better job.) There’s also PyFITS for handling FITS-format data.
- APLpy (the Astronomical Plotting Library in Python) is a Python module aimed at producing publication-quality plots of astronomical imaging data in FITS format (this introduction is copied from the APLpy site).

**Statistics, Mathematics, or data science**

Because of RPy, introducing smaller modules may not seem worthwhile, but quite a few statistics modules and libraries that do not rely on R are available.

- MDP (Modular toolkit for Data Processing). Multivariate data analysis methods like PCA, ICA, and FA have become very popular in the astronomical community.
- pywavelets (Beyond the Fourier transform, various transforms are often used, and the wavelet transform ranks at the top).
- PyIMSL (see my post, PyIMSL)
- PyMC: I introduced this module a century ago. It may lack versatility or robustness because of its parametric distribution objects, but I liked the tutorial very much; from it one can expand and devise one’s own working MCMC algorithm.
- PyBUGS (I introduced this Python wrapper in BUGS, but the link to PyBUGS is not working anymore. I hope it revives.)
- SAGE (Software for Algebra and Geometry Experimentation) is a free open-source mathematics software system licensed under the GPL (Link to the online tutorial).
- python_statlib: descriptive statistics for the Python programming language.
- PYSTAT: Nice website, but the product is not available yet. Be aware! It is not PhyStat!!!
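
As a minimal sketch of what devising your own MCMC algorithm can look like, here is a random-walk Metropolis sampler written with only the Python standard library. It is not PyMC’s API, nor the algorithm from its tutorial; the toy model (Gaussian data with known sigma and a flat prior on the mean) and the names `log_posterior` and `metropolis` are assumptions made purely for illustration.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0):
    """Log posterior (up to a constant) for the mean of Gaussian data.

    Assumes a known sigma and a flat prior on mu (illustrative toy model).
    """
    return -0.5 * sum((x - mu) ** 2 for x in data) / sigma ** 2


def metropolis(log_target, start, n_steps, step_size, rng):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    chain = [start]
    current = start
    current_logp = log_target(current)
    for _ in range(n_steps):
        # Symmetric Gaussian proposal centered on the current state.
        proposal = current + rng.gauss(0.0, step_size)
        proposal_logp = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_logp - current_logp:
            current, current_logp = proposal, proposal_logp
        chain.append(current)
    return chain


rng = random.Random(42)
data = [rng.gauss(3.0, 1.0) for _ in range(50)]  # simulated measurements
chain = metropolis(lambda mu: log_posterior(mu, data), start=0.0,
                   n_steps=5000, step_size=0.5, rng=rng)
samples = chain[1000:]  # discard burn-in
posterior_mean = sum(samples) / len(samples)
sample_mean = sum(data) / len(data)
print(f"posterior mean ~ {posterior_mean:.2f}, sample mean = {sample_mean:.2f}")
```

With a flat prior, the posterior mean should land close to the sample mean, which gives a quick sanity check before moving to a real model or to a dedicated package.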

**Module for AstroStatistics**

**import inference** (Unfortunately, the links to examples and tutorial are not available currently)

Without clear objectives, it is not easy to pick up a new language. If you are used to working with one from the alphabet soup, you will most likely stick to your choice. Changing letters or switching language names only happens when your instructor specifically asks you to use their preferred language, or when analysis {modules, libraries, tools} are only available in that language. Thanks to its object-oriented style, Python makes transition and communication easier than other languages. Furthermore, scripting languages are more intuitive and easier to interpret.
