A tremendous amount of information is contained in the temporal variations of various measurable quantities, such as the energy distributions of the incident photons, the overall intensity of the source, and the spatial coherence of the variations. While the detection and interpretation of periodic variations is well studied, the same cannot be said for non-periodic behavior in a multi-dimensional domain. Methods to deal with such problems are still primitive, and any attempts at sophisticated analyses are carried out on a case-by-case basis. Some of the issues we seek to focus on are:

* Stochastic variability

* Chaotic and quasi-periodic variability

* Irregular data gaps/unevenly sampled data

* Multi-dimensional analysis

* Transient classification

Our goal is to present some basic questions that require sophisticated temporal analysis in order for progress to be made. We plan to bring together astronomers and statisticians who are working in many different subfields so that an exchange of ideas can occur to motivate the development of sophisticated and generally applicable algorithms for astronomical time series data. We will review the problems and issues with current methodology from an algorithmic and statistical perspective and then look for improvements or for new methods and techniques.
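As a concrete illustration of the unevenly-sampled-data problem, the Lomb-Scargle periodogram is one of the few standard tools that works on an irregular time grid. A minimal sketch in Python using `scipy.signal.lombscargle` (all data here are simulated for illustration, not from any real source):

```python
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(42)

# Unevenly sampled light curve: a 0.5 Hz sinusoid observed at random times,
# mimicking irregular data gaps (amplitude and noise level are made up).
t = np.sort(rng.uniform(0.0, 20.0, 150))    # observation times in seconds
flux = 2.0 * np.sin(2 * np.pi * 0.5 * t) + rng.normal(0.0, 0.3, t.size)

# Evaluate the periodogram on a grid of trial frequencies.
# lombscargle expects angular frequencies, hence the factor 2*pi.
freqs_hz = np.linspace(0.05, 2.0, 400)
power = lombscargle(t, flux - flux.mean(), 2 * np.pi * freqs_hz)

best = freqs_hz[np.argmax(power)]
print(f"peak frequency ~ {best:.2f} Hz")    # should recover the injected ~0.5 Hz
```

Note that the periodogram sidesteps the need to interpolate onto a regular grid, which is exactly why it is a workhorse for data with gaps; the harder problems listed above (stochastic and chaotic variability) have no comparably standard tool.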

Phillips Auditorium, CfA,

60 Garden St., Cambridge, MA 02138

URL: http://hea-www.harvard.edu/AstroStat/CAS2010

The California-Boston-Smithsonian Astrostatistics Collaboration plans to host a mini-workshop on Computational Astrostatistics. With the advent of new missions like the Solar Dynamics Observatory (SDO), the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS), and the Large Synoptic Survey Telescope (LSST), astronomical data collection is fast outpacing our capacity to analyze it. Astrostatistical effort has generally focused on principled analysis of individual observations, of one or a few sources at a time. But the new era of data-intensive observational astronomy forces us to consider combining multiple datasets and inferring parameters that are common to entire populations. Many astronomers want to use every data point, and even non-detections, but this becomes problematic for many statistical techniques.

The goal of the Workshop is to explore new problems in Astronomical data analysis that arise from data complexity. Our focus is on problems that have generally been considered intractable due to insufficient computational power or inefficient algorithms, but are now becoming tractable. Examples of such problems include: accounting for uncertainties in instrument calibration; classification, regression, and density estimations of massive data sets that may be truncated and contaminated with measurement errors and outliers; and designing statistical emulators to efficiently approximate the output from complex astrophysical computer models and simulations, thus making statistical inference on them tractable. We aim to present some issues to the statisticians and clarify difficulties with the currently used methodologies, e.g. MCMC methods. The Workshop will consist of review talks on current Statistical methods by Statisticians, descriptions of data analysis issues by astronomers, and open discussions between Astronomers and Statisticians. We hope to define a path for development of new algorithms that target specific issues, designed to help with applications to SDO, Pan-STARRS, LSST, and other survey data.

We hope you will be able to attend the workshop and present a brief talk on the scope of the data analysis problem that you confront in your project. The workshop will have presentations in the morning sessions, followed by a discussion session in the afternoons of both days.

There are many good planetarium programs that you can access on laptops, but it is really not that much fun to lug them around on camping trips or even out on to the roof at night. But now, thanks to the iPhone (and the iPod Touch) there has been a great leap forward.

The iTunes AppStore now has a number of astronomy themed apps, including apps that tell you the distance to the Moon correct to a *meter*. But the most impressive of the lot have to be the ones that produce skycharts and let you search for and find stars, constellations, and deep sky objects at any time, from anywhere. There are four such apps available now: Starmap, GoSkyWatch, iAstronomica, and iStellar.

I have only tried Starmap so far, and it is incredible. The developer says that there is a PRO version in the works, but this one is already plenty good for me.

It is quite well known that, unlike amateur astronomers, professional astronomers are quite ignorant of the night sky. Really, if someone turns us around to face North, we might figure out where Polaris is, but that’s it. Oh, and we can usually find the Moon. And in the daytime, we can point to where the Sun is, provided it is not cloudy, which it often is in New England. True story: I still haven’t set eyes on the star which formed the basis of my PhD thesis (α Trianguli Australis; in my defence, it is only visible from the southern hemisphere). But all that is in the past; now I can rediscover my amateur roots. I am feeling *pretty* confident that I can find anything, even dear old α TrA; all I need to do is cross the Equator and point with my tricorder.

First there was Edwin Hubble (S.B. 1910, Ph.D. 1917).

Then came Arthur Compton (the “MetLab”).

Followed by Subrahmanyan Chandrasekhar (Morton D. Hull Distinguished Service Professor of Theoretical Astrophysics).

And now, Enrico Fermi.

For historical reasons, astronomers measure the brightness of celestial objects in rank order. The smaller the rank number, aka magnitude, the brighter the object. Thus, a star of the first magnitude is much brighter than a star of the sixth magnitude, and it would take exceptionally good eyes and a very dark sky to see a star of the seventh magnitude. Now, it turns out that the human eye perceives brightness on a log scale, so magnitudes scale as the *log* of the brightness. And because they form a ranking, magnitudes are always measured with reference to a standard. After some rough calibration to match human perception to the true brightness of stars in the night sky, we have a formal definition for magnitude,

$$m = -\frac{5}{2}\log_{10}\left(\frac{f_{object}}{f_{standard}}\right) \,,$$

where *f _{object}* is the flux from the object and *f _{standard}* is the flux of a standard reference. Since flux falls off as the inverse square of the distance, an object of luminosity *L _{object}* at distance *d* has *f _{object} = L _{object}/(4 π d ^{2})*, so equivalently,

$$m = -\frac{5}{2}\log_{10}\left(\frac{L_{object}}{4 \pi d^2}\frac{1}{f_{standard}}\right) \,.$$

Because astronomical objects are located at a vast variety of distances, it is useful to define an intrinsic magnitude of the object, independent of the distance. Thus, in contrast to the *apparent* magnitude *m*, which is the brightness at Earth, an *absolute* magnitude is defined as the brightness that would be perceived if the object were 10 parsecs away,

$$M \equiv m|_{d={\rm 10~pc}} = m - \frac{5}{2}\log_{10}\left(\frac{d^2}{{\rm (10~pc)}^2}\right) \equiv m - 5\log_{10}d + 5 \,,$$

where *d* is the distance to the object in **[parsec]**, and the *d ^{2}* term is of course due to the inverse square law.
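The definitions above translate directly into code. A minimal sketch in Python (the Sun's numbers used below are standard textbook values, included only as a sanity check):

```python
import math

def apparent_magnitude(f_object, f_standard):
    """m = -2.5 log10(f_object / f_standard)."""
    return -2.5 * math.log10(f_object / f_standard)

def absolute_magnitude(m, d_pc):
    """M = m - 5 log10(d) + 5, with d in parsecs."""
    return m - 5.0 * math.log10(d_pc) + 5.0

# A source 100x fainter than the standard is 5 magnitudes fainter:
print(apparent_magnitude(0.01, 1.0))                 # ~5.0

# Sanity check: the Sun, m ~ -26.74 at d = 1 AU ~ 4.848e-6 pc,
# should come out near the textbook M ~ +4.8.
print(round(absolute_magnitude(-26.74, 4.848e-6), 2))
```

Note the counter-intuitive sign convention carried by the rank ordering: a *larger* magnitude means a *fainter* object, which trips up statisticians encountering astronomical data for the first time.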

There are other issues such as interstellar absorption, cosmological corrections, extent of the source, etc., but let’s not complicate it too much right away.

Colors are differences in the magnitudes in different passbands. For instance, if the apparent magnitude in the blue filter is *m _{B}* and in the green filter is *m _{V}*, then the *B−V* color is simply *m _{B} − m _{V}*.

For an excellent description of what is involved in the __measurement__ of magnitudes and colors, see this article on analyzing photometric data by Star Stryder.

And it is a statistical hitch, and involves two otherwise highly reliable and oft used methods giving contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the errors of our ways…

The first time around, Voss & Nelemans (arXiv:0802.2082) looked at how many X-ray sources there were around the candidate progenitor of SN2007on (they also looked at 4 more galaxies that hosted Type Ia SNe and that had X-ray data taken prior to the event, but didn’t find any other candidates), and estimated the probability of chance coincidence with the optical position. When you expect 2.2 X-ray *sources/arcmin ^{2} * near the optical source, the probability of finding one within 1.3 *arcsec* of it purely by chance is very small.
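The chance-coincidence argument is a simple Poisson calculation. A back-of-the-envelope sketch, assuming the match radius in question is the 1.3 arcsec figure above (the exact radius and density the paper used may differ in detail):

```python
import math

# Given a Poisson density of unrelated X-ray sources, what is the probability
# of at least one falling within radius r of the optical position by chance?
density = 2.2 / 3600.0   # sources per arcsec^2 (2.2 per arcmin^2, from the post)
r = 1.3                  # match radius in arcsec (assumed)

lam = density * math.pi * r**2     # expected number of sources in the circle
p_chance = 1.0 - math.exp(-lam)    # P(at least one Poisson event)
print(f"P(chance coincidence) ~ {100 * p_chance:.2f}%")   # ~0.3%
```

Small as advertised, which is why a nearby X-ray source looks so compelling as a progenitor candidate; the tension comes entirely from the positional-error argument below.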

However, Roelofs et al. (arXiv:0802.2097) went about getting better optical positions and doing better bore-sighting, and as a result, they measured the X-ray position accurately and also carried out Monte Carlo simulations to estimate the error on the measured location. And they concluded that the actual separation, given the measurement error in the location, is too large to be a chance coincidence, 1.18±0.27 *arcsec*. The probability ~~that the two locations are the same~~ of finding offsets in the observed range is ~1% [see Tom's clarifying comment below].

Well now, ain’t that a nice pickle?

To recap: there are so few X-ray sources in the vicinity of the supernova that anything close to its optical position cannot be a coincidence, __BUT__, the measured error in the position of the X-ray source is not copacetic with the optical position. So the question for statisticians now: *which argument do you believe?* Or is there a way to reconcile these two calculations?

Oh, and just to complicate matters, the X-ray source that was present 4 years ago had disappeared when looked for in December, as one would expect if it was indeed the progenitor. But on the other hand, a lot of things can happen in 4 years, even with astronomical sources, so that doesn’t really confirm a physical link.

We are organizing a competition specifically targeting the statistics and computer science communities. The challenge is to measure cosmic shear at a level sufficient for future surveys such as the Large Synoptic Survey Telescope. Right now, we’ve stripped out most of the complex observational issues, leaving a pure statistical inference problem. The competition kicks off this summer, but we want to give possible participants a chance to prepare.

The website www.great08challenge.info will provide continual updates on the competition.

Statistically speaking, “false sources” are generally in the domain of ~~Type II~~ **Type I** errors, defined by the probability of detecting a signal where there is none. But what if there is a clear signal, but it is not real?

In astronomical analysis, sources are generally defined with reference to the existing background, as point-fluctuations that exceed some significance threshold defined by the estimated background “in the vicinity”. The threshold is usually set such that we can tolerate “a few” false positives at borderline significance. But that ignores the effect of systematic deviations that can be caused by various instrumental features. Such things are common in X-ray images — window support structures, chip gaps, bad CCD columns, cosmic-ray hits, etc. Optical data are generally cleaner, but by no means immune to the problem. Barron et al. here describe how they have gone through the USNO-B catalog and have modeled and eliminated artifacts coming from diffraction spikes and telescope reflection halos of bright stars.

The bad news? More than 2.3% of the sources are flagged as spurious. Compare to the typical statistical significance at which the detection thresholds are set (usually >3sigma).
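For reference, the per-trial false-positive rate implied by a Gaussian detection threshold is easy to compute, and makes the comparison stark:

```python
import math

def tail_prob(n_sigma):
    """One-sided Gaussian tail probability for an n-sigma detection threshold."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

for n in (3, 4, 5):
    print(f"{n} sigma -> one-sided tail probability {tail_prob(n):.2e}")
# 3 sigma corresponds to ~1.35e-03 per trial, 5 sigma to ~2.9e-07
```

So a 3σ threshold should admit roughly 0.1% false positives per trial from statistical fluctuations alone; a spurious fraction above 2% is over an order of magnitude larger, i.e. dominated by systematic artifacts rather than noise.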

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at http://www.sdss.org/dr6. Additionally, the Catalog Archive Service (CAS) and its SQL interface to the catalog would be useful to data-searching statisticians. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.

Part of my dissertation was about creating nonparametric multivariate analysis tools with convex hull peeling, and I used SDSS DR4 to apply those convex hull peeling tools to explore celestial objects in multidimensional color space without projections (dimension reduction). SDSS CAS might fulfill the needs of those who are looking for data sets on which to conduct

- massive multivariate data analysis,
- streaming data analysis (strictly speaking, SDSS is not streaming, but the database is updated yearly by adding new observations, so, depending on memory, streaming data analysis can easily be simulated), and
- applications of one's new machine learning and statistical multivariate analysis tools for new discoveries.
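The idea of convex hull peeling itself is simple to sketch: repeatedly compute the convex hull of the point cloud and strip off its vertices, so that successive layers give a nonparametric notion of multivariate depth. A minimal 2D illustration using `scipy.spatial.ConvexHull` (toy random data, not the dissertation's actual code; qhull also handles higher dimensions):

```python
import numpy as np
from scipy.spatial import ConvexHull

def convex_hull_peel(points, n_layers):
    """Peel off successive convex hull layers from a point cloud.

    Returns a list of arrays, one per layer, each holding the points that
    sit on that layer's hull. Outermost layer first.
    """
    remaining = np.asarray(points, dtype=float)
    layers = []
    for _ in range(n_layers):
        if len(remaining) < 3:          # a 2D hull needs at least 3 points
            break
        hull = ConvexHull(remaining)
        layers.append(remaining[hull.vertices])
        keep = np.ones(len(remaining), dtype=bool)
        keep[hull.vertices] = False     # discard this layer's vertices
        remaining = remaining[keep]
    return layers

rng = np.random.default_rng(0)
colors = rng.normal(size=(500, 2))       # toy stand-in for a 2D color-color space
layers = convex_hull_peel(colors, 3)
print([len(layer) for layer in layers])  # points on each of the first 3 layers
```

Peeling many layers in gives a robust, projection-free picture of where the bulk of a color distribution lives, with the outermost layers flagging candidate outliers.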

Particularly, thanks to the survey's coverage of the whole northern hemisphere, interesting spatial statistics can be developed, such as Voronoi tessellation for spatial density estimation. It also provides a vast image reservoir as well as a catalog of massive multivariate spatial data.

Oh, by the way, the paper discusses changes and improvements in the recent data release. SDSS DR6 includes the complete imaging of the Northern Galactic Cap and contains images and parameters of 287 million objects over 9583 deg^2, and 1.27 million spectra over 7425 deg^2. The photometric calibration has improved, with uncertainties of 1% in g,r,i and 2% in u, significantly better than in previous data releases. The method of spectrophotometric calibration has changed, resulting in a spectrophotometric scale that is 0.35 mag brighter. Two independent codes for spectral classification and redshifts are available as well.
