The AstroStat Slog » Optical
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

[AAS-HEAD 2011] Time Series in High Energy Astrophysics Fri, 09 Sep 2011 17:05:33 +0000 vlk We organized a Special Session on Time Series in High Energy Astrophysics: Techniques Applicable to Multi-Dimensional Analysis on Sep 7, 2011, at the AAS-HEAD conference at Newport, RI. The talks presented at the session are archived at

A tremendous amount of information is contained within the temporal variations of various measurable quantities, such as the energy distributions of the incident photons, the overall intensity of the source, and the spatial coherence of the variations. While the detection and interpretation of periodic variations is well studied, the same cannot be said for non-periodic behavior in a multi-dimensional domain. Methods to deal with such problems are still primitive, and any attempts at sophisticated analyses are carried out on a case-by-case basis. Some of the issues we seek to focus on are:
* Stochastic variability
* Chaotic and quasi-periodic variability
* Irregular data gaps/unevenly sampled data
* Multi-dimensional analysis
* Transient classification

Our goal is to present some basic questions that require sophisticated temporal analysis in order for progress to be made. We plan to bring together astronomers and statisticians who are working in many different subfields so that an exchange of ideas can occur to motivate the development of sophisticated and generally applicable algorithms to astronomical time series data. We will review the problems and issues with current methodology from an algorithmic and statistical perspective and then look for improvements or for new methods and techniques.

mini-Workshop on Computational AstroStatistics [announcement] Mon, 21 Jun 2010 16:25:31 +0000 chasc mini-Workshop on Computational Astro-statistics: Challenges and Methods for Massive Astronomical Data
Aug 24-25, 2010
Phillips Auditorium, CfA,
60 Garden St., Cambridge, MA 02138


The California-Boston-Smithsonian Astrostatistics Collaboration plans to host a mini-workshop on Computational Astro-statistics. With the advent of new missions like the Solar Dynamics Observatory (SDO), the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS), and the Large Synoptic Survey Telescope (LSST), astronomical data collection is fast outpacing our capacity to analyze it. Astrostatistical effort has generally focused on principled analysis of individual observations, on one or a few sources at a time. But the new era of data intensive observational astronomy forces us to consider combining multiple datasets and infer parameters that are common to entire populations. Many astronomers really want to use every data point and even non-detections, but this becomes problematic for many statistical techniques.

The goal of the Workshop is to explore new problems in Astronomical data analysis that arise from data complexity. Our focus is on problems that have generally been considered intractable due to insufficient computational power or inefficient algorithms, but are now becoming tractable. Examples of such problems include: accounting for uncertainties in instrument calibration; classification, regression, and density estimation for massive data sets that may be truncated and contaminated with measurement errors and outliers; and designing statistical emulators to efficiently approximate the output from complex astrophysical computer models and simulations, thus making statistical inference on them tractable. We aim to present some issues to the statisticians and clarify difficulties with the currently used methodologies, e.g. MCMC methods. The Workshop will consist of review talks on current Statistical methods by Statisticians, descriptions of data analysis issues by astronomers, and open discussions between Astronomers and Statisticians. We hope to define a path for development of new algorithms that target specific issues, designed to help with applications to SDO, Pan-STARRS, LSST, and other survey data.

We hope you will be able to attend the workshop and present a brief talk on the scope of the data analysis problem that you confront in your project. The workshop will have presentations in the morning sessions, followed by a discussion session in the afternoons of both days.

SDO launched Thu, 11 Feb 2010 19:04:00 +0000 vlk The Solar Dynamics Observatory, which promises a flood of data on the Sun, was launched today from Cape Canaveral.

Killer App Sun, 19 Oct 2008 03:27:19 +0000 vlk The iPhone is an amazing device. I have heard that some people use it as a phone, too, but it really is an extraordinary portable computer. It is faster and more powerful than the SPARCstations I used as a grad student, and it will fit into your pocket. And most importantly, you can fit an entire planetarium on it.

There are many good planetarium programs that you can access on laptops, but it is really not that much fun to lug them around on camping trips or even out on to the roof at night. But now, thanks to the iPhone (and the iPod Touch) there has been a great leap forward.

The iTunes AppStore now has a number of astronomy-themed apps, including apps that tell you the distance to the Moon correct to a meter. But the most impressive of the lot have to be the ones that produce skycharts and let you search for and find stars, constellations, and deep sky objects at any time, from anywhere. There are four such apps available now: Starmap, GoSkyWatch, iAstronomica, and iStellar.

I have only tried Starmap so far, and it is incredible. The developer says that there is a PRO version in the works, but this one is already plenty good for me.

It is quite well known that, unlike amateur astronomers, professional astronomers are quite ignorant of the night sky. Really, if someone turns us around to face North, we might figure out where Polaris is, but that’s it. Oh, and we can usually find the Moon. And in the daytime we can point to where the Sun is, provided it is not cloudy, which it often is in New England. True story: I still haven’t set eyes on the star which formed the basis of my PhD thesis (α Trianguli Australis; in my defence, it is only visible from the southern hemisphere). But all that is in the past. Now I can rediscover my amateur roots; now I am feeling pretty confident that I can find anything, even dear old α TrA. All I need to do is cross the Equator and point with my tricorder.

Go Maroons! Wed, 27 Aug 2008 11:50:09 +0000 vlk UChicago, my alma mater, is doing alright for itself in the spacecraft naming business.

First there was Edwin Hubble (S.B. 1910, Ph.D. 1917).
Then came Arthur Compton (the “MetLab”).
Followed by Subrahmanyan Chandrasekhar (Morton D. Hull Distinguished Service Professor of Theoretical Astrophysics).

And now, Enrico Fermi.

Magnitude [Eqn] Wed, 20 Aug 2008 17:00:34 +0000 vlk I still remember my first class as a new grad student. As a cocky Physics graduate, I was quite sure I knew plenty of astronomy. Astro 301, class 1, and it took all of 20 minutes of talk about stellar magnitudes to put that notion to permanent rest. So, for the sake of our stats colleagues, here’s a brief primer on one of the basic building blocks of astronomy.

For historical reasons, astronomers measure the brightness of celestial objects in rank order. The smaller the rank number, aka magnitude, the brighter the object. Thus, a star of the first magnitude is much brighter than a star of the sixth magnitude, and it would take exceptionally good eyes and a very dark sky to see a star of the seventh magnitude. Now, it turns out that the human eye perceives brightness on a log scale, so magnitudes are numerically similar to log(brightness). And because magnitudes are a ranking, they are always defined with reference to a standard. After some rough calibration to match human perception to the true brightness of stars in the night sky, we have a formal definition for magnitude,
$$m = -\frac{5}{2}\log_{10}\left(\frac{f_{object}}{f_{standard}}\right) \,,$$
where f_object is the flux from the object and f_standard is the flux from a fiducial standard. In the optical bands, the bright star Vega (α Lyrae) has been adopted as the standard, and has magnitudes of 0 in all optical filters. (Well, not exactly, because Vega is not constant enough, and as a practical matter there is nowadays a hierarchy of photometric standard stars that are accessible at different parts of the sky.) Note that we can also write this in terms of the intrinsic luminosity L_object of the object and the distance d to it,
$$m = -\frac{5}{2}\log_{10}\left(\frac{L_{object}}{4 \pi d^2}\frac{1}{f_{standard}}\right) \,.$$
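To make the definition concrete, here is a minimal numerical sketch in plain Python (the function name is mine, not a standard one):

```python
import math

def apparent_magnitude(f_object, f_standard):
    """Pogson's relation: m = -(5/2) * log10(f_object / f_standard)."""
    return -2.5 * math.log10(f_object / f_standard)

# An object 100x brighter than the standard is 5 magnitudes brighter
# (i.e. numerically smaller): apparent_magnitude(100.0, 1.0) -> -5.0
```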

Because astronomical objects are located at a vast variety of distances, it is useful to define an intrinsic magnitude of the object, independent of the distance. Thus, in contrast to the apparent magnitude m, which is the brightness at Earth, an absolute magnitude is defined as the brightness that would be perceived if the object were 10 parsecs away,
$$M \equiv m|_{d={\rm 10~pc}} = m - \frac{5}{2}\log_{10}\left(\frac{d^2}{{\rm (10~pc)}^2}\right) \equiv m - 5\log_{10}d + 5$$
where d is the distance to the object in [parsec], and the squared term is of course because of the inverse square law.
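The apparent-to-absolute conversion above translates directly into code; a small sketch (hypothetical helper, distances in parsec):

```python
import math

def absolute_magnitude(m, d_pc):
    """M = m - 5*log10(d) + 5, with d in parsec."""
    return m - 5.0 * math.log10(d_pc) + 5.0

# At d = 10 pc the apparent and absolute magnitudes coincide:
# absolute_magnitude(5.0, 10.0) -> 5.0
```

For Vega, at roughly 7.7 pc with m ≈ 0, this gives M ≈ 0.6.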

There are other issues such as interstellar absorption, cosmological corrections, extent of the source, etc., but let’s not complicate it too much right away.

Colors are differences between the magnitudes in different passbands. For instance, if the apparent magnitude in the blue filter is m_B and in the green filter is m_V (V for “visual”), the color is m_B-m_V and is usually referred to as the “B-V” color. It is a difference in magnitudes, and is thus related to the log ratio of the intensities.
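In code, a color index is just that magnitude difference, i.e. -2.5 times the log of the flux ratio (zero-point offsets between filters are ignored in this sketch):

```python
import math

def color_index(f_blue, f_visual):
    # B-V = m_B - m_V = -2.5 * log10(f_B / f_V); positive means fainter
    # in the blue, i.e. a redder object
    return -2.5 * math.log10(f_blue / f_visual)
```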

For an excellent description of what is involved in the measurement of magnitudes and colors, see this article on analyzing photometric data by Star Stryder.

Did they, or didn’t they? Tue, 20 May 2008 04:10:23 +0000 vlk Earlier this year, Peter Edmonds showed me a press release that the Chandra folks were, at the time, considering putting out, describing the possible identification of a Type Ia Supernova progenitor. What appeared to be an accreting white dwarf binary system could be discerned in 4-year-old observations, coincident with the location of a supernova that went off in November 2007 (SN2007on). An amazing discovery, but there is a hitch.

And it is a statistical hitch, and it involves two otherwise highly reliable and oft-used methods giving contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the errors of our ways…

The first time around, Voss & Nelemans (arXiv:0802.2082) looked at how many X-ray sources there were around the candidate progenitor of SN2007on (they also looked at 4 more galaxies that hosted Type Ia SNe and had X-ray data taken prior to the event, but didn’t find any other candidates), and estimated the probability of chance coincidence with the optical position. When you expect 2.2 X-ray sources/arcmin^2 near the optical source, the probability of finding one within 1.3 arcsec is tiny, and in fact is around 0.3%. This result has since been reported in Nature.
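That chance-coincidence number can be roughly reproduced with a simple Poisson argument; this sketch assumes a uniform source density, which is surely a simplification of what Voss & Nelemans actually did:

```python
import math

density = 2.2                        # X-ray sources per arcmin^2
radius_arcmin = 1.3 / 60.0           # 1.3 arcsec matching radius, in arcmin
lam = density * math.pi * radius_arcmin**2   # expected count in the circle
p_chance = 1.0 - math.exp(-lam)      # P(at least one source by chance)
# p_chance ~ 0.003, i.e. around the quoted 0.3%
```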

However, Roelofs et al. (arXiv:0802.2097) went about getting better optical positions and doing better bore-sighting, and as a result they measured the X-ray position accurately and also carried out Monte Carlo simulations to estimate the error on the measured location. And they concluded that the actual separation, given the measurement error in the location, is too large to be a chance coincidence: 1.18±0.27 arcsec. The probability of finding offsets in the observed range is ~1% [see Tom's clarifying comment below].

Well now, ain’t that a nice pickle?

To recap: there are so few X-ray sources in the vicinity of the supernova that anything close to its optical position cannot be a coincidence, BUT, the measured error in the position of the X-ray source is not copacetic with the optical position. So the question for statisticians now: which argument do you believe? Or is there a way to reconcile these two calculations?

Oh, and just to complicate matters, the X-ray source that was present 4 years ago had disappeared when looked for in December, as one would expect if it was indeed the progenitor. But on the other hand, a lot of things can happen in 4 years, even with astronomical sources, so that doesn’t really confirm a physical link.

The GREAT08 Challenge Fri, 29 Feb 2008 03:46:49 +0000 vlk Grand statistical challenges seem to be all the rage nowadays. Following on the heels of the Banff Challenge (which dealt with figuring out how to set the bounds for the signal intensity that would result from the Higgs boson) comes the GREAT08 Challenge (arxiv/0802.1214) to deal with one of the major issues in observational Cosmology, the effect of dark matter. As Douglas Applegate puts it:

We are organizing a competition specifically targeting the statistics and computer science communities. The challenge is to measure cosmic shear at a level sufficient for future surveys such as the Large Synoptic Survey Telescope. Right now, we’ve stripped out most of the complex observational issues, leaving a pure statistical inference problem. The competition kicks off this summer, but we want to give possible participants a chance to prepare.

The website will provide continual updates on the competition.

Spurious Sources Wed, 19 Sep 2007 18:21:57 +0000 vlk [arXiv:0709.2358] Cleaning the USNO-B Catalog through automatic detection of optical artifacts, by Barron et al.

Statistically speaking, “false sources” are generally in the domain of Type I errors, defined by the probability of detecting a signal where there is none. But what if there is a clear signal, but it is not real?

In astronomical analysis, sources are generally defined with reference to the existing background, as point-fluctuations that exceed some significance threshold defined by the estimated background “in the vicinity”. The threshold is usually set such that we can tolerate “a few” false positives at borderline significance. But that ignores the effect of systematic deviations that can be caused by various instrumental features. Such things are common in X-ray images — window support structures, chip gaps, bad CCD columns, cosmic-ray hits, etc. Optical data are generally cleaner, but by no means immune to the problem. Barron et al. here describe how they have gone through the USNO-B catalog and have modeled and eliminated artifacts coming from diffraction spikes and telescope reflection halos of bright stars.

The bad news? More than 2.3% of the sources are flagged as spurious. Compare that to the typical statistical significance at which detection thresholds are set (usually >3σ).
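To put those two numbers side by side: the one-sided Gaussian tail probability at 3σ is about 0.135% per trial, more than an order of magnitude below the spurious fraction found here. A quick sketch:

```python
import math

def gaussian_tail(n_sigma):
    # one-sided upper-tail probability of a standard normal
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

# gaussian_tail(3.0) -> ~0.00135, versus the ~2.3% spurious-source
# fraction flagged by Barron et al.
```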

[ArXiv] SDSS DR6, July 23, 2007 Wed, 25 Jul 2007 17:46:38 +0000 hlee From arxiv/astro-ph:0707.3413
The Sixth Data Release of the Sloan Digital Sky Survey by … many people …

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at
Additionally, the Catalog Archive Service (CAS) and its SQL interface to the catalog would be useful to data-searching statisticians. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.
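As an illustration of the kind of query meant here, a sketch of a CAS-style SQL selection (the PhotoObj table and the u, g, r, i, z magnitude columns follow the SDSS SkyServer schema as I recall it; check the CAS documentation for the exact names before running anything):

```python
# Hypothetical SkyServer/CAS query: a 1 deg x 1 deg patch, bright-ish objects.
query = """
SELECT TOP 100 objID, ra, dec, u, g, r, i, z
FROM PhotoObj
WHERE ra BETWEEN 180.0 AND 181.0
  AND dec BETWEEN 0.0 AND 1.0
  AND r < 19.0
"""
```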

Part of my dissertation was about creating nonparametric multivariate analysis tools with convex hull peeling, and I used SDSS DR4 to apply those convex hull peeling tools to explore celestial objects in multidimensional color space without projections (dimension reduction). The SDSS CAS might fulfill the needs of those who are looking for data sets to conduct

  • massive multivariate data analysis,
  • streaming data analysis (strictly speaking, SDSS is not streaming, but the database is updated yearly with new observations, so depending on memory, streaming data analysis can easily be simulated), and
  • applications of new machine learning and statistical multivariate analysis tools for new discoveries.
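The convex hull peeling mentioned above can be sketched in a few lines; this 2-D toy version (Andrew's monotone chain plus repeated peeling) is my own illustration, not the dissertation's code:

```python
def convex_hull(points):
    """Andrew's monotone chain on 2-D tuples; keeps collinear boundary points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            # pop only on a strict right turn, so collinear points stay
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) < 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    # lower hull plus upper hull (built over the reversed point order)
    return half(pts) + half(pts[::-1])

def hull_peel(points, depth):
    """Strip `depth` convex-hull layers; the survivors form the deeper core."""
    pts = list(points)
    for _ in range(depth):
        boundary = set(convex_hull(pts))
        pts = [p for p in pts if p not in boundary]
        if not pts:
            break
    return pts
```

Peeling a 5x5 grid once removes the 16 boundary points and leaves the 3x3 core, which is the sense in which successive hull layers act as a multivariate analogue of data depth.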

Particularly, thanks to the survey's coverage of the whole northern hemisphere, interesting spatial statistics can be developed, such as Voronoi tessellation for spatial density estimation. It also provides a vast image reservoir as well as a catalog of massive multivariate spatial data.

Oh, by the way, the paper discusses changes and improvements in the recent data release. The SDSS DR6 includes the complete imaging of the Northern Galactic Cap and contains images and parameters of 287 million objects over 9583 deg^2, and 1.27 million spectra over 7425 deg^2. The photometric calibration has improved, with uncertainties of 1% in g, r, i and 2% in u, significantly better than in previous data releases. The method of spectrophotometric calibration has changed, resulting in a spectrophotometric scale about 0.35 mag brighter. Two independent codes for spectral classifications and redshifts are available as well.
