Redistribution

vlk — Sat, 01 Nov 2008 16:41:48 +0000

RMF. It is a wørd to strike terror even into the hearts of the intrepid. It refers to the spread in the measured energy of an incoming photon, and even astronomers often stumble over what it is and what it contains. It essentially sets down the measurement error for registering the energy of a photon in the given instrument.

Thankfully, its usage is robustly built into analysis software such as Sherpa or XSPEC, and most people don’t have to deal with the nitty-gritty on a daily basis. But given the profusion of statistical software being written for astronomers, it is perhaps useful to go over what it means.

The Redistribution Matrix File (RMF) is, at its most basic, a description of how the detector responds to incoming photons. It describes the transformation from the photons impinging on the detector to the counts recorded by the instrument electronics. Ideally, there would be a one-to-one mapping between the photon’s incoming energy and the recorded energy, but real detectors are not ideal. The process of measuring the energy introduces a measurement error, which is encoded as the probability that an incoming photon at energy E is read out in detector channel i. Thus, for each energy E, there is an array of probabilities p(i|E) such that the observed counts in channel i,
$$c_i|d_E \sim {\rm Poisson}(p(i|E) \cdot d_E) \,,$$
where d_E is the expected number of counts at energy E, given by the product of the source flux at the telescope and the effective area of the telescope+detector combination. Equivalently, the expected counts in channel i,
$${\rm E}(c_i|d_E) = p(i|E) \cdot d_E \,.$$
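To make the bookkeeping concrete, here is a minimal sketch in Python of pushing expected counts through a redistribution matrix. The 3-energy, 4-channel matrix and the values of d_E are invented for illustration, and the helpers `fold` and `poisson_draw` are hypothetical names, not part of Sherpa or XSPEC. The sketch relies on the fact that a sum of independent Poisson variables is itself Poisson, so when several energies contribute to the same channel, the counts in channel i are Poisson with mean equal to the sum over E of p(i|E)·d_E.

```python
import math
import random

# Hypothetical redistribution matrix: rmf[j][i] = p(i | E_j).
# Each row is one incoming energy, each column one detector channel.
rmf = [
    [0.7, 0.2, 0.1, 0.0],
    [0.1, 0.6, 0.2, 0.1],
    [0.0, 0.1, 0.3, 0.6],
]

# Hypothetical expected counts d_E at each energy
# (source flux times effective area, as in the text).
d = [100.0, 50.0, 20.0]


def fold(rmf, d):
    """Expected counts per channel: sum over j of p(i|E_j) * d_j."""
    n_chan = len(rmf[0])
    return [sum(row[i] * d_j for row, d_j in zip(rmf, d)) for i in range(n_chan)]


def poisson_draw(mean, rng):
    """Draw one Poisson variate with Knuth's method (adequate for modest means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


expected = fold(rmf, d)            # expected counts in each channel
rng = random.Random(2008)          # fixed seed so the draw is repeatable
observed = [poisson_draw(m, rng) for m in expected]

print("expected:", expected)
print("observed:", observed)
```

Because each row of this toy matrix sums to one, the expected counts summed over channels equal the sum of d_E; a real RMF need not be normalized this way.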

The full format of how the arrays p(i|E) are stored in files is described in a HEASARC memo, CAL/GEN/92-002a. Briefly, it is a FITS file with two tables, of which only the first one really matters. This first table (“SPECRESP MATRIX”) contains the energy grid boundaries {E_j; j=1..N_E}, where each entry j corresponds to one set of p(i|E_j). The arrays themselves are stored in compressed form, as the smallest possible array that excludes all the zeros. An ideal detector, where $$p(i|E_j) \equiv \delta_{ij}$$, would be compressed to a matrix of size N_E × 1. The FITS extension also contains additional arrays to help uncompress the matrix, such as the index of the first non-zero element and the number of non-zero elements for each p(i|E_j).
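The uncompression step can be sketched as follows, in plain Python. The argument names mirror the N_GRP, F_CHAN, N_CHAN, and MATRIX columns as I understand the OGIP convention, but the function `expand_row`, the toy values, and the `first_channel` offset are assumptions for illustration. A real file would be read with a FITS library, and whether channels start at 0 or 1 depends on the file.

```python
def expand_row(n_grp, f_chan, n_chan, matrix, n_channels, first_channel=1):
    """Rebuild the full p(i|E_j) array for one energy bin from its compressed form.

    n_grp   : number of contiguous groups of non-zero channels
    f_chan  : first channel of each group
    n_chan  : number of channels in each group
    matrix  : the non-zero probabilities, all groups concatenated
    """
    full = [0.0] * n_channels
    pos = 0  # read position within the compressed values
    for g in range(n_grp):
        start = f_chan[g] - first_channel   # convert channel number to list index
        width = n_chan[g]
        full[start:start + width] = matrix[pos:pos + width]
        pos += width
    return full


# Toy row with two groups of non-zero response in an 8-channel detector.
row = expand_row(
    n_grp=2,
    f_chan=[2, 6],
    n_chan=[3, 2],
    matrix=[0.1, 0.6, 0.2, 0.05, 0.05],
    n_channels=8,
)

# An ideal detector stores a single value per energy: a delta function.
ideal = expand_row(n_grp=1, f_chan=[4], n_chan=[1], matrix=[1.0], n_channels=8)

print(row)
print(ideal)
```

The ideal-detector case shows why the compressed matrix collapses to N_E × 1: each energy needs only one stored value plus the channel it belongs to.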

The second extension (“EBOUNDS”) contains an energy grid {e_i; i=1..N_channels} that maps to the channels i. This grid is fake! Do not use it for anything except display purposes or as a convenient shorthand! It is a mapping of the average detector gain to the true energy, listing the most likely energy of the photons registered in each bin. This grid allows astronomers to specify filters on the spectrum in convenient units that are semi-invariant across instruments (such as [Å] or [keV]) rather than in detector channel numbers, which are unique to each instrument. But keep in mind that this is a convenient fiction, and should never be taken seriously. It is useful when the width of p(i|E) spans only a few channels, and completely useless for lower-resolution detectors.
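As an illustration of the one legitimate use, here is a small sketch that turns a filter given in keV into a list of channels. The five-channel grid is invented, the lists stand in for the CHANNEL, E_MIN, and E_MAX columns that I assume an EBOUNDS table carries, and `channels_in_range` is a hypothetical helper. Note that the result selects detector channels, not photon energies.

```python
# Hypothetical EBOUNDS-like table: nominal energy bounds (keV) per channel.
channel = [1, 2, 3, 4, 5]
e_min = [0.1, 0.5, 1.0, 2.0, 4.0]
e_max = [0.5, 1.0, 2.0, 4.0, 8.0]


def channels_in_range(channel, e_min, e_max, lo, hi):
    """Channels whose nominal energy bin lies entirely within [lo, hi]."""
    return [c for c, a, b in zip(channel, e_min, e_max) if a >= lo and b <= hi]


# "Keep 0.5-4 keV" is shorthand for "keep these channels".
selected = channels_in_range(channel, e_min, e_max, 0.5, 4.0)
print(selected)
```

Photons recorded in the selected channels can still have true energies outside the nominal range, with probabilities given by p(i|E), which is why the grid should be used for filtering and display only.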

Reduced and Processed Data

vlk — Tue, 15 Jul 2008 03:55:09 +0000

Hyunsook recently said that she wished that there were “some astronomical data depositories where no data reduction is required but one can apply various statistical analyses to the data in the depository to learn and compare statistical methods”. With the caveat that there really is no such thing (every dataset will require case-specific reduction; standard processing and reduction are inadequate in all but the simplest of cases), here is a brief list:

The 2 Megasecond Chandra observations of the Southern Deep Field, which have been processed, reduced, and mosaiced. (Coincidentally, just last week Peter Freeman had asked me for a nice dataset on which to try out spatial analysis algorithms, and for which some analysis already existed for people to check their results against. These were the data I recommended.)
HEASARC’s W3Browse gives direct access to archived data from various missions. The data products have been already processed in some standard way.
The Penn State Center for Astrostatistics maintains training datasets of various types.
ADC, as pointed out by Brian.

There are many more, I am sure, and if people find any particularly good ones, please point them out in the comments!

The AstroStat Slog » HEASARC
