The AstroStat Slog » Chandra
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

Everybody needs crampons
http://hea-www.harvard.edu/AstroStat/slog/2010/sherpa-cheat-sheet/
Fri, 30 Apr 2010 16:12:36 +0000, by vlk

Sherpa is a fitting environment in which Chandra data (and really, X-ray data from any observatory) can be analyzed. It has just undergone a major update and now runs on python. Or allows python to run. Something like that. It is a very powerful tool, but I can never remember how to use it, and I have an amazing knack for not finding what I need in the documentation. So here is a little cheat sheet (which I will keep updating as and when I learn more):

2010-apr-30: Aneta has set up a blogspot site to deal with simple Sherpa techniques and tactics: http://pysherpa.blogspot.com/

On Help:

  • In general, to get help, use: ahelp "something" (note the quotes)
  • Even more useful, type: ? wildcard to get a list of all commands that include the wildcard
  • You can also do a form of autocomplete: type TAB after writing half a command to get a list of all possible completions.

Data I/O:

  • To read in your PHA file, use: load_pha()
  • Often for Chandra spectra, the background is included in that same file. In any case, to read it in separately, use: load_bkg()
    • Q: should it be loaded into the same dataID as the source?
    • A: Yes.
    • A: When the background counts are present in the same file, they can still be read in separately and assigned to the background via set_bkg('src',get_data('bkg')), so counts from a different file can likewise be assigned as the background of the current spectrum.
  • To read in the corresponding ARF, use: load_arf()
    • Q: load_bkg_arf() for the background — should it be done before or after load_bkg(), or does it matter?
    • A: does not matter
  • To read in the corresponding RMF, use: load_rmf()
    • Q: load_bkg_rmf() for the background, and same question as above
    • A: same answer as above; does not matter.
  • To see the structure of the data, type: print(get_data()) and print(get_bkg())
  • To select a subset of channels to analyze, use: notice_id()
  • To subtract background from source data, use: subtract()
  • To not subtract, to undo the subtraction, etc., use: unsubtract()
  • To plot useful stuff, use: plot_data(), plot_bkg(), plot_arf(), plot_model(), plot_fit(), etc.
  • (Q: how in god’s name does one avoid plotting those damned error bars? I know error bars are necessary, but when I have a spectrum with 8192 bins, I don’t want it washed out with silly 1-sigma Poisson bars. And while we are asking questions, how do I change the units on the y-axis to counts/bin? A: rumors say that plot_data(1,yerr=0) should do the trick, but it appears to be still in the development version.)

Fitting:

  • To fit model to the data, command it to: fit()
  • To get error bars on the fit parameters, use: projection() (or covar(), but why deliberately use a function that is guaranteed to underestimate your error bars?)
  • Defining models appears to be much easier now. You can use syntax like: set_source(ModelName.ModelID+AnotherModel.ModelID2) (where you can distinguish between different instances of the same type of model using the ModelID — e.g., set_source(xsphabs.abs1*powlaw1d.psrc+powlaw1d.pbkg))
  • To see what the model parameter values are, type: print(get_model())
  • To change statistic, use: set_stat() (options are various chisq types, cstat, and cash)
  • To change the optimization method, use: set_method() (options are levmar, moncar, neldermead, simann, simplex)
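Stitching the commands above together, a minimal end-to-end session might look like the following. This is an illustrative sketch only: it assumes a working CIAO/Sherpa installation, and the filenames (src.pha, src.arf, src.rmf), the noticed energy range, and the model choices are all hypothetical placeholders, not part of the original cheat sheet.

```python
# Illustrative Sherpa session sketch (assumes CIAO/Sherpa; filenames are hypothetical)
from sherpa.astro.ui import *

load_pha("src.pha")        # source spectrum (background counts may be in the same file)
load_arf("src.arf")        # effective area
load_rmf("src.rmf")        # redistribution matrix

notice_id(1, 0.5, 7.0)     # restrict the analysis to a subset of the data
subtract()                 # subtract the background (use unsubtract() to undo)

set_source(xsphabs.abs1 * powlaw1d.psrc)   # absorbed power law, per the syntax above
set_stat("chi2gehrels")
set_method("levmar")

fit()
projection()               # error bars on the fitted parameters
print(get_model())         # inspect the fitted parameter values
```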

Timestamps:
v1:2007-dec-18
v2:2008-feb-20
v3:2010-apr-30

[MADS] Chernoff face
http://hea-www.harvard.edu/AstroStat/slog/2009/mads-chernoff-face/
Thu, 02 Apr 2009 16:00:41 +0000, by hlee

I cannot remember when I first met the Chernoff face, but it hooked me instantly. I always hoped to come across multivariate astronomical data suited to this charming EDA method. Then, somehow, that eagerness faded without my realizing what was happening. Tragically, this was mainly due to my absent-mindedness.

After meeting Prof. Herman Chernoff unexpectedly (I didn't know he is Professor Emeritus at Harvard), the urge revived, but I still didn't have data. Alas, more absent-mindedness: until today I didn't realize that I already had the data, XAtlas, for trying Chernoff faces. The data and their full description can be found on the XAtlas website (click). For Chernoff faces, the references suggested in Wiki:Chernoff face are good. I believe some folks are already familiar with Chernoff faces from a New York Times article last year, listed in the Wiki entry (or a subset characterized by baseball lovers?).

Capella is an X-ray bright star observed multiple times for Chandra calibration. In the figures below I list 16 ObsIDs, one at each face, among the 18+ Capella observations (last time I checked the Chandra Data Archive, 18 Capella observations were available). These 16 are high-resolution observations from which various metrics, such as interesting line ratios and line-to-continuum ratios, can be extracted. I was told that optically it is hard to find any evidence that Capella experienced catastrophic changes during the Chandra mission (about 10 years in orbit), but the story in X-rays could be different. Even in a dismally short time period (10 years for a star is a flash or less), Capella could have revealed short-time-scale high-energy activity via Chandra. I just wanted to illustrate that Chernoff faces could help in visualizing such changes or any peculiarities through interpretation-friendly facial expressions (studies have confirmed babies' ability to recognize facial expressions). So, what do you think? Do the faces look similar or different to you? Can you offer me astronomical reasons why a certain face (ObsID) differs from the rest?

[Figures: Chernoff faces of the Capella observations, drawn with faces() (left) and faces2() (right)]

p.s. To draw these Chernoff faces, check the descriptions of the R functions faces() (which yields the left figure) and faces2() (which yields the right figure) by clicking on the function of your interest. There are other variations, and other data analysis systems offer differently fashioned tools for drawing Chernoff faces to explore multivariate data. Requests for plots in pdf are welcome; these jpeg files look too coarse on my screen.

p.p.s. The variables used for these faces are line ratios and line-to-continuum ratios. The order of the input variables could change a countenance, but the impression a face gives will not change (a face with distinctive shapes will look different from the other faces even after the order of the metrics/variables is scrambled, or under a different Chernoff face drawing tool). The mapping, say from an astronomical metric to the length of the lips, was not studied in this post.

p.p.p.s. Some data points are statistical outliers, and I am not sure how to explain the strange numbers (unrealistic values for line ratios). I hope astronomers can help me interpret those peculiar line/continuum ratios. My role is to show that statistics can motivate astronomers toward new discoveries and to offer different graphics tools for enhancing visualization. I hope these faces motivate some astronomers to look into Capella in XAtlas (and beyond) in detail with different spectacles, and to find the reasons for the different facial expressions across the Capella X-ray observations. ObsID 1199, in particular, is the most questionable to me.

Astroart Survey
http://hea-www.harvard.edu/AstroStat/slog/2008/astroart-survey/
Sun, 02 Nov 2008 12:42:01 +0000, by vlk

Astronomy is known for its pretty pictures, but as Joe the Astronomer would say, those pretty pictures don't make themselves. A lot of thought goes into maximizing scientific content while conveying just the right information, all discernible at a single glance. So the hardworking folks at Chandra want your help in figuring out what works and how well, and they have set up a survey at http://astroart.cfa.harvard.edu/. Take the survey, it is both interesting and challenging!

Go Maroons!
http://hea-www.harvard.edu/AstroStat/slog/2008/go-maroons/
Wed, 27 Aug 2008 11:50:09 +0000, by vlk

UChicago, my alma mater, is doing alright for itself in the spacecraft naming business.

First there was Edwin Hubble (S.B. 1910, Ph.D. 1917).
Then came Arthur Compton (the “MetLab”).
Followed by Subrahmanyan Chandrasekhar (Morton D. Hull Distinguished Service Professor of Theoretical Astrophysics).

And now, Enrico Fermi.

Reduced and Processed Data
http://hea-www.harvard.edu/AstroStat/slog/2008/reduced-data/
Tue, 15 Jul 2008 03:55:09 +0000, by vlk

Hyunsook recently said that she wished that there were “some astronomical data depositories where no data reduction is required but one can apply various statistical analyses to the data in the depository to learn and compare statistical methods”. With the caveat that there really is no such thing (every dataset will require case-specific reduction; standard processing and reduction are inadequate in all but the simplest of cases), here is a brief list:

  1. The 2 Megasecond Chandra observations of the Southern Deep Field, which have been processed, reduced, and mosaiced. (Coincidentally, just last week Peter Freeman had asked me for a nice dataset on which to try out spatial analysis algorithms, and for which some analysis already existed for people to check their results against. These were the data I recommended.)
  2. HEASARC’s W3Browse gives direct access to archived data from various missions. The data products have been already processed in some standard way.
  3. The Penn State Center for Astrostatistics maintains training datasets of various types
  4. ADC, as pointed out by Brian

There are many more, I am sure, and if people find any particularly good ones, please point them out in the comments!

Grating Dispersion [Equation of the Week]
http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-grating-dispersion/
Wed, 11 Jun 2008 17:00:46 +0000, by vlk

High-resolution astronomical spectroscopy has invariably been carried out with gratings. Even with the advent of the new calorimeter detectors, which can measure the energy of incoming photons to an accuracy of as low as 1 eV, gratings are still the preferred setups for hi-res work below energies of 1 keV or so. But how do they work? Where are the sources of uncertainty, statistical or systematic?

The basis of dispersion is Bragg’s Law, which relates the constructive interference pattern of diffraction from a regular structure to the wavelength of the incident photons. If you have photons of wavelength λ incident at an angle θ0, going through a transmission grating (such as the LETGS or the HETGS on Chandra) which has rulings spaced at distances d, the resulting diffracted photons will constructively interfere at dispersion angles θ such that

mλ = d (sinθ – sinθ0) ≈ dδθ


where m=0,±1,±2,…, and for small deviations, we can approximate sinθ≈θ.

Now here’s the neat trick. If the grating is curved, then the photons incident on different parts of it will be diffracted by different angles, and it turns out (long story, lots of geometry, perhaps another time) that (for small changes in the incident angles) these photons are brought to a focus on the opposite side of a circle whose diameter is equal to the radius of curvature D of the grating. This circle is called the Rowland Circle. If a detector is placed on the Rowland circle directly across from the grating, photons of different wavelengths will be dispersed to a focus at a lateral distance Y along the detector, following along the circumference of the Rowland circle:

Y = D (θ – θ0) = D δθ = mλ D/d


Note that for m=0, all the photons come to a focus at the same point Y=0 irrespective of λ. For m≠0, the photons are dispersed to different distances depending on the dispersion order m and the wavelength. Therefore, for any non-zero order, the wavelength of a photon can be estimated simply by measuring the dispersion distance Y:

λ = Y d / (m D)


Some detectors that have rudimentary intrinsic spectral resolution (like ACIS) can separate out the photons based on orders. Others (like HRC) will have all the orders overlapping and inseparable. In such cases, modeling must be done carefully, using response matrices that include multiple orders.

The nominal statistical uncertainty on the estimated wavelength can be written as a combination of the measurement uncertainties in the grating periodicity d and the uncertainty in the photon’s position wrt the 0th-order:

δλ/λ = √( (δd/d)² + (δY/Y)² + (δD/D)² )


Generally, the uncertainties in the grating curvature δD and in the periodicity δd are quite negligible. The major contributor to the wavelength uncertainty is then due to the location, which is composed of two factors — the uncertainty in registering the location on the detector that the photon is deposited (at least ~1 pixel), and the size of the astrophysical source that is being observed, which for point sources is the same as the size of the PSF. The reciprocal of the fractional uncertainty in wavelength (or energy), λ/δλ, is called the Resolving Power. Good spectrometers in the X-ray regime nowadays reach resolving powers of thousands.

As an example, consider the Chandra LETGS, which has a Rowland diameter D=8637 mm and a grating period d=0.99125±0.000086 μm. At, say, 100 Å in the 1st order (i.e., m=±1), δθ ≈ ±0.0100883 rad ≈ ±34.7 arcmin. Then Y=±87.1324 mm, for a plate scale of 1.148 Å/mm. The HRMA on-axis PSF is approximately 5 HRC pixels (0.032 mm) wide, so the resolving power is ≈2700. This simple back-of-the-envelope calculation gives results quite close to the actual values.
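The back-of-the-envelope numbers above are easy to verify. A short script, using only the values quoted in the text, reproduces the dispersion angle, the dispersion distance, the plate scale, and the resolving power:

```python
import math

D = 8637.0        # Rowland diameter [mm]
d = 0.99125e-3    # LETGS grating period [mm] (i.e., 0.99125 um)
lam = 100e-7      # wavelength [mm] (i.e., 100 Angstrom)
m = 1             # 1st order

theta = math.asin(m * lam / d)   # dispersion angle [rad], from m*lam = d*sin(theta)
Y = D * theta                    # dispersion distance along the Rowland circle [mm]
plate_scale = 100.0 / Y          # [Angstrom per mm]
dY = 0.032                       # ~5 HRC pixels [mm], the HRMA on-axis PSF width
R = Y / dY                       # resolving power lam/dlam

print(theta, Y, plate_scale, R)  # ~0.0101 rad, ~87.13 mm, ~1.148 A/mm, ~2700
```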

PS: This was originally a slide in the Gratings talk at the 2nd X-ray Astronomy School, 2002, Berkeley Springs, WV.

Beta Profile [Equation of the Week]
http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-beta-profile/
Wed, 04 Jun 2008 17:00:43 +0000, by vlk

X-ray telescopes generally work by reflecting photons at grazing incidence. As you can imagine, even small imperfections in the mirror polishing will show up as huge roadbumps to the incoming photons, and the higher their energy, the easier it is for them to scatter off their prescribed path. So the PSFs (Point Spread Functions) of X-ray telescopes tend to have sharp peaks and fat tails, compared to the much more well-behaved normal-incidence telescopes, whose PSFs can be better approximated as Gaussians.

X-ray telescopes usually also have gratings that can be inserted into the light path, so that photons of different energies get dispersed by different angles, and whose actual energies can then be inferred accurately by measuring how far away on the detector they ended up. The accuracy of the inference is usually limited by the width of the PSF. Thus, a major contributor to the LRF (Line Response Function) is the aforementioned scattering.

A correct accounting of the spread of photons of course requires a full-fledged response matrix (RMF), but as it turns out, the line profiles can be fairly well approximated with Beta profiles, which are simply Lorentzians modified by taking them to the power β:

f(x) = N [ 1 + ((x−x0)/σ)² ]^(−β) / ( σ B(1/2, β−1/2) )

where B(1/2,β−1/2) is the Beta function, and N is a normalization constant defined such that integrating the Beta profile over the real line gives the area under the curve as N. The parameter β controls the sharpness of the function: the higher the β, the peakier the core gets; the lower the β, the more of the profile gets pushed into the wings. Chandra LRFs are usually well-modeled with β~2.5, and XMM/RGS appears to require Lorentzians, β~1.

The form of the Lorentzian may also be familiar to people as the Cauchy Distribution, which you get for example when the ratio is taken of two quantities distributed as zero-centered Gaussians. Note that the mean and variance are undefined for that distribution.
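The normalization is easy to check numerically. Assuming the profile takes the standard form f(x) = N [1 + ((x−x0)/σ)²]^(−β) / (σ B(1/2, β−1/2)), which is inferred from the definitions in the text (the original equation appeared only as an image), integrating over a wide grid should recover the area N:

```python
import math

def beta_fn(a, b):
    # Beta function via Gamma functions: B(a,b) = G(a)G(b)/G(a+b)
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

def beta_profile(x, x0=0.0, sigma=1.0, beta=2.5, N=1.0):
    norm = N / (sigma * beta_fn(0.5, beta - 0.5))
    return norm * (1.0 + ((x - x0) / sigma) ** 2) ** (-beta)

# midpoint-rule integral over [-50, 50]; for beta=2.5 the tails fall off
# as x**(-5), so the truncated part is negligible
h = 0.01
area = sum(beta_profile(-50.0 + (i + 0.5) * h) for i in range(10000)) * h
print(area)   # ~1.0, i.e. the area under the curve is N
```

For β=1 the normalization B(1/2,1/2)=π reproduces the Cauchy/Lorentzian density exactly, consistent with the remark above.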

Mexican Hat [EotW]
http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-mexican-hat/
Wed, 28 May 2008 17:00:38 +0000, by vlk

The most widely used tool for detecting sources in X-ray images, especially Chandra data, is the wavelet-based wavdetect, which uses the Mexican Hat (MH) wavelet. Now, the MH is not a very popular choice among wavelet aficionados because it does not form an orthonormal basis set (i.e., scale information is not well separated) and does not have compact support (i.e., the function extends to infinity). So why is it used here?

The short answer is, it has a convenient background subtractor built in, is analytically comprehensible, and uses concepts very familiar to astronomers. The last bit can be seen by appealing to Gaussian smoothing. Astronomers are (or were) used to smoothing images with Gaussians, and in a manner of speaking, all astronomical images already come presmoothed by PSFs (point spread functions) that are nominally approximated by Gaussians. Now, if an image were smoothed by another Gaussian of a slightly larger width, the difference between the two smoothed images should highlight those features which are prominent at the spatial scale of the larger Gaussian. This is the basic rationale behind a wavelet.

So, in the following, G(x,y;σx,σy,xo,yo) is a 2D Gaussian written such that the scaling of the widths and the transposition of the function are made obvious. It is defined over the real plane x,y ∈ R², for widths σx,σy. The Mexican Hat wavelet MH(x,y;σx,σy,xo,yo) is generated as the difference between two Gaussians of different widths, which essentially boils down to taking partial derivatives of G(σx,σy) wrt the widths. To be sure, these must really be thought of as operators where the functions are correlated with a data image, so the derivatives must be carried out inside an integral, but I am skipping all that for the sake of clarity. Also note, the MH is sometimes derived as the second derivative of G(x,y), the spatial derivatives that is.

[Figure: the Mexican Hat wavelet equations, derived as width-derivatives of the 2D Gaussian, together with its Fourier Transform]

The integral of the MH over R² results in the positive bump and the negative annulus canceling each other out, so there is no unambiguous way to set its normalization. And finally, the Fourier Transform shows which spatial scales (kx,ky are wavenumbers) are enhanced or filtered during a correlation.
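The zero-integral property is simple to verify numerically. Assuming the isotropic form MH(x,y) ∝ (2 − r²/σ²) exp(−r²/2σ²), obtained by differentiating the 2D Gaussian with respect to a single width σ (the normalization constant here is just one convention, since, as noted above, there is no unambiguous way to set it), a grid sum over a wide box should vanish:

```python
import math

def mexican_hat(x, y, sigma=1.0):
    # proportional to -sigma * d/d(sigma) of the 2D Gaussian G(x,y;sigma):
    # a positive bump at the origin surrounded by a negative annulus
    r2 = (x * x + y * y) / (sigma * sigma)
    return (2.0 - r2) * math.exp(-r2 / 2.0) / (2.0 * math.pi * sigma * sigma)

# midpoint-rule sum over [-8, 8]^2 (in units of sigma)
h = 0.1
grid = [-8.0 + (i + 0.5) * h for i in range(160)]
total = sum(mexican_hat(x, y) for x in grid for y in grid) * h * h
print(total)   # ~0: the bump and the moat cancel
```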

Background Subtraction [EotW]
http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-background-subtraction/
Wed, 21 May 2008 17:00:32 +0000, by vlk

There is a lesson that statisticians, especially of the Bayesian persuasion, have been hammering into our skulls for ages: do not subtract background. Nevertheless, old habits die hard, and old codes die harder. Such is the case with X-ray aperture photometry.

When C counts are observed in a region of the image that overlaps a putative source, and B counts in an adjacent, non-overlapping region that is mostly devoid of the source, the question asked is: what is the intensity of a source that might exist in the source region, given that there is also background? Let us say that the source has intensity s, and the background has intensity b in the first region. Further, let a fraction f of the source overlap that region, and a fraction g overlap the adjacent, “background” region. Then, if the area of the background region is r times larger, we can solve for s and b and even determine the errors:

C = f·s + b
B = g·s + r·b

⇒  s = (rC − B) / (rf − g) ,  b = (fB − gC) / (rf − g)

δs² = (r²·δC² + δB²) / (rf − g)² ,  δb² = (f²·δB² + g²·δC²) / (rf − g)² ,
with δC² = C and δB² = B for Poisson counts.

Note that the regions do not have to be circular, nor does the source have to be centered in it. As long as the PSF fractions f and g can be calculated, these formulae can be applied. In practice, f is large, typically around 0.9, and the background region is chosen as an annulus centered on the source region, with g~0.
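These estimators take a few lines of plain Python (variable names follow the text: C and B are the observed counts, f and g the PSF fractions, r the area ratio; the model C = f·s + b, B = g·s + r·b and the Poisson variances δC² = C, δB² = B are as described above):

```python
import math

def aperture_photometry(C, B, f=0.9, g=0.0, r=1.0):
    """Solve C = f*s + b and B = g*s + r*b for the source and
    background intensities s, b, propagating errors by moments."""
    det = r * f - g
    s = (r * C - B) / det
    b = (f * B - g * C) / det
    ds = math.sqrt(r * r * C + B) / det        # Poisson: var(C)=C, var(B)=B
    db = math.sqrt(f * f * B + g * g * C) / det
    return s, b, ds, db

# sanity check on noiseless synthetic counts: s=100, b=20, f=0.9, g=0, r=4
# gives C = 0.9*100 + 20 = 110 and B = 4*20 = 80
s, b, ds, db = aperture_photometry(C=110.0, B=80.0, f=0.9, g=0.0, r=4.0)
print(s, b)   # recovers s=100.0, b=20.0
```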

It always comes as a shock to statisticians, but this is not ancient history. We still determine maximum likelihood estimates of source intensities by subtracting out an estimated background and propagate error by the method of moments. To be sure, astronomers are well aware that these formulae are valid only in the high counts regime ( s,C,B>>1, b>0 ) and when the source is well defined ( f~1, g~0 ), though of course it doesn’t stop them from pushing the envelope. This, in fact, is the basis of many standard X-ray source detection algorithms (e.g., celldetect).

Furthermore, it might come as a surprise to many astronomers, but this is also the rationale behind the widely-used wavelet-based source detection algorithm, wavdetect. The Mexican Hat wavelet used with it has a central positive bump, surrounded by a negative annular moat, which is a dead ringer for the source and background regions used here. The difference is that the source intensity is not deduced from the wavelet correlations, and the signal-to-noise ratio ( s/δs ) is not used to determine source significance; rather, extensive simulations are used to calibrate it.

Did they, or didn’t they?
http://hea-www.harvard.edu/AstroStat/slog/2008/type1a-progenitor/
Tue, 20 May 2008 04:10:23 +0000, by vlk

Earlier this year, Peter Edmonds showed me a press release that the Chandra folks were, at the time, considering putting out, describing the possible identification of a Type Ia supernova progenitor. What appeared to be an accreting white dwarf binary system could be discerned in 4-year-old observations, coincident with the location of a supernova that went off in November 2007 (SN2007on). An amazing discovery, but there is a hitch.

And it is a statistical hitch, involving two otherwise highly reliable and oft-used methods giving contradictory answers at nearly the same significance level! Does this mean that the chances are actually 50-50? Really, we need a bona fide statistician to take a look and point out the errors of our ways.

The first time around, Voss & Nelemans (arXiv:0802.2082) looked at how many X-ray sources there were around the candidate progenitor of SN2007on (they also looked at 4 more galaxies that hosted Type Ia SNe and that had X-ray data taken prior to the event, but didn’t find any other candidates), and estimated the probability of chance coincidence with the optical position. When you expect 2.2 X-ray sources/arcmin² near the optical source, the probability of finding one within 1.3 arcsec is tiny, and in fact is around 0.3%. This result has since been reported in Nature.
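The quoted 0.3% is easy to reproduce. Treating the X-ray sources as a 2D Poisson process with density 2.2 sources per square arcmin, the chance of at least one source falling within 1.3 arcsec of an arbitrary position is 1 − exp(−λπr²):

```python
import math

density = 2.2                # expected X-ray sources per square arcmin
radius = 1.3 / 60.0          # 1.3 arcsec expressed in arcmin
expected = density * math.pi * radius ** 2   # mean sources within the circle
p_chance = 1.0 - math.exp(-expected)         # P(at least one chance source)
print(p_chance)   # ~0.0032, i.e. around 0.3%
```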

However, Roelofs et al. (arXiv:0802.2097) went about getting better optical positions and doing better bore-sighting, and as a result they measured the X-ray position accurately and also carried out Monte Carlo simulations to estimate the error on the measured location. And they concluded that the actual separation, 1.18±0.27 arcsec, is too large, given the measurement error in the location, for the X-ray and optical sources to be coincident. The probability of finding offsets in the observed range is ~1% [see Tom's clarifying comment below].

Well now, ain’t that a nice pickle?

To recap: there are so few X-ray sources in the vicinity of the supernova that anything close to its optical position cannot be a coincidence, BUT, the measured error in the position of the X-ray source is not copacetic with the optical position. So the question for statisticians now: which argument do you believe? Or is there a way to reconcile these two calculations?

Oh, and just to complicate matters, the X-ray source that was present 4 years ago had disappeared when looked for in December, as one would expect if it was indeed the progenitor. But on the other hand, a lot of things can happen in 4 years, even with astronomical sources, so that doesn’t really confirm a physical link.
