URL: http://heawww.harvard.edu/AstroStat/CAS2010
]]>The CaliforniaBostonSmithsonian Astrostatistics Collaboration plans to host a miniworkshop on Computational Astrostatistics. With the advent of new missions like the Solar Dynamic Observatory (SDO), Panoramic Survey and Rapid Response (PanSTARRS) and Large Synoptic Survey (LSST), astronomical data collection is fast outpacing our capacity to analyze them. Astrostatistical effort has generally focused on principled analysis of individual observations, on one or a few sources at a time. But the new era of data intensive observational astronomy forces us to consider combining multiple datasets and infer parameters that are common to entire populations. Many astronomers really want to use every data point and even nondetections, but this becomes problematic for many statistical techniques.
The goal of the Workshop is to explore new problems in Astronomical data analysis that arise from data complexity. Our focus is on problems that have generally been considered intractable due to insufficient computational power or inefficient algorithms, but are now becoming tractable. Examples of such problems include: accounting for uncertainties in instrument calibration; classification, regression, and density estimations of massive data sets that may be truncated and contaminated with measurement errors and outliers; and designing statistical emulators to efficiently approximate the output from complex astrophysical computer models and simulations, thus making statistical inference on them tractable. We aim to present some issues to the statisticians and clarify difficulties with the currently used methodologies, e.g. MCMC methods. The Workshop will consist of review talks on current Statistical methods by Statisticians, descriptions of data analysis issues by astronomers, and open discussions between Astronomers and Statisticians. We hope to define a path for development of new algorithms that target specific issues, designed to help with applications to SDO, PanSTARRS, LSST, and other survey data.
We hope you will be able to attend the workshop and present a brief talk on the scope of the data analysis problem that you confront in your project. The workshop will have presentations in the morning sessions, followed by a discussion session in the afternoons of both days.
The biggest challenge for a statistician to use astronomical data is the lack of mercy for nonspecialists’ accessing data including format, quantification, and qualification^{[2]} ; and data analysis systems. IDL is costly although it is used in many disciplines and other tools in astronomy are hardly utilized for different projects.^{[3]} In that regards, I welcome astronomers using python to break such exclusiveness in astronomical data analysis systems.
Even if data and software issues are resolved, there’s another barrier to climb. Validation. If you have a catalog, you’ll see variables of measures, and their errors typically reflecting the size of PSF and its convolution to those metrics. If a model of gaussian assumption applied, in order to tabulate power law index, King’s, Petrosian’s, or de Vaucouleurs’ profile index, and numerous metrics, I often fail to find any validation of gaussian assumptions, gaussian residuals, spectral and profile models, outliers, and optimal binning. Even if a data set is publicly available, I also fail to find how to read in raw data, what factors must be considered, and what can be discarded because of unexpected contamination occurred like cosmic rays and charge over flows. How would I validate raw data that are read into a data analysis system is correctly processed to match values in catalogs? How would I know all entries in catalog are ready for further scientific data analysis? Are those sources real? Is pvalue appropriately computed?
I posted an article about Chernoff faces applied to Capella observations from Chandra. Astronomers already processed the raw data and published a catalog of Xray spectra. Therefore, I believe that the information in the catalog is validated and ready to be used for scientific data analysis. I heard that repeated Capella observation is for the calibration. Generally speaking, in other fields, targets for calibration are almost time invariant and exhibit consistency. If Capella is a same star over the 10 years, the faces in my post should look almost same, within measurement error; but as you saw, it was not consistent at all. Those faces look like observations were made toward different objects. So far I fail to find any validation efforts, explaining why certain ObsIDs of Capella look different than the rest. Are they real Capella? Or can I use this inconsistent facial expression as an evidence that Chandra calibration at that time is inappropriate? Or can I conclude that Capella was a wrong choice for calibration?
Due to the lack of quantification procedure description from the raw data to the catalog, what I decided to do was accessing the raw data and data processing on my own to crosscheck the validity in the catalog entries. The benefit of this effort is that I can easily manipulate data for further statistical inference. Although reading and processing raw data may sound easy, I came across another problem, lack of documentation for nonspecialists to perform the task.
A while ago, I talked about read.table() in R. There are slight different commands and options but without much hurdle, one can read in ascii data in various styles easily with read.table() for exploratory data analysis and confirmatory data analysis with R. From my understanding, statisticians do not spend much time on reading in data nor collecting them. We are interested in methodology to extract information of the population based on sample. While the focus is methodology, all the frustrations with astronomical data analysis softwares occur prior to investigating the best method. The level of frustration reached to the extend of terminating my eagerness for more investigation about inference tools.
In order to assess those Capella observations, thanks to its onsite help, I evoke ciao. Beforehand, I’d like to disclaim that I exemplify ciao to illustrate the culture difference that I experienced as a statistician. It was used to discuss why I think that astronomical data analysis systems are short of documentations and why that astronomical data processing procedures are lack of validation. I must say that I confront very similar problems when I tried to learn astronomical packages such as IRAF and AIPS. Ciao happened to be at handy when writing this post.
In order to understand Xray data, not only image data files, one also needs effective area (arf), redistribution matrix (rmf), and point spread function (psf). These files are called by calibration data files. If the package was developed for general users, like read.table() I expect there should be a homogenized/centralized data including calibration data reading function with options. Instead, there were various kinds of functions one can use to read in data but the description was not enough to know which one is doing what. What is the functionality of these commands? Which one only stores names of data file? Which one reconfigures the raw data reflecting up to date calibration file? Not knowing complete data structures and classes within ciao, not getting the exact functionality of these data reading functions from ahelp, I was not sure the log likelihood that I computed is appropriate or not.
For example, there are five different ways to associate an arf: read_arf(), load_arf(), set_arf(), get_arf(), and unpack_arf() from ciao. Except unpack_arf(), I couldn’t understand the difference among these functions for accessing an arf^{[4]} Other softwares including XSPEC that I use, in general, have a single function with options to execute different level of reading in data. Ciao has an extensive web documentation without a tutorial (see my post). So I read all ahelp “commands” a few times. But I still couldn’t decide which one to use for my work to read in arfs and rmfs (I happened to have many calibration data files).
arf  rmf  psf  pha  data  

get  get_arf  get_rmf  get_psf  get_pha  get_data 
set  set_arf  set_rmf  set_psf  set_pha  set_data 
unpack  unpack_arf  unpack_rmf  unpack_psf  unpack_pha  unpack_data 
load  load_arf  load_rmf  load_psf  load_pha  load_data 
read  read_arf  read_rmf  read_psf  read_pha  read_data 
[Note that above links may not work since ciao documentation website evolves quickly. Some might be routed to different links so please, check this website for other data reading commands: cxc.harvard.edu/sherpa/ahelp/index_alphabet.html].
So, I decide to seek for a help through cxc help desk several months back. Their answers are very reliable and prompt. My question was “what are the difference among read_xxx(), load_xxx(), set_xxx(), get_xxx(), and unpack_xxx(), where xxx can be data, arf, rmf, and psf?” The answer to this question was that
You can find detailed explanations for these Sherpa commands in the “ahelp” pages of the Sherpa website:
http://cxc.harvard.edu/sherpa/ahelp/index_alphabet.html
This is a good answer but a big cultural shock to a statistician. It’s like having an answer like “check http://www.rproject.org/search.html and http://cran.rproject.org/doc/FAQ/RFAQ.html” for IDL users to find out the difference between read.table() and scan(). Probably, for astronomers, all above various data reading commands are self explanatory like R having read.table(), read.csv(), and scan(). Disappointingly, this answer was not I was looking for.
Well, thanks to this embezzlement, hesitation, and some skepticism, I couldn’t move to the next step of implementing fitting methods. At the beginning, I was optimistic when I found out that Ciao 4.0 and up is python compatible. I thought I could do things more in statistically rigorous ways since I can fake spectra to validate my fitting methods. I was thinking about modifying the indispensable chisquare method that is used twice for point estimation and hypothesis testing that introduce bias (a link made to a posting). My goal was make it less biased and robust, less sensitive iid Gaussian residual assumptions. Against my high expectation, I became frustrated at the first step, reading and playing with data to get a better sense and to develop a quick intuition. I couldn’t even make a baby step to my goal. I’m not sure if it a good thing or not, but I haven’t been completely discouraged. Also, time helps gradually to overcome this culture difference, the lack of documentation.
What happens in general is that, if a predecessor says, use “set_arf(),” then the apprentice will use “set_arf()” without doubts. If you begin learning on your own purely relying on documentations, I guess at some point you have to make a choice. One can make a lucky guess and move forward quickly. Sometimes, one can land on miserable situation because one is not sure about his/her choice and one cannot trust the features appeared after these processing. I guess it is natural to feel curiosity about what each of these commands is doing to your data and what information is carried over to the next commands in analysis procedures. It seems righteous to know what command is best for the particular data processing and statistical inference given the data. What I found is that such comparison across commands is missing in documentations. This is why I thought astronomical data analysis systems are short of mercy for nonspecialists.
Another thing I observed is that there seems no documentation nor standard procedure to create the repeatable data analysis results. My observation of astronomers says that with the same raw data, the results by scientist A and B are different (even beyond statistical margins). There are experts and they have knowledge to explain why results are different on the same raw data. However, not every one can have luxury of consulting those few experts. I cannot understand such exclusiveness instead of standardizing the procedures through validations. I even saw that the data that A analyzed some years back can be different from this year’s when he/she writes a new proposal. I think that the time for recreating the data processing and inference procedure to explain/justify/validate the different results or to cover/narrow the gap could have not been wasted if there are standard procedures and its documentation. This is purely a statistician’s thought. As the comment in where is ciao X?^{[5]} not every data analysis system has to have similar design and goals.
Getting lost while figuring out basics (handling, arf, rmf, psf, and numerous case by case corrections) prior to applying any simple statistics has been my biggest obstacle in learning astronomy. The lack of documenting validation methods often brings me frustration. I wonder if there’s any astronomers who lost in learning statistics via R, minitab, SAS, MATLAB, python, etc. As discussed in where is ciao X? I wish there is a centralized tutorial that offers basics, like how to read in data, how to do manipulate datum vector and matrix, how to do arithmetics and error propagation adequately not violating assumptions in statistics (I don’t like the fact that the point estimate of background level is subtracted from observed counts, random variable when the distribution does not permit such scale parameter shifting), how to handle and manipulate fits format files from Chandra for various statistical analysis, how to do basic image analysis, how to do basic spectral analysis, and so on with references^{[6]}
Having 13 Xray line and/or continuum ratios, a typical data display would be the 13 choose 2 combination of scatter plots as follows. Note that the upper left panels with three colors are drawn for the classification purpose (red: AL Lac, blue: IM Peg, green:Capella) while lower right ones are discolored for the clustering analysis purpose. These scatter plots are essential to exploratory data analysis but they do not convey information efficiently with these many scatter plots. In astronomical journals, thanks to astronomers’ a priori knowledge, a fewer pairs of important variables are selected and displayed to reduce the visualization complexity dramatically. Unfortunately, I cannot select physically important variables only.

I am not a wellknowledged astronomer but believe in reducing dimensionality by the research objective. The goal is set from asking questions like “what do you want from this multivariate data set?” classification (classification rule/regression model that separates three stars, Capella, AL Lac, and IM Peg), clustering (are three stars naturally clustered into three groups? Or are there different number of clusters even if they are not well visible from above scatter plots?), hypothesis testing (are they same type of stars or different?), point estimation and its confidence interval (means and their error bars), and variable selection (or dimension reduction). So far no statistical question is well defined (it can be good thing for new discoveries). Prior to any confirmatory data analysis, we’d better find a way to display this multidimensional data efficiently. I thought parallel coordinates serve the purpose well but surprisingly, it was never discussed in astronomical literature, at least it didn’t appear in ADS.
Each 13 variable was either normalized (left) or standardized (right). The parallel coordinate plot looks both simpler and more informative. Capella observations occupy relatively separable space than the other stars. It is easy to distinguish that one Capella observation is an obvious outlier to the rest which is hardly seen from scatter plots. It is clear that discriminant analysis or classical support vector machine type classification methods cannot separate AL Lac and IM Pec. Clustering based on distance measures of dissimilarity also cannot be applied in order to see a natural grouping of these two stars whereas Capella observations form its own cluster. To my opinion, parallel coordinates provide more information about multidimensional data (dim>3) in a simpler way than scatter plots of multivariate data. It naturally shows highly correlated variables within the same star observations or across all target stars. This insight from visualization is a key to devising methods of variable selection or reducing dimensionality in the data set.
Personal opinion is that not having an efficient and informative visualization tool for visualizing complex high resolution spectra in many detailed metrics, smoothed bivariate (trivariate at most) information such as hardness ratios and quantiles are utilized in displaying Xray spectral data, instead. I’m not saying that the parallel coordinates are the ultimate answer to visualizing multivariate data but I’d like to emphasize that this method is more informative, intuitive and simple to understand the structure of relatively high dimensional data cloud.
Parallel coordinates has a long history. The earliest discussion I found was made in 1880ies. It became popular by Alfred Inselberg and gained recognition among statisticians by George Wegman (1990, Hyperdimensional Data Analysis Using Parallel Coordinates). Colorful images of the Sun, stars, galaxies, and their corona, interstellar gas, and jets are the eye catchers. I hope that data visualization tools gain equal spot lights since they summarize data and deliver lots of information. If images are well decorated cakes, then these tools from EDA are sophisticated and well baked cookies.
—————— [Added]
According to
[arxiv:0906.3979] The Golden Age of Statistical Graphics
Michael Friendly (2008)
Statistical Science, Vol. 23, No. 4, pp. 502535
it is 1885. Not knowing French – if I knew I’d like to read Gauss’ paper immediately prior to anything – I don’t know what the reference is about.
]]>After meeting Prof. Herman Chernoff unexpectedly – I didn’t know he is Professor Emeritus at Harvard – the urge revived but I didn’t have data, still then. Alas, another absent mindedness: I don’t understand why I didn’t realize that I already have the data, XAtlas for trying Chernoff faces until today. Data and its full description is found from the XAtlas website (click). For Chernoff face, references suggested in Wiki:Chernoff face are good. I believe some folks are already familiar with Chernoff faces from a New York Times article last year, listed in Wiki (or a subset characterized by baseball lovers?).
Capella is a Xray bright star observed multiple times for Chandra calibration. I listed 16 ObsIDs in the figures below at each face, among 18+ Capella observations (Last time when I checked Chandra Data Archive, 18 Capella observations were available). These 16 are high resolution observations from which various metrics like interesting line ratios and line to continuum ratios can be extracted. I was told that optically it’s hard to find any evidence that Capella experienced catastrophic changes during the Chandra mission (about 10 years in orbit) but the story in Xray can’t be very different. In a dismally short time period (10 years for a star is a flash or less), Capella could have revealed short time scale high energy activities via Chandra. I just wanted to illustrate that Chernoff faces could help visualizing such changes or any peculiarities through interpretation friendly facial expressions (Studies have confirmed babies’ ability in facial expression recognitions). So, what do you think? Do faces look similar/different to you? Can you offer me astronomical reasons for why a certain face (ObsID) is different from the rest?
p.s. In order to draw these Chernoff faces, check descriptions of these R functions, faces() (yields the left figure) or faces2() (yields the right figure) by clicking on the function of your interest. There are other variations and other data analysis systems offer different fashioned tools for drawing Chernoff faces to explore multivariate data. Welcome any requests for plots in pdf. These jpeg files look too coarse on my screen.
p.p.s. Variables used for these faces are line ratios and line to continuum ratios, and the order of these input variables could change countenance but impressions from faces will not change (a face with distinctive shapes will look different than other faces even after the order of metrics/variables is scrambled or using different Chernoff face illustration tools). Mapping, say from an astronomical metric to the length of lips was not studied in this post.
p.p.p.s. Some data points are statistical outliers, not sure about how to explain strange numbers (unrealistic values for line ratios). I hope astronomers can help me to interpret those peculiar numbers in line/continuum ratios. My role is to show that statistics can motivate astronomers for new discoveries and to offer different graphics tools for enhancing visualization. I hope these faces motivate some astronomers to look into Capella in XAtlas (and beyond) in details with different spectacles, and find out the reasons for different facial expressions in Capella Xray observations. Particularly, ObsID 1199 is most questionable to me.
]]>When I was learning statistics, I never confronted such huge degrees of freedom. Well, given the facts that only a small amount of time is used for learning the chisquare goodnessoffit test, that the chisquare distribution is a subset of gamma distribution, and that statisticians do not handle a hundred of thousands (there are more low count spectra but I’ll discuss why I chose this big number later) of photons from Xray telescopes, almost surely no statistician would confront such huge degrees of freedom.
Degrees of freedom in spectral fitting are combined results of binning (or grouping into n classes) and the number of free parameters (p), i.e. np1. Those parameters of interest, targets to be optimized or to be sought for solutions are from physical source models, which are determined by law of physics. Nothing to be discussed from the statistical point of view about these source models except the model selection and assessment side, which seems to be almost unexplored area. On the other hand, I’d like to know more about binning and subsequent degrees of freedom.
A few binning schemes in spectral analysis that I often see are each bin having more than 25 counts (the same notion of 30 in statistics for CLT or the last number in a ttable) or counts in each bin satisfying a certain signal to noise ratio S/N level. For the latter, it is equivalent that sqrt(expected counts) is larger than the given S/N level since photon counts are Poisson distributed. There are more sophisticated adaptive binning strategies but I haven’t found mathematical, statistical, nor computational algorithmic justifications for those. They look empirical procedures to me that are discovered after many trials and errors on particular types of spectra (I often become suspicious if I can reproduce the same goodness of fit results with the same ObsIDs as reported in those publications). The point is that either simple or complex, at the end, if someone has a data file with large number of photons, n is generally larger than observations with sparse photons. This is the reason I happen to see inconceivable d.f.s to a statistician from some papers, like 4754.
First, the chisquare goodness of fit test was designed for agricultural data (or biology considering Pearson’s eugenics) where the sample size is not a scale of scores of thousands. Please, note that bin in astronomy is called cell (class, interval, partition) in statistical papers and books showing applications of chisquare goodness fit tests.
I also like to point out that the chisquare goodness of fit test is different from the chisquare minimization even if they share the same equation. The former is for hypothesis testing and the latter is for optimization (best fit solution). Using the same data for optimization and testing introduces bias. That’s one of the reasons why with large number of data points, cross validation techniques are employed in statistics and machine learning^{[1]}. Since I consider binning as smoothing, the optimal number of bins and their size depends on data quality and source model property as is done in kernel density estimation or imminently various versions of chisquare tests or distance based nonparametric tests (KS test, for example).
Although published many decades ago, you might want to check this paper out to get a proper rule of thumb for the number of bins:
“On the choice of the number of class intervals in the application of the chi square test” (JSTOR link) by Mann and Wald in The Annals of Mathematical Statistics, Vol. 13, No. 3 (Sep., 1942), pp. 306317 where they showed that the number of classes is proportional to N^(2/5) (The underlying idea about the chisquare goodness of fit tests, detailed derivation, and exact equation about the number of classes is given in detail) and this is the reason why I chose a spectrum of 10^5 photons at the beginning. By ignoring other factors in the equation, 10^5 counts roughly yields 100 bins. About 4000 bins implies more than a billion photons, which seems a unthinkable number in Xray spectral analysis. Furthermore, many reports said Mann and Wald’s criterion results in too many bins and loss of powers. So, n is subject to be smaller than 100 for 10^5 photons.
The other issue with statistical analysis on Xray spectra is that although photons in each channel/bin can be treated as independent sample but the expected numbers of photons across bins are related via physical source model or so called link function borrowed from generalized linear model. However, well studied link functions in statistics do not match source models in high energy astrophysics. Typically, source models are not analytical. They are nonlinear, numerical, tabulated, or black box type that are incompatible with current link functions in generalized linear model that is a well developed, diverse, and robust subject in statistics for inference problems. Therefore, binning data and chisquare minimization seems to be an only strategy for statistical inference about parameters in source models so far (for some “specific” statistical or physical models, this is not true, which is not a topic of this discussion). Mann and Wald’s method for class size assumes equiprobable bins whereas channel or bin probabilities in astronomy would not satisfy the condition. The probability vector of multinomial distribution depends on binning, detector sensitivity, and source model instead of the equiprobable constraint from statistics. Well, it is hard to device an purely statistically optimal binning/grouping method for Xray spectral analysis.
Instead of individual group/bin dependent smoothing (S/N>3 grouping, for example), I, nevertheless, wish for developing binning/grouping schemes based on total sample size N particularly when N is large. I’m afraid that with the current chisquare test embedded in data analysis packages, the power of a chisquare statistic is so small and one will always have a good reduced chisquare value (astronomers’ simple model assessment tool: the measure of chisquare statistic divided by degrees of freedom and its expected value is one. If the reduced chisquare criterion is close to one, then the chosen source model and solution for parameters is considered to be best fit model and value). The fundamental idea of suitable number of bins is equivalent to optimal bandwidth problems in kernel density estimation, of which objective is accentuating the information via smoothing; therefore, methodology developed in the field of kernel density estimation may suggest how to bin/group the spectrum while preserving the most of information and increasing the efficiency. A modified strategy for binning and applying the chisquare test statistic for assessing model adequacy should be conceived instead of reporting thousands of degrees of freedom.
I think I must quit before getting too bored. Only I’d like to mention quite interesting papers that cited Mann and Wald (1942) and explored the chi square goodness of fit including Johnson’s A Bayesian chisquare test for GoodnessofFit (a link is made to the arxiv pdf file) which might provide more charm to astronomers who like to modify their chisquare methods in a Bayesian way. A chapter “On the Use and Misuse of ChiSquare” (link to google book excerpt) by KL Delucchi in A Handbook for Data Analysis in the Behavioral Sciences (1993) reads quite intriguing although the discussion is a reminder for behavior scientists.
Lastly, I’m very sure that astronomers explored properties of the chisquare statistic and chisquare type tests with their data sets. I admit that I didn’t make an expedition for such works since those are few needles in a mound of haystack. I’ll be very delighted to see an astronomers’ version of “use and misuse of chisquare,” a statistical account for whether the chisquare test with huge degrees of freedom is powerful enough, or any advice on that matter will be very much appreciated.
SUBJECT(bayes) CONTEXT(sherpa)
SYNOPSIS
A Bayesian maximum likelihood function.
Maximum likelihood function is common for both Bayesian and frequentist methods. I don’t know get the point why “Bayesian” is particularly added with “maximum likelihood function.”
DESCRIPTION
(snip)
We can relate this likelihood to the Bayesian posterior density for S(i) and B(i)
using Bayes’ Theorem:p[S(i),B(i)  N(i,S)] = p[S(i)B(i)] * p[B(i)] * p[N(i,S)  S(i),B(i)] / p[D] .
The factor p[S(i)B(i)] is the Bayesian prior probability for the source model
amplitude, which is assumed to be constant, and p[D] is an ignorable normalization
constant. The prior probability p[B(i)] is treated differently; we can specify it
using the posterior probability for B(i) offsource:p[B(i)] = [ A (A B(i))^N(i,B) / N(i,B)! ] * exp[A B(i)] ,
where A is an “area” factor that rescales the number of predicted background
counts B(i) to the offsource region.IMPORTANT: this formula is derived assuming that the background is constant as a
function of spatial area, time, etc. If the background is not constant, the Bayes
function should not be used.
Why not? If I rephrase it, what it said is that B(i) is a constant. Then why one bothers to write p[B(i)], a probability density of a constant? The statement sounds self contradictory to me. I guess B(i) is a constant parameter. It would be suitable to write that Background is homogeneous and the Background is describable with homogeneous Poisson process if the above pdf is a correct model for Background. Also, a slight notation change is required. Assuming the Poisson process, we could estimate the background rate (constant parameter) and its density p[B(i)], and this estimate is a constant as stated for p[S(i)B(i)], a prior probability for the constant source model amplitude.
I think the reason for “Bayes should not used” is that the current sherpa is not capable of executing hierarchical modeling. Nevertheless, I believe one can script the MCMC methodologies with SLang/Python to be aggregated with existing sherpa tools to incorporate a possible space dependent density, p[B(i,x,y)]. I was told that currently a constant background regardless of locations and background subtraction is commonly practiced.
To take into account all possible values of B(i), we integrate, or marginalize,
the posterior density p[S(i),B(i)  N(i,S)] over all allowed values of B(i):p[S(i)  N(i,S)] = (integral)_0^(infinity) p[S(i),B(i)  N(i,S)] dB(i) .
For the constant background case, this integral may be done analytically. We do
not show the final result here; see Loredo. The function log p[S(i)N(i,S)] is
minimized to find the bestfit value of S(i). The magnitude of this function
depends upon the number of bins included in the fit and the values of the data
themselves. Hence one cannot analytically assign a `goodnessoffit’ measure to a
given value of this function. Such a measure can, in principle, be computed by
performing Monte Carlo simulations. One would repeatedly sample new datasets from
the bestfit model, and fit them, and note where the observed function minimum
lies within the derived distribution of minima. (The ability to perform Monte
Carlo simulations is a feature that will be included in a future version of
Sherpa.)Note on Background Subtraction
Bayesian computation means one way or the other that one is able to get posterior distributions in the presence of various parameters regardless of their kinds: source or background. I wonder why there’s a discrimination such that source parameter has uncertainty whereas the background is constant and is subtracted (yet marginalization is emulated by subtracting different background counts with corresponding weights). It fell awkward to me. Background counts as well as source counts are Poisson random. I would like to know what justifies constant background while one uses probabilistic approaches via Bayesian methods. I would like to know why the mixture model approach – a mixture of source model and background model with marginalization over background by treating B(i) as a nuisance parameter – has not been tried. By casting eye sights broadly on Bayesian modeling methods and basics of probability, more robustly estimating the source model and their parameters is tractable without subtracting background prior to fitting a source model.
The background should not be subtracted from the data when this function is used
The background only needs to be specified, as in this example:
(snip)EXAMPLES
EXAMPLE 1
Specify the fitting statistic and then confirm it has been set. The method is then
changed from “LevenbergMarquardt” (the default), since this statistic does not
work with that algorithm.sherpa> STATISTIC BAYES
sherpa> SHOW STATISTIC
Statistic: Bayes
sherpa> METHOD POWELL
(snip)
I would like to know why it’s not working with LevenbergMarquardt (LM) but working with Powell. Any references that explain why LM does not work with Bayes?
I do look forward your comments and references, particularly reasons for Bayesian maximum likelihood function and Bugs with LM. Also, I look forward to see off the norm approaches such as modeling fully in Bayesian ways (like van Dyk et al. 2001, yet I see its application rarely) or marginalizing Background without subtraction but simultaneously fitting the source model. There are plenty of rooms to be improved in source model fitting under contamination and distortion of xray photon incidents through space, telescope, and signal transmission.
]]>XMMNewton talks again loud and clear
23 October 2008
XMMNewton, ESA’s Xray observatory, has reestablished communication contact with Earth, showing that the spacecraft is safe and fully under control. The news was confirmed this morning by the mission control team at ESA’s European Space Operations Centre (ESOC) in Darmstadt, Germany.Radio contact was reestablished on Wednesday 22 October around 18:00 Central European Summer Time (CEST). This followed an unexpected radio silence from the spacecraft which started on Saturday 18 October 2008 when XMMNewton, coming out of a nominal period of nonradio visibility along its orbit around Earth, did not succeed in sending the expected signal to Earth.
First there was Edwin Hubble (S.B. 1910, Ph.D. 1917).
Then came Arthur Compton (the “MetLab”).
Followed by Subramanya Chandrasekhar (Morton D. Hull Distinguished Service Professor of Theoretical Astrophysics).
And now, Enrico Fermi.
]]>The flux at Earth due to an atomic transition u –> l from a volume element δV at a location ɼ,
I_{ul}(ɼ) = (1/4 π) (1/d(ɼ)^{2}) A(Z,ɼ) G_{ul}(n_{e}(ɼ),T_{e}(ɼ)) n_{e}(ɼ)^{2} δV(ɼ) ,
where n_{e} is the electron density and T_{e} is the temperature of the plasma, A(Z,ɼ) are the abundance of element Z, G_{ul}(n_{e},T_{e}) is the atomic emissivity for the transition, and d is the distance to the source.
We can combine the flux from all the points in the field of view that arise from plasma at the same temperature,
I_{ul}(T_{e}) = (1/4 π) ∑_{ɼT}_{e} (1/d(ɼ)^{2}) A(Z,ɼ) G_{ul}(n_{e}(ɼ),T_{e}) n_{e}^{2}δV(ɼ) .
Assuming that A(Z,ɼ), n_{e}(ɼ) do not vary over the points in the summation,
I_{ul}(T_{e}) ≈ (1 / 4 π d^{2}) G_{ul}(n_{e},T_{e}) A(Z) n_{e}^{2} (ΔV / Δlog T_{e}) Δlog T_{e} ,
and hence the total line flux due to emission at all temperatures,
I_{ul} = ∑_{T}_{e} (1 / 4 π d^{2}) A(Z) G_{ul}(n_{e},T_{e}) DEM(T_{e}) ΔlogT_{e}
The quantity
DEM(T_{e}) = n_{e}^{2} (ΔV / Δlog T_{e})
is called the Differential Emission Measure and is a very useful summary of the temperature structure of stellar coronae. It is typically reported in units of [cm^{3}] (or [cm^{5}] if ΔV is written out as area*Δh). Sometimes it is defined as n_{e}^{2}(ΔV/ΔT) and has units [cm^{3}K^{1}].
The expression for the line flux is an instance of Fredholm’s Equation of the First Kind and the DEM(T_{e}) solution is thus unstable and subject to highfrequency oscillations. There is a whole industry that has grown up trying to derive DEMs from often highly unreliable datasets.
]]>