Is 3sigma the same as 3*1sigma?
Sometime early this year, Jeremy Drake asked this innocuous sounding question in the context of determining the bounds on the amplitude of an absorption line: Is the 3sigma error bar the same length as 3 times the 1sigma error bar?
In other words, if he measured the 99.7% confidence range for his model parameter, would it always be thrice as large as the nominal 1sigma confidence range? The answer is complicated, and depends on who you ask: Frequentists will say “yes, of course!”, Likelihoodists will say “Maybe, yeah, er, depends”, and Bayesians will say “sigma? what’s that?” So I think it would be useful to expound a bit more on this to astronomers, whose mental processes are generally Bayesian but whose computational tools are mostly Frequentist.
In the Frequentist paradigm, one fits a parameterized model to numerous datasets, and computes the errors on the fitted parameters by looking at the ensemble of the parameter values. That is, if you have one dataset Yi(Xi) {i=1..Nbin} and fit a model f(A) with parameters A (usually by minimizing the chisquare (χ2) deviations or something of that sort) you will get one set of best-fit parameters, Â. It is important to note that in this paradigm, the concept of determining an error on  does not exist!. Errors are only determined by fitting f(A) to a series of datasets Yij(Xi), {i=1..Nbin, j=1..Nobs}, and deriving Âj for each of them, and computing sigmaA2=variance( {Âj} ). By the central limit theorem, as Nobs increases, the distribution of { Âj } will tend closer to a Gaussian. So, in this case, 3*sigmaA will asymptotically become equal to 3sigmaA.
However, Astronomers are not afforded this luxury of having multiple observations. With dreary regularity, we have to make do with Nobs=1. In that case, one takes recourse in the behavior of the χ2, and in particular, the likelihood of obtaining a certain χ2 for the observed data given the model. If we assume that the measurement errors are Normally distributed, and that we are looking in that part of the parameter space close to the best-fit solution, then it can be shown that the χ2 varies as a parabolic function of the parameter values. The 1sigma error on the parameter (not necessarily from a linear model; see Cash 1979, for example) is computed by determining the parameter values at which the χ2 function (computed by stepping through the values of one parameter and varying all the other parameters to minimize the χ2 at that point) increases by 1. Similarly, the 3sigma error is computed by checking at what values the χ2 increases by 9. As long as the assumptions of local linearity and the Gaussian measurement error hold (i.e., the χ2 function is parabolic), then we will have 3*1sigma=3sigma, otherwise not.
Bayesians on the other hand compute the marginalized posterior probability density for each parameter and calculate the bounds on the parameter via credible regions that enclose a given percentage of the area under the curve. There is no set recipe for specifying how the bounds are calculated: it could be one-sided, centered on the median, or include the mode and the highest posterior densities. (The last leads to the smallest possible interval.) The percentage levels for which the intervals are calculated are usually matched to the percentages that Gaussian standard deviations provide, e.g., 68% == 1sigma, 99.7% == 3sigma, etc. Thus, when the 68% credible range is mentioned, it could also be referred to as the “Gaussian-equivalent 1sigma region”. However, because the intervals can be defined in many different ways, they do not necessarily match the 1sigma error bar, and for that matter are often asymmetrical. Given that, it is quite unreasonable to expect that 3*1sigma, even on one side, would match the Gaussian-equivalent 3sigma. IF the pdf is Gaussian, AND we choose to use HPD (highest posterior density) intervals, THEN the 99.7% interval would be thrice as long as the 68% interval.
hlee:
There were different viewpoints:
Instead of maximum likelihood estimators, they described least square estimators via linear approximation (or linearization, where Hessian is mentioned). Most optimization (fitting) problems adopt some level of linearlization. Here, 1sigma error is the error of the parameter estimator, not the model (i.e. error quantities related to measurement errors). That χ^2 function does not explain the error of the parameter estimator unless the model is linear. It does explain residuals by the fitted model.
Some simulation studies to compare bootstrapping and using their χ^2 function for spectrum line fitting can be done. Maybe, some non linear model fitting and its analytic/numerical solution to estimates of variance might be inserted. vlk’s quote from NumRec confirms it’s worthwhile to try.
02-12-2007, 1:18 amhlee:
When the model parameters are estimated via linear approximation, the χ^2 function defined in the Numerical Recipes may not be χ^2 distribution. However, the name imposes an idea that the sum of errors after fitting always follows χ^2 distribution upon which statistical inference, such as getting confidence intervals, is performed. Unless the fitted model is linear in parameters, the name, χ^2 function needs to be changed to prevent any misleading. A new name will help to find breakdowns of the rule, mσ ~ m*1σ, where m is a positive number, or degrees of the approximation of the rule.
02-12-2007, 11:16 amvlk:
Couple of points worth clarifying here:
1, it is not that the parameters be estimated via linear approximation, but rather that the χ^2 be estimated via linear deviations of the function around the best-fit, and
2, the issue is how far away from the best-fit is the linear approximation valid, whether the 1sigma error bar that comes out of this process is reliable, overestimated, underestimated, what?
btw, while the Numerical Recipes gives a handy summary of the process, it wasn’t invented by them. And they do say (NumRec in C, p695, 2nd Ed) that the one must use “Monte Carlo simulations or detailed analytic calculation in determining which contour Delta χ^2 is the correct one for [the] desired confidence level.”
02-12-2007, 12:05 pm