When you observed zero counts, you didn’t not observe any counts
Dong-Woo, who has been playing with BEHR, noticed that the confidence bounds quoted on the source intensities seem to be unchanged when the source counts are zero, regardless of what the background counts are set to. That is, p(s|NS,NB) is invariant when NS=0, for any value of NB. This seems a bit odd, because [naively] one expects that as NB increases, it should become more and more likely that s is close to 0.
Suppose you compute the posterior probability distribution of the intensity of a source, s, when the data include counts in a source region (NS) and counts in a background region (NB). When NS=0, i.e., no counts are observed in the source region,
p(s|NS=0, NB) = (1+b)^a / Γ(a) · s^(a-1) · e^(-(1+b)s),
where a and b are the shape and rate parameters of the gamma prior on s.
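For concreteness, here is a minimal check in Python/scipy (the prior values and the 95% level are placeholders, not anything specific to BEHR): the posterior above is just a gamma distribution with shape a and rate 1+b, so the quoted upper bound is simply its quantile, and NB appears nowhere.

    from scipy.stats import gamma

    a, b = 1.0, 0.5                                # placeholder prior parameters
    # posterior for NS = 0 is Gamma(shape=a, rate=1+b), with no dependence on NB
    posterior = gamma(a, scale=1.0 / (1.0 + b))
    print(posterior.ppf(0.95))                     # 95% upper bound on s, same for every NB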
Why does NB have no effect? Because when you have zero counts in the source region, the entire effect of the background goes toward evaluating how good the chosen model is (so it becomes a model comparison problem, not a parameter estimation one), rather than toward estimating the parameter of interest, the source intensity. That is, the background only enters the normalization factor of the probability distribution, p(NS,NB). The parts that depend on NB cancel out when the expression for p(s|NS,NB) is written out, because the shape of the posterior is independent of NB and the pdf must integrate to 1.
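To see the cancellation numerically, here is a minimal sketch assuming a simple two-Poisson stand-in for the model: NS ~ Poisson(s+β), NB ~ Poisson(β), a gamma prior on s, and (purely for simplicity) a flat prior on the background intensity β. This is my stand-in, not BEHR's actual implementation. With NS=0, the marginal posterior of s comes out identical whether NB is 0 or 50.

    import numpy as np
    from scipy.stats import poisson

    # grids over source intensity s and background intensity beta
    s = np.linspace(1e-6, 15, 1500)
    beta = np.linspace(1e-6, 100, 1500)
    S, BETA = np.meshgrid(s, beta, indexing="ij")

    def marginal_posterior(NS, NB, a=1.0, b=0.5):
        # joint log-posterior: Poisson likelihoods plus gamma(a, b) prior on s
        # (flat prior on beta, for simplicity only)
        logp = (poisson.logpmf(NS, S + BETA)
                + poisson.logpmf(NB, BETA)
                + (a - 1.0) * np.log(S) - b * S)
        p = np.exp(logp - logp.max())
        ps = p.sum(axis=1)                        # marginalize over beta
        return ps / (ps.sum() * (s[1] - s[0]))    # normalize over s

    p_nb0  = marginal_posterior(NS=0, NB=0)
    p_nb50 = marginal_posterior(NS=0, NB=50)
    print(np.allclose(p_nb0, p_nb50))             # True: the same curve for any NB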
No doubt this is obvious, but I hadn’t noticed it before.
PS: This also shows why upper limits should not be identified with upper confidence bounds.
hlee:
Without knowing the physics behind choosing the gamma distribution, my question may look nonsensical, but I wonder if it's possible to build a mixture model, r*p(s|N_s=0,N_b) + (1-r)*p(s|N_s>0,N_b), where the first component comes from another type of distribution, not a gamma. I believe astronomers know how to infer r. If it's a single case and you are not sure whether N_s=0 or N_s>0, you can select between the two (r=1 or 0) based on a model selection criterion or a Bayes factor. Then you could find proper upper limits based on post-model-selection inference.
09-26-2007, 11:01 am
hlee:
Does the response imply the posterior distribution is improper? Does it mean the posterior distribution is misspecified? If so, instead of a mixture, one could look for other distributions to get a confidence bound. The point of the mixture is to keep the current posterior when N_s>0 but to add an additional component for N_s=0 to avoid the behavior you've described; r could be an indicator, I(N_s=0). I admit that I didn't know N_s is deterministic. I thought you only observe N, which is hardly separable into N_s and N_b in a deterministic way.
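To be concrete, a minimal sketch of this indicator-mixture idea; the alternative component alt_pdf and the simple Gamma(a+N_s, 1+b) form used for N_s>0 are placeholders only (the real N_s>0 posterior would also marginalize over the background):

    from scipy.stats import expon, gamma

    def mixture_posterior_pdf(s, N_s, N_b, alt_pdf, a=1.0, b=0.5):
        # r = I(N_s == 0): switch to the alternative component at the boundary
        r = 1.0 if N_s == 0 else 0.0
        # placeholder for the N_s > 0 branch: a plain Gamma(a + N_s, rate = 1+b),
        # ignoring the background marginalization a full treatment would need
        gamma_part = gamma.pdf(s, a + N_s, scale=1.0 / (1.0 + b))
        return r * alt_pdf(s, N_b) + (1.0 - r) * gamma_part

    # example with an arbitrary exponential alternative whose rate grows with N_b
    alt = lambda s, nb: expon.pdf(s, scale=1.0 / (1.0 + nb))
    print(mixture_posterior_pdf(0.5, N_s=0, N_b=10, alt_pdf=alt))
    print(mixture_posterior_pdf(0.5, N_s=3, N_b=10, alt_pdf=alt))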
09-26-2007, 6:12 pm
hlee:
Well, then your posterior is built on an open set, and there is no reason to consider the boundary.
09-27-2007, 11:54 am
hlee:
If θ is the parameter of interest, {θ: θ>0} is an open set, which specifies the parameter space. Openness and closedness also specify the data space. When I read your posting, my impression was that at N_s=0 (the boundary), the posterior behaves in an unexpected way, which led me to suggest a mixture model to resolve the strange behavior. By specifying the distribution properly, I mean that the distribution behaves properly over the whole data/parameter space, including N_s=0. If the boundary N_s=0 is causing trouble, then keep the current posterior for N_s>0 and add another component from a different family for N_s=0 (the boundary). [Checking the validity of this mixture model is another topic.]
09-27-2007, 7:05 pm
hlee:
I'd rather point to a well-cited paper to indicate what I meant by misspecified:
"Maximum Likelihood Estimation of Misspecified Models" by H. White (1982), Econometrica. The references therein are quite classical.
Mixture models and mixing are different topics, I guess. I only know a bit about mixture models; I hope mixing is not what occurred to you.
Paul B:
What is the actual model you are both talking about? Without knowing the details, it sounds like a similar situation to the Banff model, discussed on pg15 of:
http://newton.hep.upenn.edu/~heinrich/birs/challenge.pdf
Is that of interest?
09-29-2007, 8:29 pm
vlk:
Paul, yes, the behavior described in section 8.3 of that BIRS writeup is exactly what I wrote about above. There are two items in the Heinrich draft that I do not quite understand, though: first, why the parenthetical insistence on 0 background events? And second, what does he mean by "absolute separation" between source and background?
09-29-2007, 10:30 pm
hlee:
I'm ready to take any blame and admit my ignorance (I can only propose approaches limited to my knowledge and to my readings from the slog). I wonder if there is a Bayesian counterpart of Quantile Regression. Such a model could give the upper limit at N_s=0 from the corresponding quantiles.
[After talking to Vinay] I misunderstood the objective of Vinay's question. But I hope Poisson quantiles may assist low-count data analysis.
10-03-2007, 3:37 pm