A book by David Freedman

A continuation of my post titled Circumspect frequentist.

Title: Statistical Models: Theory and Practice (click for the publisher’s website)
My one-line review, or rather a comment, from several months ago was

Bias in asymptotic standard errors is not a familiar topic for astronomers

and I don’t understand why I wrote it, but I think I came up with this comment owing to my pursuit of modeling the measurement errors that occur in astronomical research. …Continue reading»

[MADS] Semiparametric

There were (only) four articles from ADS whose abstracts contain the word semiparametric (none in titles). Therefore, semiparametric is not exactly [MADS] but almost [MADS]. One would like to say it is virtually [MADS] or quasi [MADS]. By introducing the term and providing its rare examples in astronomy, I hope that this scarce term, semiparametric, will be used properly rather than misleading astronomers into inappropriate statistical inference with their data. …Continue reading»

[ArXiv] Special Issue from Annals of Applied Statistics

When I was studying astronomy, a time during which I once became a subject of a social science survey about life in a department where gender bias is extreme (I was the only female), people often asked me how to forecast the weather or how to predict the future (boys often get questions about becoming astronauts, in addition to weathermen and astrologers). Relating astronomy to earth science still happens. Statisticians whom I met at conferences often tried to associate my efforts on astronomical data with those of geologists and meteorologists, who often use stochastic models and spatio-temporal models, the dimensional extensions of time series models. Because of this confusion between astronomy and meteorology/geology/oceanology, and the longer history of wide statistical applications in the latter subjects (a good counterexample is the least squares method by Gauss, but I cannot think of more examples to contradict my statement that statistics has a rich history of wide use among earth scientists), from time to time I have paid attention to various applications and models in those subjects so as to find a thread leading to similar applications in astronomy. Although I don’t like the misconception that astronomy equals meteorology or geoscience, those scientific fields do share at least one thing: statistical methods are applied to analyzing satellite data. …Continue reading»

Circumspect frequentist

The first issue of this year’s IMS bulletin has an obituary, from which the following is quoted. …Continue reading»

accessing data, easier than before but…

Someone emailed me asking for the globular cluster data sets I used in a proceedings paper, which was about how to determine multi-modality (multiple populations) based on well-known and new information criteria without binning the luminosity functions. I spent quite some time trying to understand the data sets with suspicious numbers of globular cluster populations. On the other hand, obtaining globular cluster data sets was easy because of available data archives such as VizieR. I acquire most data sets that appear in charts and tables from VizieR. In order to understand the science behind those data sets, I check ADS. Well, actually it happens the other way around: check the scientific background first to assess whether there is room for statistics, then search for available data sets. …Continue reading»

Likelihood Ratio Technique

I wonder what Fisher, Neyman, and Pearson would say if they saw “Technique” after “Likelihood Ratio” instead of “Test.” When a presenter said “Likelihood Ratio Technique” for source identification, I couldn’t resist checking it out, so as not to offend the founding fathers of the likelihood principle in statistics, since “Technique” sounded derogatory to my ears when attached to “Likelihood.” I thank, above all, the speaker who kindly gave me the reference about this likelihood ratio technique. …Continue reading»

Lost in Translation: Measurement Error

You would think that something like “measurement error” is a well-defined concept, and everyone knows what it means. Not so. I have so far counted at least 3 different interpretations of what it means.

Suppose you have measurements X={Xi, i=1..N} of a quantity whose true value is, say, X0. One can then compute the mean and standard deviation of the measurements, E(X) and σX. One can also infer the value of a parameter θ(X), derive the posterior probability density p(θ|X), and obtain confidence intervals on it.

So here are the different interpretations:

  1. Measurement error is σX, or the spread in the measurements. Astronomers tend to use the term in this manner.
  2. Measurement error is X0-E(X), or the “error made when you make the measurement”, essentially what is left over beyond mere statistical variations. This is how statisticians seem to use it: it is essentially the bias term. To quote David van Dyk:

    For us it is just English. If your measurement is different from the real value. So this is not the Poisson variability of the source for effects or ARF, RMF, etc. It would disappear if you had a perfect measuring device (e.g., telescope).

  3. Measurement error is the width of p(θ|X), i.e., the measurement error of the first type propagated through the analysis. Astronomers use this too to refer to measurement error.
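The three readings above can be made concrete with a small simulation. This is only a sketch under stated assumptions: the function name, the Gaussian noise, the systematic offset, and the choice of θ as the mean (whose posterior under a flat prior has a width of roughly σX/√N) are all illustrative and not taken from any particular analysis.

```python
import math
import random
import statistics


def three_measurement_errors(x, x0):
    """Return the three quantities that get called 'measurement error'.

    x  : list of measurements X_i
    x0 : the true value X0 (known here only because we simulate)
    """
    n = len(x)
    mean_x = statistics.fmean(x)
    # (1) spread of the measurements, sigma_X
    sigma_x = statistics.stdev(x)
    # (2) X0 - E(X), the bias of the measuring process
    bias = x0 - mean_x
    # (3) width of p(theta|X); ASSUMPTION: theta is the mean, the prior is
    # flat, and the posterior is approximately Gaussian
    posterior_width = sigma_x / math.sqrt(n)
    return sigma_x, bias, posterior_width


# Hypothetical setup: a device that reads 0.5 too high with scatter 2.0
rng = random.Random(0)
x0, offset, scatter, n_obs = 10.0, 0.5, 2.0, 400
x = [rng.gauss(x0 + offset, scatter) for _ in range(n_obs)]

sigma_x, bias, width = three_measurement_errors(x, x0)
print(f"(1) spread sigma_X        = {sigma_x:.3f}")
print(f"(2) bias X0 - E(X)        = {bias:.3f}")
print(f"(3) width of p(theta|X)   = {width:.3f}")
```

Note that collecting more data shrinks (3) but leaves (1) and (2) essentially unchanged, which is one reason the three usages should not be mixed up.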

Who am I to say which is right? But be aware of who you may be speaking with and be sure to clarify what you mean when you use the term!

MMIX

The year 2009 is the Darwin bicentennial and the sesquicentennial of the publication of the Origin of Species, but, um, even more importantly, it is the International Year of Astronomy, celebrating 400 orbits since Galileo started to look through a telescope.

[MADS] multiscale modeling

A few scientists in our group work on estimating the intensities of gamma ray observations from sky surveys. This work differs from typical image processing, which mostly concerns the point estimation of intensity at each pixel location and the size of the overall white-noise-type error. In image processing you will often notice an assumed orthogonality between errors and sources, along with white noise assumptions. These assumptions are typical features of image processing utilities and modules. On the other hand, CHASC scientists address more general and broader statistical inference problems in estimating the intensity map, such as the intensity uncertainty at each point and the scientifically informative display of the intensity map with its uncertainty, according to the Poisson count model and constraints from physics and the instrument; this is where the field of multiscale modeling comes in. …Continue reading»

Bipartisanship

We have seen the word “bipartisan” often during the election and during the ongoing recession. Sometimes, I think that bipartisanship is driven not by politicians but by the media, commentators, and interpreters. …Continue reading»

Wapedia

I do not rely much on my cell phone. It functions as a tool for handling emergencies. On the other hand, it seems that people do lots of things with their smartphones, and I would like to add one thing to your “what I do with my phone” list. …Continue reading»

[MADS] HMM

MADS stands for “Missing in ADS.” Every astronomer, I believe, knows what ADS is. As we have [EotW] series and used to have [ArXiv] series, creating a new series for semi-periodic postings under the well known name ADS seems interesting. …Continue reading»

Meet at January AAS meeting to organize a white paper for Astro2010

Hello Sloggers,

Every decade, the National Research Council (under the auspices of the National Academies) convenes a panel to survey the state of astronomy and astrophysics, and to recommend plans and funding priorities for the subsequent decade. The resulting Decadal Survey document has a profound influence on funding of astronomy research at every level. The process for the 2010 decadal survey has begun; Roger Blandford will discuss it at the January 2009 AAS meeting (AAS decadal survey session, Tues, 6 Jan, 8:30am). The National Academies web site hosts a page for the Astro2010 Decadal Survey with more information.

White papers authored by individuals and groups in the astronomical community are a major source of input for the review panel. I would like to lead the effort on a collaborative white paper urging explicit, targeted support for (interdisciplinary) astrostatistics research (perhaps broadened to “astroinformatics” or “astronomical data analysis”). I would like to meet with any of you who would like to co-sign such a white paper, and help author it (as your resources allow). I think the AAS meeting offers a great opportunity for us to meet in person to start fleshing out ideas for the white paper, to be developed further via online interaction.

Here I’d like to discuss when to meet at AAS. Note that some Sloggers are participating in an astrostatistics special session, “Meaning from Surveys and Population Studies”, Monday, 2-3:30pm. In principle, since some of us will already be gathered there, it could make sense to meet afterward somewhere; but there are important prize lectures right afterward that I, for one, would like to hear. Other possibilities include lunch or dinner that day (Monday), or perhaps lunch or dinner the next day, after we’ve all heard Roger Blandford’s presentation on how the survey will work this year.

I have some concrete ideas for the white paper, and I’m sure some of you do, too. But here and now, let’s not get into content; let’s just organize a meeting at AAS.

With that, the floor is open for suggestions on a good meeting time/venue.

Borel Cantelli Lemma for the Gaussian World

Almost two years of scrutinizing publications by astronomers has given me a strong impression that astronomers live in the Gaussian world. You are likely to object to this statement by saying that astronomers know and use Poisson, binomial, Pareto (power laws), Weibull, exponential, Laplace, Cauchy, Gamma, and some other distributions.[1] This is true. I have seen these distributions referred to in many publications; however, when it comes to obtaining “BEST FIT estimates for the parameters of interest” and “their ERROR (BARS)”, suddenly everything goes back to the Gaussian world.[2]

Borel Cantelli Lemma (from Planet Math): because of the mathematical symbols, I link to the lemma rather than reproduce it here, but any probability textbook has it with proofs and descriptions.

…Continue reading»

  1. It is a bit disappointing that not many mention the t distribution, even when fewer than 30 observations are available.
  2. To stay out of this Gaussian world, some astronomers rely on Bayesian statistics and explicitly say that it is the only escape, which is sometimes true and sometimes not; personally, I lean toward the view that Bayesian methods are not always more robust than frequentist methods, contrary to astronomers’ discussions of robust methods.

[SPS] Testing Completeness

There will be a special session at the 213th AAS meeting on meaning from surveys and population studies (SPS). Until then, it might be useful to pull out some interesting and relevant papers and questions/challenges as a preliminary to the meeting. I will not merely list astronomical catalogs and surveys, which are literally countless these days, but will bring out some, with a description of the catalog, if they have changed the way science is performed (the best example, to my knowledge, would be SDSS, the Sloan Digital Sky Survey). …Continue reading»