Oct 27th, 2008| 09:24 am | Posted by hlee

The notion of **missing data** differs between the two communities. I tend to think that missing data carry as much information as observed data. Astronomers… I’m not sure how they think, but my impression so far is that when a value is missing in one attribute/variable of an object/observation/informant, all the other attributes of that object become useless, because the object is left out of the scientific data analysis or model evaluation process. For example, it is hard to find any discussion of **imputation** in astronomical publications, or any statistical justification of how missing data are treated with respect to inference strategies. Instead, they talk about **incompleteness** within different variables. To make this vague argument concrete, consider a catalog of multiple magnitudes. To draw a color-magnitude diagram, one needs both color and magnitude. If one attribute is missing, that star will not appear in the color-magnitude diagram, and any inference based on that diagram will not include that star. Nonetheless, one will still try to understand how the proportion of observed stars varies with color and magnitude. Continue reading ‘missing data’ »
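
The catalog example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy catalog, the band names `B` and `V`, and the helper names `cmd_points` and `completeness` are all hypothetical and do not come from any real survey or library. It shows complete-case deletion (a star with a missing band drops out of the diagram) next to the per-band completeness fraction.

```python
# Hypothetical toy catalog: B and V magnitudes; None marks a missing measurement.
catalog = [
    {"id": "s1", "B": 12.3, "V": 11.8},
    {"id": "s2", "B": None, "V": 13.1},
    {"id": "s3", "B": 15.0, "V": 14.2},
    {"id": "s4", "B": 16.4, "V": None},
    {"id": "s5", "B": 14.1, "V": 13.9},
]


def cmd_points(stars):
    """Return (color, magnitude) pairs for complete cases only.

    A star missing either band cannot be placed on the diagram,
    so it is silently dropped (complete-case deletion).
    """
    return [
        (round(s["B"] - s["V"], 2), s["V"])
        for s in stars
        if s["B"] is not None and s["V"] is not None
    ]


def completeness(stars, band):
    """Fraction of stars with an observed value in the given band."""
    return sum(s[band] is not None for s in stars) / len(stars)


points = cmd_points(catalog)
dropped = len(catalog) - len(points)

print(f"{len(points)} stars plotted, {dropped} dropped")
for band in ("B", "V"):
    print(f"completeness in {band}: {completeness(catalog, band):.2f}")
```

Note that the two dropped stars each still carry one observed magnitude; complete-case deletion discards that information, which is the point of contrast with imputation.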

Tags: bootstrap, catalog, Efron, estimator, ignorable, imputation, incompleteness, Little, MAR, MCAR, missing data, nonparametric, Rubin, Schafer, survey
Category: Astro, Cross-Cultural, Data Processing, Stat | 2 Comments
Jul 8th, 2008| 07:27 pm | Posted by hlee

Astronomers confront various kinds of censored and truncated data. Often these types of data are named after the famous scientists who generalized them, like Eddington bias. When censored or truncated data become the subject of study in statistics, statisticians, instead of naming them, try to model them so that the uncertainty can be quantified. This area is called **survival analysis**. If your library subscribes to *The American Statistician* and you are an astronomer who handles censored or truncated data sets, this primer would be useful for getting a quick grasp of the statistical jargon of survival analysis and for characterizing the uncertainties residing in your data. Continue reading ‘Survival Analysis: A Primer’ »
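
As a rough illustration of what "modeling" censored data means, here is a minimal sketch of the Kaplan-Meier product-limit estimator, a standard tool in survival analysis. It is not taken from the primer; the function name `kaplan_meier` and the toy data are hypothetical, and the sketch assumes right-censored data with independent censoring.

```python
def kaplan_meier(times, observed):
    """Kaplan-Meier estimate of the survival function S(t).

    times    -- event or censoring time for each object
    observed -- True if the event was seen, False if the time is censored
    Returns a list of (t, S(t)) at each distinct time with an observed event.
    """
    data = sorted(zip(times, observed))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Group all objects sharing the same time t.
        while i < len(data) and data[i][0] == t:
            events += int(data[i][1])
            removed += 1
            i += 1
        if events > 0:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1 - events / n_at_risk
            curve.append((t, survival))
        # Censored objects leave the risk set without lowering S(t).
        n_at_risk -= removed
    return curve


# Hypothetical toy data: five objects, two of them censored.
times = [1, 2, 2, 3, 4]
observed = [True, True, False, True, False]

for t, s in kaplan_meier(times, observed):
    print(f"t = {t}: S(t) ~ {s:.2f}")
```

The censored objects still contribute: they stay in the risk set until their censoring time, which is how the estimator uses partial information rather than discarding it.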

Tags: censored, Efron, Feigelson, Freedman, massive data, Nelson, Petrosian, survival analysis, truncated
Category: arXiv, Fitting, Stat | 4 Comments
Dec 31st, 2007| 08:48 pm | Posted by hlee

**The Bootstrap and Modern Statistics**, Brad Efron (2000), JASA Vol. 95 (452), p. 1293-1296.

If the bootstrap is an automatic processor for frequentist inference, then MCMC is its Bayesian counterpart.

Continue reading ‘[Quote] Bootstrap and MCMC’ »