The AstroStat Slog

Posts tagged ‘nonparametric’

[ArXiv] Voronoi Tessellations

Oct 28th, 2009| 09:29 am | Posted by hlee

While exploring the spatial distribution of particles/objects without approximating it by a Poisson or Gaussian process (parametric), and without imposing hypotheses such as homogeneity, isotropy, or uniformity, various nonparametric methods drew my attention for data exploration and preliminary analysis. Among them, the one I fell in love with is tessellation (state space approaches are excluded here). In terms of computational speed, I believe tessellation is faster than kernel density estimation for estimating level sets of multivariate data. Furthermore, constructing polygons from a tessellation is conceptually simple. However, coding and improving the algorithms is beyond statistical research (check books with computational geometry in their titles or keywords). The good news is that, for computing and getting results, freely available software exists in various forms, such as packages and modules. Continue reading ‘[ArXiv] Voronoi Tessellations’ »
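As a rough illustration of how such freely available software can be used, here is a minimal sketch in Python, assuming NumPy and SciPy are installed. It uses `scipy.spatial.Voronoi` to tessellate a synthetic 2-D point set and takes 1 / (n × cell area) as a crude density estimate at each point. The function name `voronoi_density` and the random test data are my own assumptions for illustration, not something taken from the papers discussed in the post.

```python
import numpy as np
from scipy.spatial import ConvexHull, Voronoi


def voronoi_density(points):
    """Crude density estimate: 1 / (n * area of each point's Voronoi cell).

    Hypothetical helper for illustration. Points whose cells are unbounded
    (those on the outer edge of the point set) get NaN.
    """
    points = np.asarray(points, dtype=float)
    vor = Voronoi(points)
    n = len(points)
    density = np.full(n, np.nan)
    for i, region_index in enumerate(vor.point_region):
        region = vor.regions[region_index]
        if len(region) == 0 or -1 in region:
            continue  # unbounded cell: its area is not defined
        # Voronoi cells are convex; in 2-D, ConvexHull.volume is the area.
        area = ConvexHull(vor.vertices[region]).volume
        density[i] = 1.0 / (n * area)
    return density


# Synthetic data (an assumption): 500 uniform points in the unit square.
rng = np.random.default_rng(0)
pts = rng.random((500, 2))
dens = voronoi_density(pts)
print("points with a bounded cell:", int(np.sum(~np.isnan(dens))))
```

Cells on the outer edge are unbounded, so this sketch simply leaves them undefined; a real analysis would need a deliberate boundary treatment.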

Tags: data compression, Delaunay tessellation, density estimation, image processing, nonparametric, spatial statistics, van de Weygaert, van Lieshout, voronoi tessellation
Category: Algorithms, arXiv, Galaxies, Methods | Comment

Scatter plots and ANCOVA

Oct 15th, 2009| 06:46 pm | Posted by hlee

Astronomers rely on scatter plots to illustrate correlations and trends among many pairs of variables more than any other scientists^[1]. Pages of scatter plots with regression lines are common, in which the slope of the regression line and the error bars serve as indicators of the degree of correlation. Sometimes, seeing too many such scatter plots makes me think that, overall, the resources spent on drawing nice scatter plots, and the pages on which they are printed, are wasted. Why not just compute correlation coefficients and their errors, and publish the processed data used for computing the correlations rather than the full data, so that others can verify the results? A couple of scatter plots are fine, but when I see dozens of them, I lose my focus. This is another cultural difference. Continue reading ‘Scatter plots and ANCOVA’ »

[1] This is not an absolute statement but a personal impression formed after reading articles from various fields in addition to astronomy. My reading in other fields suggests that many rely on correlation statistics, and rely less on scatter plots with straight lines drawn through the data to impose relationships between pairs of variables.
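To make "compute correlation coefficients and their errors" concrete, here is a minimal sketch assuming NumPy. It reports the Pearson correlation coefficient with a bootstrap standard error. The bootstrap is only one possible choice of error estimate, chosen by me for illustration, and the function name and the synthetic data are hypothetical.

```python
import numpy as np


def correlation_with_bootstrap_error(x, y, n_boot=2000, seed=0):
    """Pearson r and its bootstrap standard error (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    r = np.corrcoef(x, y)[0, 1]
    n = len(x)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample (x, y) pairs with replacement
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    return r, boot.std(ddof=1)


# Synthetic data (an assumption): a linear trend plus Gaussian noise.
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(size=200)
r, r_err = correlation_with_bootstrap_error(x, y)
print(f"r = {r:.3f} +/- {r_err:.3f}")
```

One number and its error per variable pair takes far less space than a page of plots, although it only summarizes linear association and says nothing about outliers or curvature.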

Tags: ANCOVA, ANOVA, approximation, correlation, Gaussianity, graphics, MADS, modeling, nonparametric, parallel coordinates, PCA, quality, quantity, regression, scatter plots
Category: arXiv, Cross-Cultural, Fitting, Jargon, Methods, Stat, Uncertainty | Comment

Goodness-of-fit tests

Oct 6th, 2009| 01:49 pm | Posted by hlee

When it comes to measuring goodness-of-fit, the Pearson χ² test is the dominant player in the race and the Kolmogorov-Smirnov test statistic trails far behind. Although they seem almost invisible in this race, there are many more nonparametric statistics for testing goodness-of-fit and for comparing a sampling distribution to a reference distribution, all legitimate race participants trained by many statisticians. Listing their names is probably useful to astronomers who find that the underlying assumptions of the χ² test do not match their data. Perhaps some astronomers want to try nonparametric test statistics other than the K-S test. I’ve seen other test statistics in astronomical journals from time to time. Depending on the data and their statistical properties, one test statistic could work better than another; therefore, it’s worthwhile to keep in mind that there are other tests beyond the χ² goodness-of-fit test. Continue reading ‘Goodness-of-fit tests’ »
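For readers who want to try a few of these alternatives, here is a minimal sketch assuming SciPy 1.6 or later. It runs the Kolmogorov-Smirnov, Cramér-von Mises, and Anderson-Darling tests on the same sample. The sample (a shifted exponential with mean 0 and variance 1) and the normal reference distribution are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

# Synthetic sample (an assumption): skewed, but with mean 0 and variance 1.
rng = np.random.default_rng(42)
sample = rng.exponential(scale=1.0, size=300) - 1.0

# Kolmogorov-Smirnov and Cramer-von Mises against a fully specified N(0, 1).
ks = stats.kstest(sample, "norm")
cvm = stats.cramervonmises(sample, "norm")

# Anderson-Darling test for normality; note that SciPy's version estimates
# the mean and standard deviation from the data.
ad = stats.anderson(sample, dist="norm")

print("K-S:", ks.statistic, ks.pvalue)
print("Cramer-von Mises:", cvm.statistic, cvm.pvalue)
print("Anderson-Darling statistic:", ad.statistic)
```

The three statistics weight discrepancies differently (the Anderson-Darling statistic emphasizes the tails more than K-S does), which is one reason why one test can work better than another on a given data set.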

Tags: goodness-of-fit test, nonparametric, normality, qq plot, tests
Category: Frequentist, Methods, Stat | 1 Comment

[MADS] data depth

Jun 1st, 2009| 09:51 pm | Posted by hlee

How would you assign an order to multivariate data? If you have a strategy for this ordering task, I’d like to ask, “Is your strategy affine invariant?”, meaning, for example, invariant under shift and rotation. Continue reading ‘[MADS] data depth’ »
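To show what the affine invariance question means in practice, here is a minimal sketch assuming NumPy. It uses the Mahalanobis depth, 1 / (1 + squared Mahalanobis distance from the sample mean), which is just one of several depth functions and is my choice for illustration. The transformation `A`, the shift `b`, and the data are hypothetical.

```python
import numpy as np


def mahalanobis_depth(points):
    """Depth of each point: 1 / (1 + squared Mahalanobis distance to the mean)."""
    points = np.asarray(points, dtype=float)
    center = points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    diff = points - center
    d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)
    return 1.0 / (1.0 + d2)


rng = np.random.default_rng(1)
data = rng.normal(size=(50, 2))

# A hypothetical affine map: nonsingular linear part A plus a shift b.
A = np.array([[2.0, 1.0], [0.5, 3.0]])
b = np.array([10.0, -4.0])
transformed = data @ A.T + b

depth_before = mahalanobis_depth(data)
depth_after = mahalanobis_depth(transformed)
print("depths unchanged:", np.allclose(depth_before, depth_after))

# For contrast, a naive ordering by the first coordinate is generally
# not preserved under the same map.
naive_before = np.argsort(data[:, 0])
naive_after = np.argsort(transformed[:, 0])
print("naive order unchanged:", np.array_equal(naive_before, naive_after))
```

Because the depth values are unchanged, the center-outward ordering they induce is the same before and after the transformation.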

Tags: break points, data depth, MADS, mean, median, multivariate, nonparametric, order, parasite, quantile, robust, sort, vertebrate
Category: Algorithms, arXiv, Cross-Cultural, Jargon, Stat | Comment

Robust Statistics

May 18th, 2009| 12:18 pm | Posted by hlee

My understanding of “robustness” from my education in statistics and the understanding I get from communicating with astronomers hardly find common ground. Can anyone help me build a robust bridge over this abyss? Continue reading ‘Robust Statistics’ »

Tags: break point, Huber, nonparametric, robust, Rousseeuw, Tukey
Category: Bayesian, Frequentist, Jargon, MCMC, Methods, Quotes, Stat, Uncertainty | Comment

[MADS] Semiparametric

Feb 9th, 2009| 03:16 pm | Posted by hlee

There were (only) four articles in ADS whose abstracts contain the word semiparametric (none in titles). Therefore, semiparametric is not exactly [MADS] but almost [MADS]. One might call it virtually [MADS] or quasi [MADS]. By introducing the term and providing its rare examples in astronomy, I hope this scarce term semiparametric will be used adequately, rather than misguiding astronomers into inappropriate usage for statistical inference with their data. Continue reading ‘[MADS] Semiparametric’ »

Tags: MADS, MLE, nonparametric, semiparametric, tessellation, voronoi tessellation
Category: Cross-Cultural, Jargon, Methods, Stat | Comment

missing data

Oct 27th, 2008| 09:24 am | Posted by hlee

The notion of missing data differs overall between the two communities. I tend to think missing data carry as much information as observed data. Astronomers… I’m not sure how they think, but my impression so far is that when a value is missing in one attribute/variable of an object/observation/informant, all other attributes of that object become useless, because the object is excluded from scientific data analysis or model evaluation. For example, it is hard to find any discussion of imputation in astronomical publications, or any statistical justification of how missing data are treated with respect to inference strategies. Instead, they talk about incompleteness within different variables. To make this vague argument concrete, consider a catalog of multiple magnitudes. To draw a color-magnitude diagram, one needs both color and magnitude. If one attribute is missing, that star will not appear in the color-magnitude diagram, and any inference made from that diagram will not include that star. Nonetheless, one will try to understand how the proportion of stars observed varies with color and magnitude. Continue reading ‘missing data’ »
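The color-magnitude example can be sketched in a few lines, assuming NumPy. The toy catalog below (six stars, hypothetical B and V magnitudes, with NaN marking a missing value) is invented purely for illustration. It shows how a star drops out of the diagram as soon as either band is missing, even though its other magnitude was observed.

```python
import numpy as np

# Toy catalog (hypothetical values); NaN marks a missing magnitude.
b_mag = np.array([15.2, np.nan, 17.8, 16.1, np.nan, 18.3])
v_mag = np.array([14.6, 15.9, np.nan, 15.5, 16.7, 17.9])

# The color B - V is NaN whenever either band is missing.
color = b_mag - v_mag

# A star can be plotted only if both color and magnitude are available.
usable = ~np.isnan(color) & ~np.isnan(v_mag)

print("stars in catalog:", len(b_mag))
print("stars in the diagram:", int(usable.sum()))
print("missing in B:", int(np.isnan(b_mag).sum()))
print("missing in V:", int(np.isnan(v_mag).sum()))
```

Here half of the catalog never reaches the diagram, although every discarded star still has one observed magnitude; that partial information is what imputation methods try to use.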

Tags: bootstrap, catalog, Efron, estimator, ignorable, imputation, incompleteness, Little, MAR, MCAR, missing data, nonparametric, Rubin, Schafer, survey
Category: Astro, Cross-Cultural, Data Processing, Stat | 2 Comments

[ArXiv] 2nd week, Nov. 2007

Nov 9th, 2007| 12:45 pm | Posted by hlee

There should be at least one paper that draws your attention. Various statistics topics appeared in astro-ph this week.
Continue reading ‘[ArXiv] 2nd week, Nov. 2007’ »

Tags: Bayesian Mixture, Error Estimator, Function Estimation, nonparametric, Periodogram, Photometric Redshift, Spatial, Two-point Correlation, wavelet
Category: arXiv | Comment

[ArXiv] An unbiased estimator, May 29, 2007

Oct 30th, 2007| 03:37 am | Posted by hlee

From arxiv/astro-ph:0705.4199v1
In search of an unbiased temperature estimator for statistically poor X-ray spectra
A. Leccardi and S. Molendi

There was a delay in writing about this paper, which by accident was lying under a pile of papers irrelevant to astrostatistics. (It has been quite overwhelming to track papers from arxiv:astro-ph with various statistical applications and papers with room left for statistical improvement.) Although there is already a posting about this paper (see Vinay’s posting), I’d like to give it a shot. I was very excited because I hadn’t seen any astronomical paper devoted solely to unbiased estimators.
Continue reading ‘[ArXiv] An unbiased estimator, May 29, 2007’ »

Tags: chi-square, maximum likelihood, mixing distribution, mixture, nonparametric, robust, subsampling, transformation, unbiased, Uncertainty
Category: arXiv, Frequentist, Stat | Comment

Astrostatistics: Goodness-of-Fit and All That!

Aug 14th, 2007| 10:17 pm | Posted by hlee

During the International X-ray Summer School, as a project presentation, I tried to explain the inadequate practice of χ^2 statistics in astronomy. If your best fit is biased (any misidentification of a model easily causes such bias), do not use χ^2 statistics to get a 1σ error with a 68% chance of capturing the true parameter.

Later, I decided to investigate that subject further, and this paper came along: Astrostatistics: Goodness-of-Fit and All That! by Babu and Feigelson.
Continue reading ‘Astrostatistics: Goodness-of-Fit and All That!’ »

Tags: Anderson-Darling, Babu, best-fit, bias, bootstrap, chi-square, Cramer-von Mises, Feigelson, Kolmogorov-Smirnov, Kullback-Leibler distance, nonparametric, parametric, resampling
Category: Algorithms, arXiv, Astro, Fitting, High-Energy, Methods, Spectral, Stat | 7 Comments

[ArXiv] SDSS DR6, July 23, 2007

Jul 25th, 2007| 01:46 pm | Posted by hlee

From arxiv/astro-ph:0707.3413
The Sixth Data Release of the Sloan Digital Sky Survey by … many people …

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at http://www.sdss.org/dr6. Additionally, the Catalog Archive Server (CAS) and its SQL interface to the catalog would be useful to data-searching statisticians. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.
Continue reading ‘[ArXiv] SDSS DR6, July 23, 2007’ »

Tags: catalog, convex hull peeling, density estimation, DR6, massive data, multivariate analysis, nonparametric, SDSS, SQL, voronoi tessellation
Category: Algorithms, arXiv, Astro, Data Processing, Misc, Optical | 1 Comment
