The AstroStat Slog » voronoi tessellation
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders
Fri, 09 Sep 2011 17:05:33 +0000

[ArXiv] Voronoi Tessellations
http://hea-www.harvard.edu/AstroStat/slog/2009/arxiv-voronoi-tessellations/
Wed, 28 Oct 2009 14:29:24 +0000, by hlee

While exploring the spatial distribution of particles/objects, I did not want to approximate it with a Poisson or Gaussian process (parametric), nor to impose hypotheses such as homogeneity, isotropy, or uniformity, so various nonparametric methods drew my attention for data exploration and preliminary analysis. Among them, the one that I fell in love with is tessellation (state space approaches are excluded here). In terms of computational speed, I believe tessellation is faster than kernel density estimation for estimating level sets of multivariate data. Furthermore, constructing polygons from a tessellation is conceptually simple and intuitive. However, coding and improving the algorithms is beyond statistical research (check books whose titles or keywords include computational geometry). The good news is that, for computing and getting results, there are freely available software packages and modules in various forms.

As a part of introducing nonparametric statistics, I wanted to write about applications of computational geometry from the perspective of nonparametric density estimation in two or three dimensions. Also, the following article came along when I had just begun to collect statistical applications in astronomy (my [ArXiv] series). This [arXiv] paper, in fact, prompted me to investigate Voronoi tessellations in astronomy in general.

[arxiv/astro-ph:0707.2877]
Voronoi Tessellations and the Cosmic Web: Spatial Patterns and Clustering across the Universe
by Rien van de Weygaert

Since then, quite some time has passed. In the meantime, I found more publications in astronomy that specifically use tessellation as a main tool for nonparametric density estimation and data analysis. Nonetheless, topics in spatial statistics tend to go unrecognized or are almost ignored when astronomical spatial data (I mean data points with coordinate information) are analyzed. Many analyses seem to use statistics only partially or not at all. Some might want to know how often Voronoi tessellation is applied in astronomy. I listed the results of my ADS search restricted to tessellation in title keywords.

Then the topic was forgotten for a while, until this recent [arXiv] paper reminded me of my old intention to introduce tessellation for density estimation and for understanding large scale structures or clusters (astronomers’ jargon, not the term in machine or statistical learning).

[arxiv:stat.ME:0910.1473] Moment Analysis of the Delaunay Tessellation Field Estimator
by M.N.M van Lieshout

Looking at the plots in the papers by van de Weygaert or van Lieshout, one can immediately understand what Voronoi and Delaunay tessellations are, without mathematical jargon and abstraction. (A Delaunay tessellation is also called a Delaunay triangulation (wiki). Perhaps you want to check out wiki:Delaunay Tessellation Field Estimator as well.) Voronoi tessellations have been adopted in many scientific and engineering fields to describe spatial distributions, and astronomy is no exception. Voronoi tessellation has also been used for field interpolation.
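For readers who prefer code to plots, here is a minimal sketch of how a Voronoi cell can be built and turned into a crude density estimate (one over the cell area). It is not taken from either paper: the function names (`clip_halfplane`, `voronoi_cell`, `voronoi_density`) are made up for illustration, the points are assumed to be distinct and to lie inside a rectangular bounding box, and the brute-force clipping costs O(n^2), so it only suits small samples. Real analyses would rely on a computational geometry package.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def clip_halfplane(poly: List[Point], a: float, b: float, c: float) -> List[Point]:
    """Keep the part of a convex polygon where a*x + b*y <= c."""
    out: List[Point] = []
    n = len(poly)
    for k in range(n):
        p, q = poly[k], poly[(k + 1) % n]
        fp = a * p[0] + b * p[1] - c
        fq = a * q[0] + b * q[1] - c
        if fp <= 0:
            out.append(p)  # p is inside (or on) the half-plane
        if (fp < 0 < fq) or (fq < 0 < fp):
            t = fp / (fp - fq)  # edge crosses the boundary line
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out


def voronoi_cell(i: int, points: List[Point],
                 box: Tuple[float, float, float, float]) -> List[Point]:
    """Voronoi cell of points[i], clipped to box = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    cell = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    xi, yi = points[i]
    for j, (xj, yj) in enumerate(points):
        if j == i:
            continue
        # |x - p_i|^2 <= |x - p_j|^2  <=>  2 x.(p_j - p_i) <= |p_j|^2 - |p_i|^2
        cell = clip_halfplane(cell, 2 * (xj - xi), 2 * (yj - yi),
                              xj * xj + yj * yj - xi * xi - yi * yi)
        if not cell:
            break
    return cell


def polygon_area(poly: List[Point]) -> float:
    """Area of a simple polygon by the shoelace formula."""
    s = 0.0
    for k in range(len(poly)):
        x1, y1 = poly[k]
        x2, y2 = poly[(k + 1) % len(poly)]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def voronoi_density(points: List[Point],
                    box: Tuple[float, float, float, float]) -> List[float]:
    """Crude density estimate at each point: 1 / area of its Voronoi cell."""
    return [1.0 / polygon_area(voronoi_cell(i, points, box))
            for i in range(len(points))]
```

Calling `voronoi_density(points, (xmin, ymin, xmax, ymax))` returns one value per point. Points in crowded regions get small cells and therefore high density values, which is the intuition behind using tessellations for density estimation.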

van de Weygaert described Voronoi tessellations as follows:

  1. the asymptotic frame for the ultimate matter distribution,
  2. the skeleton of the cosmic matter distribution,
  3. a versatile and flexible mathematical model for weblike spatial pattern, and
  4. a natural asymptotic result of an evolution in which low-density expanding void regions dictate the spatial organization of the Megaparsec universe, while matter assembles in high-density filamentary and wall-like interstices between the voids.

van Lieshout derived explicit expressions for the mean and variance of the Delaunay Tessellation Field Estimator (DTFE) and showed that, for stationary Poisson processes, the DTFE is asymptotically unbiased with a variance that is proportional to the squared intensity.
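To make the estimator itself concrete, here is a one-dimensional sketch. It assumes the usual DTFE definition: the density at a data point is (d + 1) divided by the volume of the union of Delaunay simplices that share that point, and values in between are linearly interpolated. In one dimension the Delaunay simplices are simply the intervals between consecutive sorted points. The function names are hypothetical, the points are assumed to be distinct, and nothing here reproduces the moment calculations of the paper.

```python
from bisect import bisect_right
from typing import List, Optional


def vertex_density(xs: List[float], k: int, d: int = 1) -> float:
    """DTFE value at the data point xs[k] (xs sorted, k not at either end).

    The contiguous cell of xs[k] is the union of the two Delaunay
    intervals touching it, i.e. [xs[k-1], xs[k+1]].
    """
    return (d + 1) / (xs[k + 1] - xs[k - 1])


def dtfe_1d(points: List[float], x: float) -> Optional[float]:
    """Linearly interpolated 1-D DTFE at location x.

    Returns None when x is so close to the edge of the sample that one of
    the neighbouring intervals needed for the vertex densities is missing.
    """
    xs = sorted(points)
    k = bisect_right(xs, x) - 1  # xs[k] <= x < xs[k+1]
    if k < 1 or k + 2 > len(xs) - 1:
        return None
    left = vertex_density(xs, k)
    right = vertex_density(xs, k + 1)
    t = (x - xs[k]) / (xs[k + 1] - xs[k])
    return (1 - t) * left + t * right
```

For equally spaced points with unit spacing, `dtfe_1d` returns 1.0 at any interior location, as one would hope. The two- and three-dimensional versions follow the same recipe with triangles or tetrahedra in place of intervals.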

We’ve observed voids and filaments of cosmic matter whose patterns are not yet explained by theory. In general, those patterns are manifested via observed galaxies, both directly and indirectly. Individual observed objects, I believe, can be matched to the points that generate Voronoi polygons. Each object represents one polygon, and investigating the distributional properties of the polygons helps to understand the formation rules and theories behind those patterns. For that matter, various topics in stochastic geometry, not just Voronoi tessellation, can probably be adopted.

There is a plethora of information available on Voronoi tessellation, such as the website of the International Symposium on Voronoi Diagrams in Science and Engineering. Two recent meeting websites are ISVD09 and ISVD08. Also, the following review paper is interesting.

Centroidal Voronoi Tessellations: Applications and Algorithms (1999) Du, Faber, and Gunzburger in SIAM Review, vol. 41(4), pp. 637-676

By the way, you may have noticed my preference for Voronoi tessellation over Delaunay. It comes from the property that each observation generates exactly one Voronoi cell (and, in a centroidal Voronoi tessellation, is also the center of that cell), whereas in a Delaunay triangulation multiple simplices are associated with one observation/point. However, from the perspective of understanding the distribution of observations as a whole, both approaches offer summaries and insights in a nonparametric fashion, which is what I value most.

[MADS] Semiparametric
http://hea-www.harvard.edu/AstroStat/slog/2009/mads-semiparametric/
Mon, 09 Feb 2009 19:16:05 +0000, by hlee

There were (only) four articles in ADS whose abstracts contain the word semiparametric (none in titles). Therefore, semiparametric is not exactly [MADS] but almost [MADS]. One would like to say it is virtually [MADS] or quasi [MADS]. By introducing the term and providing these rare examples from astronomy, I hope the rare term semiparametric will be used properly, rather than misleading astronomers into inappropriate statistical inference with their data.

  • [2006MNRAS.369.1334S]: semiparametric technique based on a maximum likelihood (ML) approach and Voronoi tessellation (VT). Besides, I wonder whether the cluster detection algorithm in Section 3.3 works similarly to a source detection algorithm in high energy astrophysics, if tight photon clusters indicate sources. By the way, what is the definition of a source? Depending on the definition, the right thresholds for detection would change; however, it seems that (brute force) Monte Carlo simulations, i.e. empirical approaches, are employed for setting thresholds. Please note that these questions are not directed at this paper, which I enjoyed reading very much.
  • [2004MNRAS.347.1241S]: similar to the above because of the same methodology, ML, VT, and color slide/filter for cluster detection
  • [2002AJ....123.1807G]: cut and enhance (CE) cluster detection method. From the abstract: The method is semiparametric, since it uses minimal assumptions about cluster properties in order to minimize possible biases. No assumptions are made about the shape of clusters, their radial profile, or their luminosity function. Judging from their methodology description, however, I wish they had used the word nonparametric, which seems more proper in a statistical sense than semiparametric.
  • [2002A&A...383.1100N]: statistics related keywords: time series; discrete Fourier transform; long range dependence; log-periodogram regression; ordinary least squares; generalized least squares. The semiparametric method section seems too short. Detailed accounts are replaced by references to papers from the Annals of Statistics. Among 31 references, 15 are from statistics journals, and without reading them average readers will not have a chance to understand the semiparametric approach.

You might want to check out wiki:Semiparametric about semiparametric (model) from the statistics standpoint.

The following books, which I checked out from libraries some years back, are related to semiparametric methods, and you could get more information about semiparametric statistics from them. Unfortunately, the applications and examples in these books rely heavily on subjects such as public health (epidemiology), bioinformatics, and econometrics.

  • Ruppert, Wand, and Carroll (2003) Semiparametric Regression, Cambridge University Press
  • Härdle, Müller, Sperlich, and Werwatz (2004) Nonparametric and Semiparametric Models, Springer
  • Horowitz (1998) Semiparametric Methods in Econometrics (Lecture Notes in Statistics), Springer

There seem to be more recent publications from 2007 and 2008 about semiparametric methods, targeting diverse but focused readers, but I have had no opportunity to look at them. I just want to point out that on many occasions full parametrization of a model is not necessary, yet the nuisance parameters still determine the shape of the sampling distribution needed for accurate statistical inference. The semiparametric methods described in the above papers are very limited from a statistical viewpoint. Astronomers can take far more advantage of various semiparametric strategies. There is plenty of room for developing semiparametric approaches to various astronomical data analyses and to inference about the parameters of interest. It is almost unexplored.

[ArXiv] 2nd week, Dec. 2007
http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-2nd-week-dec-2007/
Fri, 14 Dec 2007 21:16:47 +0000, by hlee

No shortage of papers~

  • [astro-ph:0712.1038]
    Extended Anomalous Foreground Emission in the WMAP 3-Year Data G. Dobler and D. P. Finkbeiner

  • [astro-ph:0712.1217]
    Generalized statistical models of voids and hierarchical structure in cosmology A. Z. Mekjian

  • [astro-ph:0712.1155]
    The colour-lightcurve shape relation of Type Ia supernovae and the reddening law S. Nobili and A. Goobar

  • [astro-ph:0712.1297]
    The Structure of the Local Supercluster of Galaxies Revealed by the Three-Dimensional Voronoi’s Tessellation Method O. V. Melnyk, A. A. Elyiv, and I. B. Vavilova

  • [astro-ph:0712.1594]
    Photometric Redshifts with Surface Brightness Priors H. F. Stabenau, A. Connolly and B. Jain

  • [stat.ME:0712.1663]
    Efficient Blind Search: Optimal Power of Detection under Computational Cost Constraints N. Meinshausen, P. Bickel and J. Rice

  • [astro-ph:0712.1917]
    Are solar cycles predictable? M. Schuessler

Voronoi tessellation for nonparametric density estimation (mass distribution in the universe) interests me very much. If you are working on the topic, would you kindly share useful information or write your thoughts on the subject here?

[ArXiv] SDSS DR6, July 23, 2007
http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-sdss-dr6/
Wed, 25 Jul 2007 17:46:38 +0000, by hlee

From arxiv/astro-ph:0707.3413
The Sixth Data Release of the Sloan Digital Sky Survey by … many people …

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at http://www.sdss.org/dr6. Additionally, the Catalog Archive Service (CAS) and its SQL interface to the catalog would be useful to statisticians searching for data. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.

Part of my dissertation was about creating nonparametric multivariate analysis tools based on convex hull peeling, and I applied those tools to SDSS DR4 to explore celestial objects in the multidimensional color space without projections (dimension reduction). SDSS CAS might fulfill the needs of those who are looking for data sets to conduct

  • massive multivariate data analysis,
  • streaming data analysis (strictly speaking, SDSS is not streaming, but the database is updated yearly with new observations and, depending on memory, streaming data analysis can easily be simulated) and
  • application of their new machine learning and statistical multivariate analysis tools for new discoveries.
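For readers unfamiliar with the convex hull peeling mentioned above, here is a minimal two-dimensional sketch of the idea: compute the convex hull, assign its vertices to layer 1, remove them, and repeat on what is left. This is an illustration only and not the dissertation code. The function names are hypothetical, the hull is computed with the standard monotone chain method, and points lying exactly on a hull edge (but not at a vertex) are left for a later layer.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Vertices of the convex hull (monotone chain), counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def hull_peeling_depth(points: List[Point]) -> Dict[Point, int]:
    """Map each distinct point to its peeling layer (1 = outermost hull)."""
    remaining = set(points)
    depth: Dict[Point, int] = {}
    layer = 1
    while remaining:
        hull = convex_hull(list(remaining))
        for p in hull:
            depth[p] = layer
        remaining -= set(hull)  # peel off the current hull vertices
        layer += 1
    return depth
```

Points with a large layer number sit deep inside the data cloud, so the layers give a nonparametric, center-outward ordering of multivariate data. In a color space with more than two dimensions the same loop applies, but the hull computation needs a general-dimension algorithm.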

In particular, thanks to the survey of the whole northern hemisphere, interesting spatial statistics can be developed, such as Voronoi tessellation for spatial density estimation. The survey also provides a vast image reservoir as well as a catalog of massive multivariate spatial data.

Oh, by the way, the paper discusses changes and improvements in the recent data release. SDSS DR6 includes the complete imaging of the Northern Galactic Cap and contains images and parameters of 287 million objects over 9583 deg^2, and 1.27 million spectra over 7425 deg^2. The photometric calibration has improved, with uncertainties of 1% in g, r, i and 2% in u, significantly better than in previous data releases. The method of spectrophotometric calibration has changed, resulting in a spectrophotometric scale that is 0.35 mag brighter. Two independent codes for spectral classifications and redshifts are available as well.
