Archive for the ‘arXiv’ Category.
Dec 27th, 2009| 10:13 pm | Posted by hlee
I am often irked when I see a function normalized over a feasible parameter space and then used as a probability density function (pdf) for further statistical inference. To be a suitable pdf, the function has to be normalized over a measurable space, not over a feasible space. Such practice often yields biased best fits (biased estimators) and improper error bars. On the other hand, validating a measurable space under physics seems complicated. To be precise, we often get lost in translation. Continue reading ‘A short note on Probability for astronomers’ »
Tags:
axiom,
curriculum,
education,
google university,
hope,
measurable,
probability Category:
Algorithms,
arXiv,
Cross-Cultural,
Jargon,
Methods,
Quotes,
Stat,
Uncertainty |
Comment
Dec 22nd, 2009| 09:13 pm | Posted by hlee
Because of blogging and the projects I have worked on, I happened to collect quite a few publications in astronomy. The collection is biased toward my personal interests. However, these authors discuss a wide range of statistics, so I felt that my astronomical bibliography could be useful to the slog audience. Some areas may match your interests, or you may find your own name. Continue reading ‘astronomy bibliography’ »
Dec 20th, 2009| 07:27 pm | Posted by hlee
Tags:
assumptions,
bulletin,
IMS,
maximum likelihood,
MLE,
T. Speed Category:
arXiv,
Cross-Cultural,
Fitting,
Frequentist,
Methods,
Stat |
1 Comment
Dec 10th, 2009| 04:18 pm | Posted by hlee
When I began subscribing to arXiv/astro-ph and arXiv/stat, I listed (although only for a year) astro-ph papers featuring relatively advanced statistics. I also kept papers relevant to astrostatistics from beyond astro-ph, as well as papers introducing hot topics in statistics and computer science applicable to astronomical data. While creating my own arXiv list as follows, I hoped to write short introductions to statistics that are unlikely to be known to most astronomers (like my MADS) and to match them with subjects/targets in astronomy. I thought such an effort could spawn new collaborations or expand the understanding of statistics among astronomers (see Magic Crystal). Well, I couldn’t keep up with the growth rate, and it’s about time to give up that hope. However, I thought some of the papers could be useful to some slog subscribers. I hope they are. Continue reading ‘arxiv list’ »
Dec 7th, 2009| 11:46 pm | Posted by hlee
He was one of the most frequently cited statisticians in this slog because of his influence on statistics. It is extremely difficult to avoid his textbooks and his foundational work in theoretical statistics when one begins to comprehend and appreciate modern theoretical statistics. To me, Testing Statistical Hypotheses and Theory of Point Estimation are two pillars of graduate statistical education. In addition, Elements of Large Sample Theory and Nonparametrics: Statistical Methods Based on Ranks are also eye openers. Continue reading ‘Erich Lehmann’ »
Nov 21st, 2009| 05:06 am | Posted by hlee
by Emanuel Parzen in Statistical Science 2004, Vol 19(4), pp.652-662 JSTOR
I teach that statistics (done the quantile way) can be simultaneously frequentist and Bayesian, confidence intervals and credible intervals, parametric and nonparametric, continuous and discrete data. My first step in data modeling is identification of parametric models; if they do not fit, we provide nonparametric models for fitting and simulating the data. The practice of statistics, and the modeling (mining) of data, can be elegant and provide intellectual and sensual pleasure. Fitting distributions to data is an important industry in which statisticians are not yet vendors. We believe that unifications of statistical methods can enable us to advertise, “What is your question? Statisticians have answers!”
I couldn’t help liking this paragraph because of its bitter-sweetness. I hope you appreciate it as much as I did.
Oct 28th, 2009| 09:29 am | Posted by hlee
In exploring the spatial distribution of particles/objects without approximating it via a Poisson process or a Gaussian process (parametric), and without imposing hypotheses such as homogeneity, isotropy, or uniformity, various nonparametric methods drew my attention for data exploration and preliminary analysis. Among them, the one that I fell in love with is tessellation (state space approaches are excluded here). In terms of computational speed, I believe tessellation is faster than kernel density estimation for estimating level sets of multivariate data. Furthermore, constructing polygons from a tessellation is conceptually simple. However, coding and improving the algorithms is beyond statistical research (check books whose titles or keywords include computational geometry). The good news is that, for computing and getting results, there are freely available software, packages, and modules in various forms. Continue reading ‘[ArXiv] Voronoi Tessellations’ »
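To make the idea of tessellation-based density estimation concrete, here is a minimal sketch, not the method of any paper in the post. It assumes SciPy is available and uses its `scipy.spatial.Voronoi` and `ConvexHull` classes. The function name `voronoi_density`, the toy uniform data, and the choice of 1/area as the density estimate at each point are my own assumptions for illustration.

```python
import numpy as np
from scipy.spatial import ConvexHull, Voronoi


def voronoi_density(points):
    """Return 1/area for each point whose Voronoi cell is bounded, NaN otherwise."""
    vor = Voronoi(points)
    density = np.full(len(points), np.nan)
    for i, region_index in enumerate(vor.point_region):
        region = vor.regions[region_index]
        # A -1 entry marks a vertex at infinity, i.e. an unbounded cell.
        if len(region) < 3 or -1 in region:
            continue
        # Voronoi cells are convex, so the hull of the cell's vertices is the
        # cell itself; in 2-D, ConvexHull.volume is the enclosed area.
        area = ConvexHull(vor.vertices[region]).volume
        density[i] = 1.0 / area
    return density


rng = np.random.default_rng(0)
points = rng.uniform(0.0, 1.0, size=(200, 2))  # toy data, not from the post
density = voronoi_density(points)
bounded = ~np.isnan(density)
print(bounded.sum(), "bounded cells; median density:", np.median(density[bounded]))
```

Cells on the edge of the point cloud are unbounded, so the sketch simply leaves them as NaN; a real analysis would need a boundary treatment.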
Tags:
data compression,
delanay tessellation,
density estimation,
image processing,
nonparametric,
spatial statistics,
van de Weygaert,
van Lieshout,
voronoi tessellation Category:
Algorithms,
arXiv,
Galaxies,
Methods |
Comment
Oct 22nd, 2009| 07:08 pm | Posted by hlee
[arXiv:stat.ME:0910.2585]
Variable Selection and Updating In Model-Based Discriminant Analysis for High Dimensional Data with Food Authenticity Applications
by Murphy, Dean, and Raftery
Classifying or clustering spectra (or applying semi-supervised learning to them) is a very challenging problem, from collecting statistical-analysis-ready data to reducing the dimensionality without sacrificing the complex information in each spectrum. The challenge is not only how to estimate spiky (non-differentiable) curves via statistically well-defined procedures of estimating equations, but also how to transform the data so that they satisfy the regularity conditions in statistics.
Continue reading ‘[ArXiv] classifying spectra’ »
Tags:
BIC,
Classification,
clustering,
cross-validation,
curse of dimensionality,
discriminant analysis,
graphical model,
mclust,
model based,
semi-supervised learning,
statistical learning,
variable selection Category:
Algorithms,
arXiv,
Cross-Cultural,
Data Processing,
Jargon,
Methods,
Spectral,
Stat |
Comment
Oct 15th, 2009| 06:46 pm | Posted by hlee
Astronomers rely on scatter plots to illustrate correlations and trends among many pairs of variables more than any other scientists. Pages of scatter plots with regression lines are common, in which the slope of the regression line and the error bars serve as indicators of the degree of correlation. Sometimes, too many such scatter plots make me think that, overall, the resources for drawing nice scatter plots and the papers in which those plots are printed are wasted. Why not just compute correlation coefficients and their errors, and publish the processed data used for computing the correlations, not the full data, so that others can verify the results? A couple of scatter plots are fine, but when I see dozens of them, I lose my focus. This is another cultural difference. Continue reading ‘Scatter plots and ANCOVA’ »
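As a rough illustration of reporting a correlation coefficient together with its error instead of a page of plots, the sketch below computes Pearson's r and an approximate 95% interval from the Fisher z-transform. The function name `pearson_with_error` and the toy data are made up for this example, and the interval assumes roughly bivariate normal pairs, which should be checked before trusting it.

```python
import math


def pearson_with_error(x, y):
    """Return (r, lower, upper): Pearson r and an approximate 95% interval."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    # Fisher z-transform: atanh(r) is roughly normal with sd 1/sqrt(n - 3)
    # when the pairs are (approximately) bivariate normal.
    z = math.atanh(r)
    half_width = 1.96 / math.sqrt(n - 3)
    return r, math.tanh(z - half_width), math.tanh(z + half_width)


# Toy data, not taken from any paper.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.2, 1.9, 3.4, 3.8, 5.1, 5.7, 7.3, 7.9]
r, lo, hi = pearson_with_error(x, y)
print(f"r = {r:.3f}, 95% interval ({lo:.3f}, {hi:.3f})")
```

For data that are clearly non-Gaussian, a rank correlation or a bootstrap interval may be a safer choice than this normal approximation.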
Tags:
ANCOVA,
ANOVA,
approximation,
correlation,
Gaussianity,
graphics,
MADS,
modeling,
nonparametric,
parallel coordinates,
PCA,
quality,
quantity,
regression,
scatter plots Category:
arXiv,
Cross-Cultural,
Fitting,
Jargon,
Methods,
Stat,
Uncertainty |
Comment
Oct 13th, 2009| 03:15 pm | Posted by hlee
Although some time has passed since my post on space weather, which noted that logistic regression is used for prediction, it still seems true that logistic regression is rarely used in astronomy. Otherwise, it may have been used for a similar purpose, not under the same statistical jargon but under Bayesian modeling procedures. Continue reading ‘[MADS] logistic regression’ »
Oct 1st, 2009| 10:18 pm | Posted by hlee
I decided a while ago to discuss the Kalman filter for the slog after finding out that this popular methodology is rather underrepresented in astronomy. However, it is not completely missing from ADS. I see that the full-text search and the all-bibliographic-sources search show more results. Their use of the Kalman filter, though, looked similar to the usage of “genetic algorithms” or “Bayes theorem.” Probably, the broad notion of the Kalman filter makes it difficult for me to find Kalman filter applications by name in astronomy, since wheels are often reinvented (algorithms under different names have the same objective). Continue reading ‘[MADS] Kalman Filter’ »
Tags:
Cressie,
inference,
Kalman filter,
kriging,
MADS,
spatial statistics Category:
arXiv,
Astro,
Cross-Cultural,
Data Processing,
Imaging,
Jargon |
Comment
Sep 22nd, 2009| 12:03 pm | Posted by hlee
Thanks to a Korean solar physicist, I was able to gather the following websites and some relevant information on Space Weather Forecast in action, not limited to the literature or toy data.
Continue reading ‘More on Space Weather’ »
Tags:
automatic,
CME,
computer vision,
data mining,
feature detection,
filament,
image processing,
machine learning,
manifold,
space weather,
statistical learning,
sunspot,
SVM Category:
Algorithms,
arXiv,
Cross-Cultural,
Data Processing,
Imaging,
Jargon |
Comment
Sep 8th, 2009| 10:17 am | Posted by hlee
I happened to observe a surge of principal component analysis (PCA) and independent component analysis (ICA) applications in astronomy. PCA and ICA are used for separating mixed components under some assumptions. For PCA, the decomposition relies on the assumption that the original sources are orthogonal (uncorrelated) and that the mixed observations are approximated by a multivariate normal distribution. For ICA, the assumption is that the sources are independent and not Gaussian (it allows one source component to be Gaussian, though). Such assumptions allow one to define dissimilarity measures, and the algorithms work toward maximizing them. Continue reading ‘[ArXiv] component separation methods’ »
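The orthogonality assumption behind PCA can be seen in a small sketch like the one below, which computes PCA through a singular value decomposition in NumPy. The helper `pca`, the toy Gaussian sources, and the mixing matrix are hypothetical choices for illustration only; ICA is not shown.

```python
import numpy as np


def pca(data):
    """Return (scores, components, variances) from an SVD of centered data."""
    centered = data - data.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u * s                      # projections onto the components
    variances = s**2 / (len(data) - 1)  # variance carried by each component
    return scores, vt, variances


rng = np.random.default_rng(1)
sources = rng.normal(size=(500, 2)) * [3.0, 0.5]   # toy uncorrelated sources
mixing = np.array([[0.8, 0.6], [-0.6, 0.8]])       # hypothetical orthogonal mixing
observed = sources @ mixing
scores, components, variances = pca(observed)
# The recovered scores are uncorrelated: off-diagonal terms vanish up to rounding.
print(np.round(np.cov(scores, rowvar=False), 6))
```

The recovered components are orthonormal by construction, which is exactly the assumption stated above. If the true mixing is not orthogonal or the sources are not Gaussian, PCA still returns uncorrelated scores, but they need not match the original sources.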
Sep 1st, 2009| 07:43 pm | Posted by hlee
[arxiv:0906.3662] The Statistical Analysis of fMRI Data by Martin A. Lindquist
Statistical Science, Vol. 23(4), pp. 439-464
This review paper offers some information and guidance on statistical image analysis for fMRI data that can be extended to astronomical image data. I think that fMRI data pose challenges similar to those of astronomical images. As Lindquist said, collaboration helps to find shortcuts. I hope that introducing this paper helps further networking and collaboration between statisticians and astronomers.
List of similarities Continue reading ‘[ArXiv] Statistical Analysis of fMRI Data’ »
Tags:
data aquisition,
experimental design,
fMRI,
ICA,
image analysis,
image processing,
localization,
modeling,
pipeline,
preprocessing,
similarities,
Spatial,
temporal,
time series,
voxel Category:
arXiv,
Cross-Cultural,
Data Processing,
Imaging,
Jargon,
Methods,
Stat |
Comment
Aug 25th, 2009| 09:19 pm | Posted by hlee
Kriging is the first thing that one learns in a spatial statistics course. On seeing its definition and application, almost every astronomer will say, “Oh, I know this! It is like the 2pt correlation function!!” At least this was my first impression when I first met kriging.
There are three distinctive subjects in spatial statistics: geostatistics, lattice data analysis, and spatial point pattern analysis. Because of the resemblance between the spatial distribution of observations in coordinates and the notion of spatially random points, spatial statistics in astronomy has leaned more toward spatial point pattern analysis than toward the other subjects. In other fields, from immunology to forestry to geology, whose data are associated with spatial coordinates of underlying geometric structures or were sampled from lattices, observations depend on these spatial structures, and scientists enjoy various applications of geostatistics and lattice data analysis. In particular, kriging is the fundamental notion in geostatistics, and its applications are found in many fields. Continue reading ‘[MADS] Kriging’ »
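To show why kriging looks familiar to anyone who knows the two-point correlation function, here is a minimal sketch of simple kriging in one dimension: the prediction is a weighted average of the observations, with weights determined by a covariance function. The exponential covariance, its parameters, the known zero mean, and the toy data are all assumptions for illustration; in practice the covariance (or variogram) is estimated from the data.

```python
import numpy as np


def exp_cov(a, b, sill=1.0, length=1.0):
    """Exponential covariance between 1-D locations a and b (an assumed model)."""
    return sill * np.exp(-np.abs(a[:, None] - b[None, :]) / length)


def simple_krige(x_obs, z_obs, x_new, mean=0.0):
    """Simple kriging predictions and variances at x_new, with a known mean."""
    k_obs = exp_cov(x_obs, x_obs)
    k_new = exp_cov(x_obs, x_new)
    weights = np.linalg.solve(k_obs, k_new)          # one column per new location
    pred = mean + weights.T @ (z_obs - mean)
    var = exp_cov(x_new, x_new).diagonal() - np.sum(weights * k_new, axis=0)
    return pred, var


x_obs = np.array([0.0, 1.0, 2.5, 4.0])     # toy locations, not from the post
z_obs = np.array([0.3, -0.1, 0.8, 0.2])    # toy measurements
x_new = np.array([1.0, 1.7, 3.0])
pred, var = simple_krige(x_obs, z_obs, x_new)
print(pred, var)
```

Without a nugget (measurement noise) term, the predictor interpolates exactly: at the new location 1.0, which coincides with an observed location, the prediction reproduces the observed value and the kriging variance drops to zero.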
Tags:
BLUP,
book,
books,
CMB,
Cressie,
Diggle,
geostatistics,
hierarchical model,
kriging,
MADS,
point pattern analysis,
sparse,
spatial statistics,
Stein,
WMAP Category:
arXiv,
Astro,
Imaging,
Jargon,
Methods,
Stat |
Comment