#### [ArXiv] 3rd week, Apr. 2008

Outliers present a dichotomy: they are detected either to be discarded or to be investigated, and a statistic is either robust enough not to be influenced by outliers or sensitive enough to flag an anomaly in the data distribution. Although not related, one paper about outliers made me dwell on what outliers are.

#### Lomb-Scargle periodograms in bioinformatics

A statistical method developed by insightful and brilliant astronomers is used in bioinformatics:
Detecting periodic patterns in unevenly spaced gene expression time series using Lomb–Scargle periodograms
by Glynn, Chen, & Mushegian [Click for R code and relevant information] [Paper archive at Bioinformatics]

The conclusion clearly states the strengths of the Lomb-Scargle periodogram:

The Lomb-Scargle periodogram algorithm is an effective tool for finding periodic gene expression profiles in microarray data, especially when data may be collected at arbitrary time points or when a significant proportion of data is missing.
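As a rough illustration of why the method suits unevenly spaced data, here is a minimal pure-Python sketch of the classical normalized Lomb-Scargle periodogram. It is not the R code from the paper: the function name `lomb_scargle`, the simulated 8-hour signal, the noise level, and the grid of trial periods are all assumptions made for this example.

```python
import math
import random


def lomb_scargle(times, values, angular_freqs):
    """Classical normalized Lomb-Scargle power at each angular frequency.

    Sketch only: assumes at least two distinct sample times and
    strictly positive angular frequencies.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    centered = [v - mean for v in values]

    power = []
    for w in angular_freqs:
        # Time offset tau makes the sine and cosine terms orthogonal
        # on the actual (uneven) sample times.
        s2 = sum(math.sin(2.0 * w * t) for t in times)
        c2 = sum(math.cos(2.0 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2.0 * w)

        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]

        yc = sum(y * c for y, c in zip(centered, cos_terms))
        ys = sum(y * s for y, s in zip(centered, sin_terms))
        cc = sum(c * c for c in cos_terms)
        ss = sum(s * s for s in sin_terms)

        power.append((yc * yc / cc + ys * ys / ss) / (2.0 * variance))
    return power


# Hypothetical data: a noisy sinusoid sampled at 40 random times over 48 hours.
TRUE_PERIOD = 8.0
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 48.0) for _ in range(40))
values = [math.sin(2.0 * math.pi * t / TRUE_PERIOD) + rng.gauss(0.0, 0.2)
          for t in times]

# Trial periods from 2 to 24 hours in steps of 0.1 hour.
periods = [2.0 + 0.1 * k for k in range(221)]
power = lomb_scargle(times, values, [2.0 * math.pi / p for p in periods])

best_period = periods[power.index(max(power))]
print(f"strongest trial period: {best_period:.1f} hours")
```

Because the sample times enter the sums directly, nothing has to be interpolated onto a regular grid, and a missing observation simply drops out of the sums.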

My personal wish is that data-driven statistical methods developed by hands-on scientists (and their statistical collaborators) will be used in other disciplines, because I believe data sets are likely to share the unknown truth of our one universe.

#### [ArXiv] 2nd week, Apr. 2008

Markov chain Monte Carlo has become the most frequent and stable statistical application in astronomy. It would be useful to collect tutorials from both professions.

#### [ArXiv] use of the median

The breakdown point of the mean is asymptotically zero, whereas the breakdown point of the median is 1/2. The breakdown point measures the robustness of an estimator, and its value is at most 1/2. In the presence of outliers, the mean cannot be a good measure of the central location of the data distribution, whereas the median is likely to locate the center. Common plug-in estimators such as the mean and the root mean square error may not provide the best fits and uncertainties because of this zero breakdown point of the mean. The efficiency of the mean as an estimator does not guarantee its unbiasedness; therefore, a bit of care is needed before plugging the data into these estimators to get the best fit and uncertainty. There was a preprint on [arXiv] about the use of the median last week. Continue reading ‘[ArXiv] use of the median’ »
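A small numerical sketch can make the breakdown point concrete. The nine measurements, the helper `contaminate`, and the outlier value of 1e6 are made up for illustration; only the standard `statistics` module is used.

```python
import statistics

# Hypothetical clean measurements scattered around 10.
clean = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 10.1]


def contaminate(data, n_bad, bad_value=1e6):
    """Replace the first n_bad values with a gross outlier (illustrative helper)."""
    return [bad_value] * n_bad + data[n_bad:]


for n_bad in (0, 1, 4, 5):
    sample = contaminate(clean, n_bad)
    print(
        f"{n_bad} of {len(sample)} corrupted: "
        f"mean = {statistics.mean(sample):.4g}, "
        f"median = {statistics.median(sample):.4g}"
    )
```

A single corrupted value is enough to drag the mean far from 10, while the median stays within the range of the clean data until more than half of the nine values are corrupted.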

#### [ArXiv] 1st week, Apr. 2008

I'm very curious how astronomers began to use "Monte Carlo Markov Chain" instead of "Markov chain Monte Carlo." The more popular the method becomes, the more frequently "Monte Carlo Markov Chain" appears.

#### [ArXiv] Pareto Distribution

Astronomy is ruled by the Gaussian distribution, with a duchy held by the Poisson distribution. From time to time, ranks are awarded to other distributions that have no territories of their own to govern. Among these distributions, the Pareto deserves a high rank. There is a preprint this week on the Pareto distribution: Continue reading ‘[ArXiv] Pareto Distribution’ »

#### Statistics is the study of uncertainty

I began to study statistics with the notion that statistics is the study of information (retrieval) and a part of information is uncertainty which is taken for granted in our random world. Probably, it is the other way around; information is a part of uncertainty. Could this be the difference between Bayesian and frequentist?

The statistician’s task is to articulate the scientist’s uncertainties in the language of probability, and then to compute with the numbers found: cited from Continue reading ‘Statistics is the study of uncertainty’ »

#### [ArXiv] 4th week, Mar. 2008

By the way, there is a paper about the solar cycle, PCA, ICA, and the Lomb-Scargle periodogram.

#### [ArXiv] 3rd week, Mar. 2007

Markov chain Monte Carlo (MCMC) has not missed a week on astro-ph recently. A book titled MCMC in astronomy would be a best seller. There are, in addition, very interesting non-MCMC preprints. Continue reading ‘[ArXiv] 3rd week, Mar. 2007’ »

#### [ArXiv] 2nd week, Mar. 2008

Warning! The list is long this week but diverse. Some papers are of obvious interest to CHASC.

#### [ArXiv] 1st week, Mar. 2008

Irrelevant to astrostatistics but interesting for baseball lovers.
[stat.AP:0802.4317] Jensen, Shirley, & Wyner
Bayesball: A Bayesian Hierarchical Model for Evaluating Fielding in Major League Baseball

With the 5th year WMAP data release, there were many WMAP-related papers; among them, most statistical papers are listed.

#### [ArXiv] A fast Bayesian object detection

This is quite a long paper that I separated from [ArXiv] 4th week, Feb. 2008:
[astro-ph:0802.3916] P. Carvalho, G. Rocha, & M.P. Hobson
A fast Bayesian approach to discrete object detection in astronomical datasets – PowellSnakes I
As the title suggests, it describes Bayesian source detection and gives me a chance to learn the foundations of source detection in astronomy. Continue reading ‘[ArXiv] A fast Bayesian object detection’ »

#### [ArXiv] 4th week, Feb. 2008

In this posting, I added lecture notes on the cosmic microwave background (CMB) and the Gravitas DVD (animation, I believe). There is another paper I must include, but I decided to write a short review of it separately. Continue reading ‘[ArXiv] 4th week, Feb. 2008’ »

#### [ArXiv] 3rd week, Feb. 2008

It seems that I omit papers deserving attention from time to time. If you find one, please leave a message. Even better, leave a summary that can become a separate posting.

#### [ArXiv] 2nd week, Feb. 2008

Another week went by with astro-ph papers of a statistical flavor. Continue reading ‘[ArXiv] 2nd week, Feb. 2008’ »