Jul 27th, 2007| 02:46 pm | Posted by aconnors

Peter Bickel:

“Bayesian” methods have, I think, rightly gained favor in astronomy as they have in other fields of statistical application. I put “Bayesian” in quotation marks because I do not believe this marks a revival in the sciences in the belief in personal probability. To me it rather means that all information on hand should be used in model construction, coupled with the view of Box[1979 etc], who considers himself a Bayesian:

Models, of course, are never true but fortunately it is only necessary that they be useful.

The Bayesian paradigm permits one to construct models and hence statistical methods which reflect such information in an, at least in principle, marvellously simple way. A frequentist such as myself feels as at home with these uses of Bayes principle as any Bayesian.

From Bickel, P. J. “An Overview of SCMA II”, in Statistical Challenges in Modern Astronomy II, editors G. Jogesh Babu and Eric D. Feigelson, 1997, Springer-Verlag, New York, p. 360.

[Box 1979] Box, G. E. P., 1979, “Some Problems of Statistics and Everyday Life”. J. Amer. Statist. Assoc., 74, 1-4.

Peter Bickel had so many interesting perspectives in his comments at these SCMA conferences that it was hard to choose just one set.

Jul 25th, 2007| 01:46 pm | Posted by hlee

From arxiv/astro-ph:0707.3413

**The Sixth Data Release of the Sloan Digital Sky Survey** by … many people …

The sixth data release of the Sloan Digital Sky Survey (SDSS DR6) is available at http://www.sdss.org/dr6. Additionally, the Catalog Archive Server (CAS) and its SQL interface to the catalog should be useful to statisticians searching for data. Simple SQL commands, which are well documented, can narrow down the size of the data and the spatial coverage.
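To give a flavor of what narrowing down by size and spatial coverage looks like, here is a minimal sketch that runs against a tiny in-memory SQLite table rather than the real archive. The table name `toy_catalog`, its columns, and all the values are made up for illustration; they are not the SDSS schema.

```python
import sqlite3

# Toy stand-in for a survey catalog. Table name, column names, and values
# are hypothetical; this is NOT the real SDSS schema.
rows = [
    (1, 180.2, 0.5, 17.1),
    (2, 180.9, 0.7, 21.4),
    (3, 185.0, 2.0, 16.8),
    (4, 180.5, -0.2, 18.9),
    (5, 200.0, 30.0, 17.5),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE toy_catalog (obj_id INTEGER, ra_deg REAL, dec_deg REAL, r_mag REAL)"
)
conn.executemany("INSERT INTO toy_catalog VALUES (?, ?, ?, ?)", rows)

# Restrict the spatial coverage to a small box and keep only bright objects.
query = """
    SELECT obj_id, ra_deg, dec_deg, r_mag
    FROM toy_catalog
    WHERE ra_deg BETWEEN ? AND ?
      AND dec_deg BETWEEN ? AND ?
      AND r_mag < ?
    ORDER BY r_mag
"""
selected = conn.execute(query, (180.0, 181.0, -1.0, 1.0, 19.0)).fetchall()
conn.close()

for row in selected:
    print(row)
```

The same `WHERE` pattern (a coordinate box plus a magnitude cut) is the kind of filter one would adapt to the documented catalog tables before downloading anything large.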

Continue reading ‘[ArXiv] SDSS DR6, July 23, 2007’ »

Tags: catalog, convex hull peeling, density estimation, DR6, massive data, multivariate analysis, nonparametric, SDSS, SQL, voronoi tessellation

Category: Algorithms, arXiv, Astro, Data Processing, Misc, Optical | 1 Comment
Jul 25th, 2007| 03:22 am | Posted by hlee

From arxiv/astro-ph:0705.4020v2

**Statistical Evidence for Three classes of Gamma-ray Bursts** by T. Chattopadhyay et al.

In general, gamma-ray bursts (GRBs) are classified into two groups: long (>2 sec) and short (<2 sec) duration bursts. Nonetheless, some studies, including arxiv/astro-ph:0705.4020v2, have reported statistical evidence that three clusters is the optimal number. The pioneering work on GRB clustering, by Mukherjee et al. (Three Types of Gamma-Ray Bursts), was based on hierarchical clustering methods.

Continue reading ‘[ArXiv] Three Classes of GRBs, July 21, 2007’ »

Jul 25th, 2007| 02:28 am | Posted by hlee

Since I began subscribing to arxiv/astro-ph abstracts, one of the most frequent topics, from an astrostatistical point of view, has been **photometric redshifts**. The topic has grown popular as catalogs of photometric observations of remote objects multiply in volume and as multi-band sky survey projects lead to virtual observatories (VO; to be discussed in a later posting). Just searching for **photometric redshifts** in Google Scholar and arxiv.org returns more than 2000 articles since 2000.

Continue reading ‘Photometric Redshifts’ »

Tags: cosmology, distance estimation, Lutz-Kelker bias, machine learning, Malmquist bias, Photometric Redshift, spectrum, survey, VO

Category: Algorithms, arXiv, Data Processing, Galaxies, Stat | Comment
Jul 19th, 2007| 11:01 pm | Posted by aconnors

Ten years ago, astrophysicist John Nousek had this answer to Hyunsook Lee’s question “What is so special about chi square in astronomy?”:

The astronomer must also confront the problem that results need to be published and defended. If a statistical technique has not been widely applied in astronomy before, then there are additional burdens of convincing the journal referees and the community at large that the statistical methods are valid.

Certain techniques which are widespread in astronomy and seem to be accepted without any special justification are: linear and non-linear regression (Chi-Square analysis in general), Kolmogorov-Smirnov tests, and bootstraps. It also appears that if you find it in Numerical Recipes (Press et al. 1992) that it will be more likely to be accepted without comment.

…Note an insidious effect of this bias, astronomers will often choose to utilize a widely accepted statistical tool, even into regimes where the tool is known to be invalid, just to avoid the problem of developing or researching appropriate tools.

From pg 205, in “Discussion by John Nousek” (of Edward J. Wegman et al., “Statistical Software, Siftware, and Astronomy”), in “Statistical Challenges in Modern Astronomy II”, editors G. Jogesh Babu and Eric D. Feigelson, 1997, Springer-Verlag, New York.

Jul 18th, 2007| 01:04 am | Posted by hlee

From arxiv/astro-ph:0707.2474,

**Visualization, Exploration and Data Analysis of Complex Astrophysical Data** by Comparato, Becciani, Costa, Larsson, Garilli, Gheller, and Taylor

This paper introduces VisIVO, a novel advanced visualization tool, along with the advantages of combining it with a protocol called PLASTIC (Platform for Astronomy Tool Interconnection) for displaying and extracting information from astrophysical data, its enhanced connection to the VO (Virtual Observatory), and its usage in several scientific cases.

Continue reading ‘[ArXiv] Data Visualization, July 17, 2007’ »

Jul 16th, 2007| 03:31 pm | Posted by hlee

From arxiv/astro-ph:0707.2064v1

**Star Formation via the Little Guy: A Bayesian Study of Ultracool Dwarf Imaging Surveys for Companions** by P. R. Allen.

I will skip the technical details on ultracool dwarfs and binary stars, the reviews of star formation studies such as the initial mass function (IMF), and the astronomical survey studies, all of which Allen explains fairly in arxiv/astro-ph:0707.2064v1. What I want to emphasize is that, based on simple **Bayes’ rule** and careful set-ups of **likelihoods** and **priors** according to the data (ultracool dwarfs), quite informative conclusions were drawn:

Continue reading ‘[ArXiv] Bayesian Star Formation Study, July 13, 2007’ »
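For readers who want to see Bayes’ rule with an explicit likelihood and prior in action, here is a minimal sketch. It is not Allen’s model: it assumes a plain binomial likelihood for finding `k` companions among `n` surveyed targets, a flat prior on the companion fraction `f`, and made-up counts, and it evaluates the posterior on a grid.

```python
def grid_posterior(k, n, prior, grid):
    """Posterior over a fraction f on a grid: posterior ∝ likelihood × prior.

    The likelihood is binomial, f**k * (1 - f)**(n - k); the binomial
    coefficient is dropped because it cancels in the normalization.
    """
    unnorm = [f**k * (1 - f) ** (n - k) * prior(f) for f in grid]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def flat_prior(f):
    # Assumption for illustration: every fraction is equally plausible a priori.
    return 1.0


grid = [i / 200 for i in range(201)]  # f = 0, 0.005, ..., 1
post = grid_posterior(k=3, n=30, prior=flat_prior, grid=grid)  # made-up counts

# Posterior mode and mean of the companion fraction.
f_map = grid[max(range(len(grid)), key=post.__getitem__)]
f_mean = sum(f * p for f, p in zip(grid, post))

# 95% upper limit: smallest grid value whose cumulative posterior reaches 0.95.
cumulative = 0.0
for f, p in zip(grid, post):
    cumulative += p
    if cumulative >= 0.95:
        f_upper95 = f
        break

print(f_map, f_mean, f_upper95)
```

Swapping `flat_prior` for an informative prior, or replacing the binomial term with a likelihood that accounts for survey completeness, changes only the two ingredients; the rule that combines them stays the same.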

Tags: Bayesian, binary, dwarfs, IMF, likelihood, prior, star formation, survey, upper limit

Category: arXiv, Bayesian, Objects | 1 Comment
Jul 16th, 2007| 01:30 pm | Posted by hlee

From arxiv/astro-ph:0707.1982v1,

**Nflation: observable predictions from the random matrix mass spectrum** by Kim and Liddle

To my knowledge, random matrices began to attract statisticians’ interest fairly recently, and SAMSI (Statistical and Applied Mathematical Sciences Institute) offered a semester-long program on High Dimensional Inference and Random Matrices during Fall 2006 (tutorials and lecture notes can be found). However, my knowledge is too limited to comment on or critique Kim and Liddle’s paper. Clearly, nonetheless, this paper is not about random matrix theory itself but about its straightforward application to the viability of a cosmological model.

Continue reading ‘[ArXiv] Random Matrix, July 13, 2007’ »

Jul 16th, 2007| 12:15 pm | Posted by hlee

From arxiv/astro-ph:0707.1900v1

**The complete catalogue of gamma-ray bursts observed by the Wide Field Cameras on board BeppoSAX** by Vetere et al.

This paper intends to publicize the largest data set of gamma-ray burst (GRB) X-ray afterglows (light curves after the event), which is available from http://www.asdc.asi.it. It is claimed to be a complete on-line catalog of GRBs observed by the two Wide Field Cameras on board BeppoSAX in the period 1996-2002. It comprises 77 bursts and 56 GRBs with X-ray light curves, covering the energy range 40-700 keV. A brief introduction to the instrument, data reduction, and catalog description is given.

Tags: afterglow, BeppoSAX, catalog, GRB, light curve

Category: arXiv, Data Processing, gamma-ray, Objects, Spectral, Timing, X-ray | 1 Comment
Jul 13th, 2007| 07:24 pm | Posted by hlee

From arxiv/astro-ph:0707.1611 **Probabilistic Cross-Identification of Astronomical Sources** by Budavari and Szalay

As multi-wavelength studies become more popular, various source matching methodologies have been discussed. One such method, built on Bayesian ideas, was introduced by Budavari and Szalay in response to the demand for symmetric algorithms in a unified framework.
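To make the idea of a symmetric, Bayes-factor-based match concrete, here is a rough sketch for two catalogs. It assumes circular Gaussian positional errors and small angular separations, and it uses the two-catalog Bayes factor in the form I understand the paper to give, B = 2/(σ1² + σ2²) · exp(−ψ²/[2(σ1² + σ2²)]) with angles in radians; please check that expression against the paper before relying on it. The function names and the numbers are made up for illustration.

```python
import math

ARCSEC = math.pi / (180 * 3600)  # radians per arcsecond


def bayes_factor(sep_arcsec, sigma1_arcsec, sigma2_arcsec):
    """Bayes factor for 'same source' versus 'two unrelated sources'.

    Assumed form (small angles, circular Gaussian errors, radians):
    B = 2 / (s1^2 + s2^2) * exp(-psi^2 / (2 * (s1^2 + s2^2))).
    """
    s2 = (sigma1_arcsec * ARCSEC) ** 2 + (sigma2_arcsec * ARCSEC) ** 2
    psi = sep_arcsec * ARCSEC
    return 2.0 / s2 * math.exp(-(psi**2) / (2.0 * s2))


def match_probability(bf, prior):
    """Posterior probability of a true match, from the Bayes factor and a prior."""
    return 1.0 / (1.0 + (1.0 - prior) / (bf * prior))


# Made-up example: 0.1" and 0.3" positional errors, and a prior of one in a
# million that a randomly chosen pair is a true match.
for sep in (0.2, 1.0, 3.0):
    bf = bayes_factor(sep, 0.1, 0.3)
    print(sep, bf, match_probability(bf, 1e-6))
```

Note that swapping the two catalogs leaves the Bayes factor unchanged, which is the kind of symmetry the post refers to; a nearest-neighbor match that starts from one catalog does not have that property.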

Continue reading ‘[ArXiv] Matching Sources, July 11, 2007’ »

Tags: Bayes factor, evidence, Matching, multi-wavelength, Multiple Testing

Category: Algorithms, arXiv, Bayesian, Data Processing, Frequentist, Objects, Quotes, Uncertainty | 1 Comment
Jul 12th, 2007| 03:37 pm | Posted by aconnors

This is from the very interesting Ingrid Daubechies interview by Dorian Devins, www.nasonline.org/interviews_daubechies, National Academy of Sciences, U.S.A., 2004. It is from part 6, where Ingrid Daubechies speaks of her early mathematics paper on wavelets. She tries to put the impact into context:

I really explained in the paper where things came from. Because, well, the mathematicians wouldn’t have known. I mean, to them this would have been a question that really came out of nowhere. So, I had to explain it …

I was very happy with [the paper]; I had no inkling that it would take off like that… [Of course] the wavelets themselves are used. I mean, more than even that. I explained in the paper how I came to that. I explained both [a] mathematicians way of looking at it and then to some extent the applications way of looking at it. And I think engineers who read that had been emphasizing a lot the use of Fourier transforms. And I had been looking at the spatial domain. It generated a different way of considering this type of construction. I think, that was the major impact. Because then other constructions were made as well. But I looked at it differently. A change of paradigm. Well, paradigm, I never know what that means. A change of … a way of seeing it. A way of paying attention.

Jul 12th, 2007| 12:02 am | Posted by hlee

Since I started reading arxiv/astro-ph abstracts and a few relevant papers about a month ago, I have very often seen chi-square something used as an optimization or statistical inference tool. Chi-square function, chi-square statistic, and chi-square goodness-of-fit test are terms that serve different data analysis purposes but share the same prefix. As a newbie to statistics, I learned the chi-square distribution and the chi-square test, but doing statistics with chi-square is considered somewhat obsolete in terms of robust applications to modern data. These are introduced as one of many distributions and statistical tests. Nothing special. However, in astronomy, chi-square has become almost the only method for statistical data analysis. I wonder how such a strong bond between chi-square tactics and astronomers’ keen minds for data analysis came about.
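To separate two of those uses, here is a small sketch with made-up measurements. The same quantity, the sum of squared, error-scaled residuals, serves first as a function to minimize (an optimization tool) and then, at its minimum, as a statistic to compare with the degrees of freedom (a goodness-of-fit tool). The constant model and all numbers are assumptions for illustration only.

```python
def chi_square(observed, model, sigma):
    """Sum of squared residuals, each scaled by its measurement error."""
    return sum(((o - m) / s) ** 2 for o, m, s in zip(observed, model, sigma))


# Made-up measurements with Gaussian error bars.
y = [10.2, 9.7, 10.5, 9.9]
sigma = [0.3, 0.3, 0.5, 0.2]

# Use 1: chi-square as a function to minimize. For a constant model the
# minimizer has a closed form, the inverse-variance weighted mean.
weights = [1 / s**2 for s in sigma]
best_const = sum(w * v for w, v in zip(weights, y)) / sum(weights)

# Use 2: chi-square as a statistic. Its minimum is compared with the
# degrees of freedom (data points minus fitted parameters).
chi2_min = chi_square(y, [best_const] * len(y), sigma)
dof = len(y) - 1

print(best_const, chi2_min, dof)
```

The comparison of `chi2_min` with `dof` is only meaningful when the errors are roughly Gaussian and the `sigma` values are trustworthy, which is exactly where the tool can be stretched beyond its valid regime.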

Continue reading ‘What is so special about chi square in astronomy?’ »

Jul 11th, 2007| 11:50 am | Posted by vlk

Hyunsook and I have preliminary findings (work done with the help of the X-Atlas group) on the efficacy of using spectral proxies to classify low-mass coronal sources, put up as a poster at the XGratings workshop. The workshop has a “poster haiku” session, where one may summarize a poster in a single transparency and speak on it for a couple of minutes. I cannot count syllables, so I wrote a limerick instead:

Continue reading ‘Summarizing Coronal Spectra’ »

Tags: 2007, dendrograms, limerick, PCA, workshop, X-Atlas, XAtlas, XGratings

Category: Astro, News, Quotes, Spectral, Stars, X-ray | Comment
Jul 10th, 2007| 12:20 pm | Posted by hlee

This is the cover article in the science section of the New York Times (July 10, 2007), and it talks about Digital Access to a Sky Century at Harvard.

*Click the link to see the article. You may need to sign up for access, but it’s free.

Jul 5th, 2007| 04:13 pm | Posted by aconnors

Jeff Scargle (in person [top] and in wavelet transform [bottom], left) weighs in on our continuing discussion on how well “automated fitting”/“Machine Learning” can really work (private communication, June 28, 2007):

It is clearly wrong to say that automated fitting of models to data is impossible. Such a view ignores progress made in the area of machine learning and data mining. Of course there can be problems, I believe mostly connected with two related issues:

* Models that are too fragile (that is, easily broken by unusual data)

* Unusual data (that is, data that lie in some sense outside the arena that one expects)

The antidotes are:

(1) careful study of model sensitivity

(2) if the context warrants, preprocessing to remove “bad” points

(3) lots and lots of trial and error experiments, with both data sets that are as realistic as possible and ones that have extremes (outliers, large errors, errors with unusual properties, etc.)

Trial … error … fix error … retry …

You can quote me on that.

This illustration is from Jeff Scargle’s First GLAST Symposium (June 2007) talk, pg 14, demonstrating the use of inverse area of Voronoi tessellations, weighted by the PSF density, as an automated measure of the density of Poisson Gamma-Ray counts on the sky.
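The inverse-area idea is easiest to see in one dimension, where each event’s Voronoi cell is just the interval bounded by the midpoints to its neighbors. The sketch below is that simplified 1D analog only: it omits the PSF weighting mentioned above, it is not the code behind the talk, and the event positions are made up.

```python
def voronoi_density_1d(points, lo, hi):
    """Density estimate at each event as 1 / (length of its 1D Voronoi cell).

    Cells are bounded by midpoints between neighboring events; the first and
    last cells are closed off by the interval edges lo and hi. Events must be
    distinct and lie inside [lo, hi], otherwise a cell can have zero length.
    """
    xs = sorted(points)
    edges = [lo] + [(a + b) / 2 for a, b in zip(xs, xs[1:])] + [hi]
    return [
        (x, 1.0 / (right - left))
        for x, left, right in zip(xs, edges, edges[1:])
    ]


# Made-up event positions: a tight clump near 0.25 and a sparse tail.
events = [0.1, 0.2, 0.25, 0.3, 0.9]
density = voronoi_density_1d(events, lo=0.0, hi=1.0)

for x, d in density:
    print(x, round(d, 2))
```

Small cells mean crowded events and hence high density, with no bin width or kernel bandwidth to tune; in two dimensions the cell lengths become polygon areas, which requires a proper Voronoi construction.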

Category: Algorithms, Astro, Data Processing, gamma-ray, High-Energy, Imaging, Methods, Quotes, Stat, Timing, X-ray | 1 Comment