Archive for the ‘Misc’ Category.

A Data Miner’s Story

gamma function (Equation of the Week)

The gamma function [not the Gamma -- note upper-case G -- which is related to the factorial] is one of those insanely useful functions that, once one finds out about it, makes one wonder “why haven’t we been using this all the time?” It is defined only on the non-negative real line, is a highly flexible function that can emulate almost any kind of skewness in a distribution, and is a perfect complement to the Poisson likelihood. In fact, it is the conjugate prior to the Poisson likelihood, and is therefore a natural choice for a prior in all cases that start off with counts. Continue reading ‘gamma function (Equation of the Week)’ »
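
To make the conjugacy concrete, here is a minimal sketch, assuming the shape-rate parameterization gamma(alpha, beta) for the prior on a Poisson rate and unit exposure per observation. The function name, the prior values, and the counts are all hypothetical and chosen only for illustration.

```python
def gamma_poisson_update(alpha, beta, counts):
    """Update a gamma(alpha, beta) prior on a Poisson rate with observed counts.

    Assumes the shape-rate parameterization and unit exposure per observation,
    in which case the posterior is gamma(alpha + sum(counts), beta + len(counts)).
    """
    return alpha + sum(counts), beta + len(counts)


# Hypothetical example: a gamma(1, 1) prior and three observed counts.
alpha_post, beta_post = gamma_poisson_update(1.0, 1.0, [3, 5, 4])
posterior_mean = alpha_post / beta_post  # mean of a gamma(alpha, beta) is alpha / beta
print(alpha_post, beta_post, posterior_mean)
```

The posterior stays in the gamma family, which is what makes the prior convenient for count data: updating only requires adding the total counts to alpha and the number of observations to beta.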

tests of fit for the Poisson distribution

Skimming arXiv:astro-ph abstracts for almost a year, I never came across a case where the fit of the Poisson distribution was tested in different ways; instead, it is taken for granted by plugging the data and a (source) model into a (modified) χ2 function. If any doubts about the Poisson distribution arise, the following paper might be useful: Continue reading ‘tests of fit for the Poisson distribution’ »

AstroGrid Desktop Suite

AstroGrid Desktop Suite is available. Check the AstroGrid website http://www.astrogrid.org for more information. Continue reading ‘AstroGrid Desktop Suite’ »

Significance of 5 counts

We have talked about it many times. Now I have to work with the reality. My source shows only 5 counts in a short 5 ksec Chandra exposure. Is this a detection of the source, or is it a random fluctuation? The Chandra background is low and the data are intrinsically Poisson, so the problem should be easy to solve. Not really! There is no tool to calculate this :-) Well, actually there is! Tom A. and I found it by searching Google for “Python gamma function” and came up with Tom Loredo’s Python functions (sp_funcs.py), which he translated from Numerical Recipes to Python. This is the working tool! We just needed to change “import Numeric” or “import Numarray” to “import numpy as N” and then it worked.

We calculated the significance of observing 5 counts given an expected background of 0.1 counts using spfunc.gammp(5, 0.1) = 8e-8, i.e., the probability that the background alone produces 5 or more counts. The detection is highly significant.
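
As a cross-check that does not depend on sp_funcs.py, here is a minimal sketch using only the Python standard library. It relies on the fact that, for integer a, the regularized incomplete gamma function P(a, x) equals the probability that a Poisson variable with mean x is at least a. The function name poisson_tail is hypothetical; if SciPy is available, scipy.special.gammainc(5, 0.1) or scipy.stats.poisson.sf(4, 0.1) should give the same number.

```python
import math


def poisson_tail(n_obs, mu_bkg, tol=1e-17):
    """Return P(N >= n_obs) for N ~ Poisson(mu_bkg), with mu_bkg > 0.

    The upper tail is summed directly, which avoids the cancellation in
    1 - P(N < n_obs) when the tail probability is tiny.
    """
    if n_obs <= 0:
        return 1.0
    # First term of the tail, exp(-mu) * mu**n / n!, evaluated in log space.
    term = math.exp(-mu_bkg + n_obs * math.log(mu_bkg) - math.lgamma(n_obs + 1))
    total = term
    k = n_obs
    # Add successive terms until they no longer change the sum.
    while term > tol * total:
        k += 1
        term *= mu_bkg / k
        total += term
    return total


p_chance = poisson_tail(5, 0.1)
print(f"{p_chance:.2e}")  # about 7.7e-08, consistent with the 8e-8 quoted above
```

Note that this is only the chance probability for a single source position with a known background of 0.1 counts; it does not account for uncertainty in the background estimate or for the number of positions searched.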

Any comments?

Google Sky

For people in the Boston area, there is a cornucopia of talks on Google Sky in the near future.

  1. Hunting for Needles in Massive Astronomical Data Streams
    Wednesday, April 9, 2008 at 4pm
    Room 330, 60 Oxford St.
    Ryan Scranton, Google Sky Team
  2. Inside Google Sky
    Wednesday, April 9, 2008 at 8pm
    Room 105, Emerson Hall
    Andrew Connolly, Google Sky Team
  3. Sky in Google Earth
    Tuesday, April 15, 2008 at 1pm
    Phillips Auditorium, 60 Garden
    Alberto Conti & Carol Christian, STScI

[ArXiv] 1st week, Apr. 2008

I’m very curious how astronomers began to use Monte Carlo Markov Chain instead of Markov chain Monte Carlo. The more popular the method becomes, the more frequently Monte Carlo Markov Chain appears. Anyway, this week, I added non-astrostatistical papers to the list: a tutorial, big bang, and biblical theology. Continue reading ‘[ArXiv] 1st week, Apr. 2008’ »

Prof. Brad Efron visits Harvard

Bradley Efron, Stanford University
11:00 AM, Friday, April 4, 2008
Sever Hall Rm. 103
Title: SIMULTANEOUS INFERENCE: WHEN SHOULD HYPOTHESIS TESTING PROBLEMS BE COMBINED
The abstract and other information are at http://www.stat.harvard.edu/Colloquia_Content/Efron08.pdf
Continue reading ‘Prof. Brad Efron visits Harvard’ »

[Quote] When all the models are wrong

From page 103 of Bayesian Model Selection and Model Averaging by L. Wasserman (2000) Journal of Mathematical Psychology, 44, pp.92-107 Continue reading ‘[Quote] When all the models are wrong’ »

language barrier

Last week, I was at a Tufts colloquium and happened to have a conversation with a computer scientist about density-based clustering. I understood density as probability density and was recollecting a paper by Fraley and Raftery (Model-Based Clustering, Discriminant Analysis, and Density Estimation, JASA, 2002, 97, p.458) and other similar papers I saw in engineering journals like IEEE transactions. For a few moments, I felt uncomfortable, and then she explained that density meant “how dense observations are.” Density-based clustering was meant to be distance-based clustering, like k-means or minimum spanning tree, most likely nonparametric approaches. Continue reading ‘language barrier’ »

Signal Processing and Bootstrap

Astronomers have developed their ways of processing signals almost independently of, but sometimes in collaboration with, engineers, although the fundamental goal of signal processing is the same: extracting information. Doubtless, these two parallel roads of astronomers and engineers have been pointing in opposite directions: one toward the sky and the other toward the earth. Nevertheless, without much argument, we could say that statistics has served as the medium of signal processing for both scientists and engineers. This particular issue of IEEE Signal Processing Magazine may shed light on the subject for astronomers interested in signal processing and statistics outside the astronomical community.

IEEE Signal Processing Magazine Jul. 2007 Vol 24 Issue 4: Bootstrap methods in signal processing

This link will show the table of contents and provide links to articles (however, access to the papers requires an IEEE Xplore subscription via a library or an individual IEEE membership). Here, I’d like to attempt to introduce some articles and tutorials.
Continue reading ‘Signal Processing and Bootstrap’ »

Books – a boring title

I have been observing some misconceptions about statistics, and some drift in statistical nomenclature, in astronomy, which I believe are attributable to the lack of references in the astronomical community. There are some textbooks designed for junior/senior science and engineering students, which are likely unknown to astronomers. Example-wise, these books are not suitable, to my knowledge. Although I never expect astronomers to learn from standard graduate (mathematical) statistics textbooks, I do wish astronomers would go beyond Numerical Recipes (W. H. Press, S. A. Teukolsky, W. T. Vetterling, & B. P. Flannery) and Data Reduction and Error Analysis for the Physical Sciences (P. R. Bevington & D. K. Robinson). Here are some good ones written by astronomers, engineers, and statisticians: Continue reading ‘Books – a boring title’ »

[Quote] Abstract – There are none.

From Guaranteed Margins for LQG Regulators, J.C. Doyle (1978) IEEE Transactions on Automatic Control 23(4), pp. 756-757

The abstract has one sentence: “There are none.” The first paragraph of this short paper explains the uniqueness of the abstract: Continue reading ‘[Quote] Abstract – There are none.’ »

On-line Machine Learning Lectures and Notes

I found this website a while ago but hadn’t checked it until now. Its contents are quite useful (even the pages of the lecture notes are flipped for you as the lecture is given). With the increasing popularity of machine learning among astronomers, such lectures will find more use. If you have time to learn machine learning and other related subjects, please visit http://videolectures.net/. Links classified by subject are a click away. Continue reading ‘On-line Machine Learning Lectures and Notes’ »

[ArXiv] Astronomy Job Market in US

It’s a report about the job market in the US.

[astro-ph:0712.2820] The Production Rate and Employment of Ph.D. Astronomers T.S. Metcalfe

Continue reading ‘[ArXiv] Astronomy Job Market in US’ »