The AstroStat Slog » distribution http://hea-www.harvard.edu/AstroStat/slog Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders Fri, 09 Sep 2011 17:05:33 +0000 en-US hourly 1 http://wordpress.org/?v=3.4 Guinness, Gosset, Fisher, and Small Samples http://hea-www.harvard.edu/AstroStat/slog/2009/guinness-gosset-fisher-and-small-samples/ http://hea-www.harvard.edu/AstroStat/slog/2009/guinness-gosset-fisher-and-small-samples/#comments Thu, 12 Feb 2009 18:03:01 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/?p=1619 Student’s t-distribution is somewhat underrepresented in the astronomical community. An article with good stories seems to me the best way to introduce the t distribution. This one recounts historical anecdotes about monumental statistical developments that occurred about 100 years ago.

Guinness, Gosset, Fisher, and Small Samples by Joan Fisher Box
Source: Statist. Sci. Volume 2, Number 1 (1987), 45-52.

No time to read the whole article? I hope you have a few minutes to read the following quotes, which I find quite enchanting.

[p.45] One of the first things you learn in statistics is to distinguish between the true parameter value of the standard deviation σ and the sample standard deviation s. But at the turn of the century statisticians did not. They called both σ and s the standard deviation. They always used such large samples that their estimate really did approximate the parameter value, so it did not make much difference to their results. But their methods would not do for experimental work. You cannot get samples of thousands of experimental points. …

[p.49] …, the main question was exactly how much wider should the error limits be to make allowance for the error introduced by using the estimates m and s instead of the parameters μ and σ. Pearson could not answer that question for Gosset in 1905, nor the one that followed, which was: what level of probability should be called significant?

[p.49] …, Gosset worked out the exact answer to his question about the probable error of the mean and tabulated the probability values of his criterion z=(m-μ)/s for samples of N=2,3,…,10. He tried also to calculate the distribution of the correlation coefficient by the same method but managed to get the answer only for the case when the true correlation is zero. …
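
Gosset's question about how much wider the error limits should be can be illustrated with a small simulation. The sketch below is not from the article: it assumes normally distributed data, uses the modern statistic t = (m-μ)/(s/√N) with the N-1 divisor in s rather than Gosset's original z, and the function name `coverage_of_normal_limits` is made up for this illustration. It counts how often the large-sample limits of ±1.96 actually contain the true mean when s is estimated from only a few points.

```python
import math
import random

def coverage_of_normal_limits(n, trials=20000, limit=1.96, seed=0):
    """Fraction of simulated samples whose t statistic lies within +/- limit.

    Each sample has n points drawn from a standard normal (mu = 0, sigma = 1),
    so the true mean is known and the coverage can be counted directly.
    """
    rng = random.Random(seed)
    inside = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        m = sum(x) / n                                           # sample mean
        s = math.sqrt(sum((xi - m) ** 2 for xi in x) / (n - 1))  # sample std. dev.
        t = (m - 0.0) / (s / math.sqrt(n))                       # modern t statistic
        if abs(t) <= limit:
            inside += 1
    return inside / trials

# Coverage of the nominal 95% limits for a few small sample sizes.
for n in (3, 10, 30):
    print(n, round(coverage_of_normal_limits(n), 3))
```

For N=3 the coverage comes out near 81% instead of 95%, which agrees with the exact t distribution with 2 degrees of freedom; as N grows the coverage approaches the nominal value. That shortfall is the allowance Gosset's tables were built to supply.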

chi-square distribution [Eqn] http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-chisq-distribution/ http://hea-www.harvard.edu/AstroStat/slog/2008/eotw-chisq-distribution/#comments Wed, 16 Jul 2008 17:00:16 +0000 vlk http://hea-www.harvard.edu/AstroStat/slog/?p=342 The χ² distribution plays an incredibly important role in astronomical data analysis, but it is pretty much a black box to most astronomers. How many people know, for instance, that its form is exactly the same as the γ distribution? A χ² distribution with ν degrees of freedom is

$$p(z|\nu) = \frac{1}{\Gamma(\nu/2)} \left(\frac{1}{2}\right)^{\nu/2} z^{\nu/2-1} e^{-z/2} \equiv \gamma(z;\nu/2,1/2), \quad \text{where } z=\chi^2.$$
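
As a quick numerical check of this identity, here is a minimal sketch using only Python's standard library. The function names are made up for illustration, and the gamma density is assumed to be parameterized by shape and rate, matching γ(z;ν/2,1/2) above. The result is compared against two closed forms that are known independently: the density of a squared standard normal variate (ν=1) and an exponential with mean 2 (ν=2).

```python
import math

def gamma_pdf(z, shape, rate):
    """Gamma density in the shape/rate parameterization, for z > 0."""
    return rate ** shape / math.gamma(shape) * z ** (shape - 1) * math.exp(-rate * z)

def chi2_pdf(z, nu):
    """Chi-square density with nu degrees of freedom: gamma with shape nu/2, rate 1/2."""
    return gamma_pdf(z, nu / 2.0, 0.5)

def chi2_pdf_nu1(z):
    """Closed form for nu = 1: density of the square of a standard normal variate."""
    return math.exp(-z / 2.0) / math.sqrt(2.0 * math.pi * z)

def chi2_pdf_nu2(z):
    """Closed form for nu = 2: exponential density with mean 2."""
    return 0.5 * math.exp(-z / 2.0)

# The gamma form and the closed forms should agree to rounding error.
for z in (0.5, 1.0, 3.0):
    print(z, chi2_pdf(z, 1) - chi2_pdf_nu1(z), chi2_pdf(z, 2) - chi2_pdf_nu2(z))
```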

Its more familiar usage is in the cumulative form, which is just the (regularized) incomplete gamma function. This is where you count off how much area is enclosed in [0,χ²) to tell at what point the 68%, 95%, etc., thresholds are met. For example, for ν=1,

$$\int_0^Z p(\chi^2|\nu=1)\, d\chi^2 = 0.68 \quad \text{when } Z=1.$$

This is the origin of the Δχ²=1 method to determine error bars on best-fit parameters.
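
The 68% figure can be reproduced with a short sketch of the cumulative form. The code below is an illustration, not the post's own method: it evaluates the regularized lower incomplete gamma function with a textbook power series (adequate for the small arguments used here), and the names `reg_lower_gamma` and `chi2_cdf` are made up. For ν=1 the result can be cross-checked against the error function, since the cumulative χ² probability then equals erf(√(Z/2)).

```python
import math

def reg_lower_gamma(a, x, terms=200):
    """Regularized lower incomplete gamma P(a, x) via its power series.

    P(a, x) = x^a e^(-x) / Gamma(a) * sum_{n>=0} x^n / (a (a+1) ... (a+n))
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / a
    total = term
    for n in range(1, terms):
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def chi2_cdf(z, nu):
    """Probability that a chi-square variate with nu degrees of freedom is below z."""
    return reg_lower_gamma(nu / 2.0, z / 2.0)

# Area enclosed in [0, Z) for nu = 1 at Z = 1 and Z = 4.
print(chi2_cdf(1.0, 1), math.erf(math.sqrt(0.5)))
print(chi2_cdf(4.0, 1), math.erf(math.sqrt(2.0)))
```

With ν=1, Z=1 encloses about 68.3% and Z=4 about 95.4%, the familiar 1σ and 2σ levels. The threshold depends on ν: with ν=2 the same 68.3% level is reached only near Z=2.30, so Δχ²=1 applies to a single parameter of interest.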

[ArXiv] 5th week, Jan. 2008 http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-5th-week-jan-2008/ http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-5th-week-jan-2008/#comments Fri, 01 Feb 2008 18:01:03 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-5th-week-jan-2008/ Some statistics papers are listed at the top; their topics may interest some slog subscribers.

From statistics arxiv:

  • [stat.CO:0801.3387] Contemplating Evidence: properties, extensions of, and alternatives to Nested Sampling N. Chopin & C. Robert
  • [math.ST:0801.4329] Estimators of Long-Memory: Fourier versus Wavelets G. Fay et al. (not comprehensible to me, but the title is more than interesting)

From astro-ph:

  • [astro-ph:0801.4041] Quantifying parameter errors due to the peculiar velocities of type Ia supernovae R. Ali Vanderveld
  • [astro-ph:0801.4233] Effects of the interaction between dark energy and dark matter on cosmological parameters J. He & B. Wang
  • [astro-ph:0801.4889] Temporal variability and statistics of the Strehl ratio in adaptive-optics images S. Gladysz
  • [astro-ph:0801.4751] Low-Luminosity Gamma-Ray Bursts as a Distinct GRB Population: A Monte Carlo Analysis F. Virgili, E. Liang, & B. Zhang
  • [astro-ph:0801.4759] Optical afterglow luminosities in the Swift epoch: confirming clustering and bimodality M. Nardini, G. Ghisellini & G. Ghirlanda

(The last two papers mention the Kolmogorov-Smirnov test and its probability.)

[ArXiv] 4th week, Jan. 2008 http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-4th-week-jan-2008/ http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-4th-week-jan-2008/#comments Fri, 25 Jan 2008 16:37:12 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-4th-week-jan-2008/ Only three papers this week. There were a few more on chi-square fitting and its error bars, but I excluded them.

  • [astro-ph:0801.3346] Hipparcos distances of Ophiuchus and Lupus cloud complexes M. Lombardi, C. Lada, & J. Alves (likelihoods and MCMC were used)
  • [astro-ph:0801.3543] Results of the ROTOR-program. II. The long-term photometric variability of weak-line T Tauri stars K.N. Grankin et al. (discusses periodograms)
  • [astro-ph:0801.3822] Estimating the Redshift Distribution of Faint Galaxy Samples M. Lima et al.
[ArXiv] Data-Driven Goodness-of-Fit Tests, Aug. 1, 2007 http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-data-driven-goodness-of-fit-tests/ http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-data-driven-goodness-of-fit-tests/#comments Fri, 17 Aug 2007 23:37:51 +0000 hlee http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-data-driven-goodness-of-fit-tests-aug-1-2007/ From arxiv/math.st:0708.0169v1
Data-Driven Goodness-of-Fit Tests by M. Langovoy

Goodness-of-fit tests have been essential in astronomy for validating a chosen physical model against observed data, yet the limits of these tests are rarely considered carefully when the data are fed into the model to estimate its parameters. I therefore thought this paper would help in thinking about the difference in viewpoint between astronomers, who apply goodness-of-fit tests, and statisticians, who construct them. (Warning: the paper is abstract and theoretical.)

The paper begins by presenting two approaches to constructing test statistics: (1) a measure of distance between the theoretical and empirical distributions, such as the Cramér-von Mises and Kolmogorov-Smirnov statistics, and (2) score test statistics, constructed so that the test statistic is asymptotically normal. Because the second approach is preferred, the author confines his study to generalizing the theory of score tests. The notion of a Neyman-type (NT) test is introduced under very minimal assumptions on the form of the statistic.
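
To make the first, distance-based approach concrete, here is a minimal sketch of the two statistics named above, using their standard textbook formulas for a fully specified null distribution. It is not taken from the paper; the function names and the toy sample are made up for illustration, and the null CDF is passed in as an ordinary Python function.

```python
def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance: sup over x of |F_n(x) - F(x)|."""
    x = sorted(sample)
    n = len(x)
    # The supremum is attained at a data point, just before or just after the jump.
    d_plus = max((i + 1) / n - cdf(xi) for i, xi in enumerate(x))
    d_minus = max(cdf(xi) - i / n for i, xi in enumerate(x))
    return max(d_plus, d_minus)

def cvm_statistic(sample, cdf):
    """Cramer-von Mises statistic: 1/(12n) + sum_i (F(x_(i)) - (2i-1)/(2n))^2."""
    x = sorted(sample)
    n = len(x)
    return 1.0 / (12 * n) + sum(
        (cdf(xi) - (2 * (i + 1) - 1) / (2 * n)) ** 2 for i, xi in enumerate(x)
    )

def uniform_cdf(u):
    """CDF of the uniform distribution on [0, 1], used as the null model here."""
    return min(max(u, 0.0), 1.0)

toy_sample = [0.9, 0.1, 0.5]  # made-up data for illustration
print(ks_statistic(toy_sample, uniform_cdf))
print(cvm_statistic(toy_sample, uniform_cdf))
```

Both statistics measure how far the empirical distribution function lies from the hypothesized one; the Kolmogorov-Smirnov statistic takes the largest gap, while the Cramér-von Mises statistic accumulates squared gaps over the whole sample.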

The author discusses statistical inverse problems, or deconvolution problems, in physics, seismology, optics, and imaging, where noisy signals and measurements occur. Under appropriate regularity assumptions, these inverse problems lead to Neyman-type statistics.

Other types of NT tests, defined in terms of score functions, and their consistency are presented in an abstract fashion.
