The AstroStat Slog » Sun

SDO launched

vlk — Thu, 11 Feb 2010 19:04:00 +0000

The Solar Dynamics Observatory, which promises a flood of data on the Sun, was launched today from Cape Canaveral.

space weather

hlee — Thu, 21 May 2009 22:55:26 +0000

Among the billions of objects in our Galaxy beyond the Earth, the Sun draws the most attention from astronomers. These astronomers go by the name of solar physicists, and they enjoy the most abundant data, including 400 years of sunspot counts. Their joy comes not only from the fascinating, active, and unpredictable character of the Sun but also from its influence on our daily lives. Because of the latter, studying the conditions on the Sun is sometimes called space weather forecasting.

With my limited knowledge, I cannot lay out all the important aspects of solar physics, of climate changes due to solar activity (not limited to our lower atmosphere but covering the space between the Sun and the Earth), or of the most important recent issues in space weather. I can only emphasize that, compared to Earth climate/atmosphere science or meteorology, the contribution from statisticians to space weather is almost nonexistent. I’ve frequently witnessed crude eyeballing used in place of statistics to analyze data and quantify images in solar physics. Luckily, I found a few articles that discuss statistics, and my discussion focuses on these papers, while leaving room for solar physicists to chip in on how space weather is handled statistically, with a view to collaborating with statisticians.

By the way, I have no intention of degrading “eyeballing” in data analysis by astronomers. The statistical methods under EDA (exploratory data analysis, whose counterpart is CDA, confirmatory data analysis, or statistical inference) are basically “eyeballing” with technical jargon and basics from probability theory. EDA is important for questioning every step in astronomers’ chi-square methods. Without those diagnostics and visualizations, choosing the right statistical strategy is almost impossible with real data sets. I used “crude” because, instead of using “edge detection” algorithms, edges are drawn by hand via eyeballing. Another disclaimer: there are brilliant image processing/computer vision strategies developed by astronomers, which I’m not going to present. I’m focusing on small areas of statistics related to space weather and its forecasting.

Statistical Assessment of Photospheric Magnetic Features in Imminent Solar Flare Predictions by Song et al. (2009) SoPh. v. 254, p.101.

Their forte is “logistic regression,” a statistical model that is not often used in astronomy. It is used to model a binary response (or a categorical response, like head or tail; agree, neutral, or disagree) as a function of a bunch of predictors, i.e. classification with multiple features or variables (astronomers might like to replace these terms with parameters). The issue of variable selection is also discussed, with L_{gnl} found to be the most powerful predictor. Their training set was carefully discussed from the solar physics perspective. Contrary to their claim that they used “logistic regression” to predict solar flares for the first time, there was another paper a few years back that used “logistic regression” to predict geomagnetic storms or coronal mass ejections. My objection may be wrong if flares and CMEs are exclusive events.
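
For readers who have not met logistic regression, here is a minimal sketch in plain Python of a binary response modeled on two predictors. It is not the method or the data of Song et al.: the toy numbers are made up, the two features only stand in for standardized magnetic measures, and the function names (`fit_logistic`, `predict_proba`) are hypothetical. The fit uses simple gradient ascent on the log-likelihood, chosen for brevity rather than efficiency.

```python
import math

def sigmoid(z):
    # Logistic function, written to avoid overflow for large |z|
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w = [intercept, w1, ..., wp]; returns P(y = 1 | x)
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.1, n_iter=5000):
    # Gradient ascent on the mean log-likelihood of the logistic model
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(w, xi)
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Made-up toy data: two standardized features per region, label 1 = "flare"
X = [[-1.0, -1.0], [-1.2, -0.8], [-0.6, -0.9], [-0.2, 0.1], [0.4, 0.4],
     [0.3, 0.3], [0.6, -0.1], [0.9, 0.7], [1.0, 1.0], [1.4, 1.2]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

w = fit_logistic(X, y)
print("weights:", w)
print("P(flare) for a strong region:", predict_proba(w, [1.5, 1.5]))
print("P(flare) for a weak region:  ", predict_proba(w, [-1.5, -1.5]))
```

The output is a probability rather than a hard yes/no, which is what makes the model attractive for forecasting; turning it into a classifier requires a threshold, and choosing that threshold is a separate decision.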

The Challenge of Predicting the Occurrence of Intense Storms by Srivastava (2006) J.Astrophys. Astr. v.27, pp.237-242

The probability of storm occurrence is the response in the logistic regression model, whose predictors are CME-related variables, including the latitude and longitude of the CME’s origin, and interplanetary inputs such as shock speed, ram pressure, and solar wind related measures. Cross-validation was performed. The paper comments that the initial speed of a CME might be the most reliable predictor, but it gives no extensive discussion of variable selection/model selection.

Personally speaking, both publications^[1] could be more statistically rigorous in discussing the various challenges of logistic regression, both from the statistical learning/classification perspective and from the model/variable selection aspect, so as to define better-behaved and more statistically rigorous classifiers.

Oftentimes we plan our days according to the weather forecast (although we grumble that weather forecasts are not right, almost everyone relies on numbers and predictions from weather people). Although they may not be 100% reliable, those forecasts make our lives easier. Also, more reliable models are under development. On the other hand, forecasting space weather with the help of statistics is as yet unthinkable. However, scientists and engineers understand that reliable space weather models help in planning space missions and in putting satellites into safe mode. At the least, I know that with flare or CME forecasting models in place, fewer scientists/engineers would need to wake up in the middle of the night because of otherwise unforeseen storms from the Sun.

I thought I had collected more papers under “statistics” and “space weather,” not just these two. A few more are probably buried somewhere. It’s hard to believe such a rich field has not been touched by statisticians. I’d very much appreciate your kindly forwarding any relevant papers. I’ll gradually add them.

An excerpt from …

hlee — Thu, 26 Feb 2009 20:07:13 +0000

I’ve been asking how one can do machine learning on solar images without a training set (see my comment at The Big Picture). On the other hand, I’m also aware of the challenge in astronomy that data (images) cannot be transformed freely and fed into standard machine learning algorithms. Tailoring data pipelining, cleaning, and processing to currently existing vision algorithms may not be achievable. The hope of automating the detection/identification of interesting features (e.g. flares and loops) and the forecasting of events on the surface of the Sun is only a dream. Even though the image data stream is at tsunami level, we might have to depend on human eyes to comb out interesting features on the Sun until a new paradigm arrives: automated feature identification algorithms based on a single image, i.e. without a training set. The good news is that human eyes have done a superb job!

From A Survey of the Statistical Theory of Shape by David G. Kendall, Statistical Science, Vol. 4, No. 2 (May, 1989), pp. 87-99.
It is well known that no classical test for two dimensional stochastic point processes can match the performance of the human eye and brain in detecting the presence of improbably large holes in the realized pattern of points. This fact has generated a great deal of research in the last few years, especially in connection with the large “voids” and long “strings” that the eye sees (or declares that it sees) in maps of the Shane and Wirtanen catalogue of positions of galaxies. Astronomers are interested in (i) whether these phenomena are sufficiently extreme to require explanation, and if so (ii) whether any of the various “model” universes now in vogue can be said to display them to just the same degree.

The Big Picture

vlk — Mon, 13 Oct 2008 17:07:03 +0000

Our hometown rag (the Boston Globe) runs The Big Picture, an occasional series of photo collections that highlight news stories. This week, they take a look at the Sun: http://www.boston.com/bigpicture/2008/10/the_sun.html

The pictures come from space and ground observatories, from SoHO, TRACE, Hinode, STEREO, etc. It goes without saying that the images are stunning, and some are even animated. The real kicker is that images such as these are being acquired by the hundreds, every hour on the hour, 24/7/365.25. It is like sipping from a firehose. Nobody can sit there and look at them all, so who knows what we are missing out on. Can statistics help? Can we automate a statistically robust “interestingness” criterion to filter the data stream that humans can then follow up on?
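
To make the question concrete, here is one very simple candidate, sketched in Python under heavy assumptions: each image is reduced to a single summary number (say, mean brightness), and a frame is flagged when that number sits far from the median in units of the median absolute deviation (MAD). The data, the threshold of 5, and the function names are all made up for illustration. A real criterion would have to cope with trends, solar rotation, and instrument effects, none of which this handles.

```python
from statistics import median

def robust_scores(values):
    # Distance from the median in units of the scaled MAD;
    # the factor 1.4826 makes the MAD comparable to a Gaussian sigma.
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    if scale == 0:
        return [0.0 for _ in values]
    return [(v - med) / scale for v in values]

def flag_interesting(values, threshold=5.0):
    # Indices of frames whose summary statistic deviates strongly from the bulk
    return [i for i, s in enumerate(robust_scores(values)) if abs(s) > threshold]

# Made-up per-frame summary; frame 7 mimics a sudden brightening
brightness = [10.1, 9.8, 10.0, 10.3, 9.9, 10.2, 10.0, 14.5, 10.1, 9.7]
flagged = flag_interesting(brightness)
print(flagged)
```

The median and MAD are used instead of the mean and standard deviation so that the very outliers we are hunting for do not inflate the yardstick used to find them.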

[ArXiv] 2nd week, June 2008

hlee — Mon, 16 Jun 2008 14:47:42 +0000

As Prof. Speed said, PCA is prevalent in astronomy, particularly this week. Furthermore, a paper explicitly discusses R, a popular statistics package.

[astro-ph:0806.1140] N.Bonhomme, H.M.Courtois, R.B.Tully
Derivation of Distances with the Tully-Fisher Relation: The Antlia Cluster
(The Tully-Fisher relation is well known and is one of many occasions where statistics could help. On the other hand, astronomical biases as well as measurement errors hinder the collaboration.)
[astro-ph:0806.1222] S. Dye
Star formation histories from multi-band photometry: A new approach (Bayesian evidence)
[astro-ph:0806.1232] M. Cara and M. Lister
Avoiding spurious breaks in binned luminosity functions
(I think that binning is not always necessary and is overused, given that there are alternatives.)
[astro-ph:0806.1326] J.C. Ramirez Velez, A. Lopez Ariste and M. Semel
Strength distribution of solar magnetic fields in photospheric quiet Sun regions (PCA was utilized)
[astro-ph:0806.1487] M.D.Schneider et al.
Simulations and cosmological inference: A statistical model for power spectra means and covariances
(They used R and its lhs package for Latin hypercube samples.)
[astro-ph:0806.1558] Ivan L. Andronov et al.
Idling Magnetic White Dwarf in the Synchronizing Polar BY Cam. The Noah-2 Project (PCA is applied)
[astro-ph:0806.1880] R. G. Arendt et al.
Comparison of 3.6 – 8.0 Micron Spitzer/IRAC Galactic Center Survey Point Sources with Chandra X-Ray Point Sources in the Central 40×40 Parsecs (K-S test)
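
As a small illustration of both the binning remark and the K-S test mentioned in the list above, here is a sketch in Python of the two-sample Kolmogorov-Smirnov statistic. It compares two samples through their empirical distribution functions, so no bins have to be chosen. The function name and the sample values are made up, and only the statistic is computed; a p-value would need the K-S distribution, which is left out here.

```python
def ks_two_sample(a, b):
    # Largest vertical gap between the two empirical CDFs
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both ECDFs past every observation equal to x (handles ties)
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    # Once one sample is exhausted, its ECDF is 1 and the gap can only shrink
    return d

# Made-up magnitudes for two hypothetical source lists
sample_a = [12.1, 12.8, 13.0, 13.4, 14.2, 14.9]
sample_b = [13.1, 13.9, 14.5, 15.0, 15.2, 15.8]
print(ks_two_sample(sample_a, sample_b))
```

Identical samples give a statistic of 0, and samples with no overlap at all give 1.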