The AstroStat Slog » sunspot
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders
Fri, 09 Sep 2011 17:05:33 +0000

More on Space Weather
http://hea-www.harvard.edu/AstroStat/slog/2009/more-on-space-weather/
Tue, 22 Sep 2009 17:03:11 +0000, hlee

Thanks to a Korean solar physicist[1], I was able to gather the following websites and some relevant information on space weather forecasting in action, not limited to the literature or toy data.


These seem quite informative, and I believe more statisticians and data scientists (signal and image processing, machine learning, computer vision, and data mining) could easily collaborate with solar physicists. All the complexity, as a matter of fact, comes from processing the data to be fed into (machine, statistical) learning algorithms and from defining the objectives of learning. Once these are settled, one can easily apply numerous methods from the field to these time-varying solar images.

I’m writing this short post because I finally found the interesting articles that I collected for my previous post on space weather. After finding them and scanning through them, I realized that, methodology-wise, they made only baby steps. You’ll see that a limited number of keywords are repeated, although there is a humongous community of scientists and engineers in knowledge discovery and data mining.

Note that the objectives of these studies are quite similar. They describe machine learning for the purpose of automating the detection of features of interest on the Sun and, possibly, forecasting relevant phenomena that affect our own atmosphere through the associated solar activity.

  1. Automated Prediction of CMEs Using Machine Learning of CME – Flare Associations by Qahwaji et al. (2008) in Solar Phys. vol. 248, pp. 471-483.
  2. Automatic Short-Term Solar Flare Prediction using Machine Learning and Sunspot Associations by Qahwaji and Colak (2007) in Solar Phys. vol. 241, pp. 195-211.

    Space weather is defined by the U.S. National Space Weather Program (NSWP) as “conditions on the Sun and in the solar wind, magnetosphere, ionosphere, and thermosphere that can influence the performance and reliability of space-borne and ground-based technological systems and can endanger human life or health.”

    Personally, I think the section on “jackknife” should be replaced with “cross-validation.”

  3. Automatic Detection and Classification of Coronal Mass Ejections by Qu et al. (2006) in Solar Phys. vol. 237, pp. 419-431.
  4. Automatic Solar Filament Detection Using Image Processing Techniques by Qu et al. (2005) in Solar Phys. vol. 228, pp. 119-135.
  5. Automatic Solar Flare Tracking Using Image-Processing Techniques by Qu et al. (2004) in Solar Phys. vol. 222, pp. 137-149.
  6. Automatic Solar Flare Detection Using MLP, RBF, and SVM by Qu et al. (2003) in Solar Phys. vol. 217, pp. 157-172.
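To make the jackknife versus cross-validation remark in the list above concrete, here is a minimal sketch in Python. It is not taken from any of the listed papers: the data, the threshold classifier, and the helper names (`jackknife_se`, `fit_threshold`, `kfold_error`) are all invented for illustration. The jackknife recomputes an estimator with each point left out to gauge the uncertainty of that estimate, whereas k-fold cross-validation holds out data to measure how well a fitted rule predicts unseen cases.

```python
import math
import statistics

def jackknife_se(values, estimator):
    """Jackknife standard error: recompute the estimator with each point left out."""
    n = len(values)
    loo = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    center = sum(loo) / n
    return math.sqrt((n - 1) / n * sum((t - center) ** 2 for t in loo))

def fit_threshold(xs, ys):
    """Toy classifier (hypothetical): threshold halfway between the two class means."""
    m0 = statistics.mean([x for x, y in zip(xs, ys) if y == 0])
    m1 = statistics.mean([x for x, y in zip(xs, ys) if y == 1])
    return (m0 + m1) / 2.0

def predict(threshold, x):
    return int(x > threshold)

def kfold_error(xs, ys, k):
    """k-fold cross-validation: mean misclassification rate on the held-out folds."""
    idx = list(range(len(xs)))
    errors = []
    for f in range(k):
        test = idx[f::k]                      # every k-th point forms one fold
        held = set(test)
        train = [i for i in idx if i not in held]
        thr = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        wrong = sum(predict(thr, xs[i]) != ys[i] for i in test)
        errors.append(wrong / len(test))
    return sum(errors) / k

# Invented, well-separated toy data: class 0 near 0, class 1 near 5.
xs = [0.1 * i for i in range(10)] + [5.0 + 0.1 * i for i in range(10)]
ys = [0] * 10 + [1] * 10

se_mean = jackknife_se(xs, statistics.mean)   # uncertainty of an estimate
cv_error = kfold_error(xs, ys, k=5)           # out-of-sample error of a rule
print(se_mean, cv_error)
```

The two numbers answer different questions: `se_mean` says how uncertain an estimate is, while `cv_error` says how often the classifier is wrong on data it has not seen. For a flare or CME predictor, the second is presumably the quantity of interest.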

I’d like to add a survey paper on another type of learning method, beyond the Support Vector Machine (SVM) used in almost all of the articles above. Luckily, this survey paper happens to address my concern about the “practice of background subtraction” in high energy astrophysics.

A Survey of Manifold-Based Learning Methods by Huo, Ni, and Smith
[Excerpt] What is Manifold-Based Learning?
It is an emerging and promising approach in nonparametric dimension reduction. The article reviews principal component analysis (PCA), multidimensional scaling (MDS), generative topographic mapping (GTM), locally linear embedding (LLE), ISOMAP, Laplacian eigenmaps, Hessian eigenmaps, and local tangent space alignment (LTSA). Apart from these reviews and comparisons, the survey paper is useful for understanding the danger of background subtraction. Homogeneity does not mean there is a constant background to be subtracted; subtracting one often yields negative source observations.
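The danger of subtracting a constant background can be seen in a few lines of simulation. The sketch below is my own illustration, not something from the survey paper: the rates (`background_rate`, `source_rate`) are made-up numbers, and the counts are drawn with a small hand-written Poisson sampler so that the example needs only the standard library.

```python
import math
import random

def poisson(rng, lam):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

rng = random.Random(42)
background_rate = 3.0   # assumed mean background counts per pixel (invented)
source_rate = 0.5       # assumed faint source counts per pixel (invented)

# Observed counts are Poisson with the summed rate; they are never negative.
observed = [poisson(rng, background_rate + source_rate) for _ in range(1000)]

# Subtracting the background as if it were a known constant.
net = [c - background_rate for c in observed]
n_negative = sum(v < 0 for v in net)
mean_net = sum(net) / len(net)

print(n_negative, "of", len(net), "pixels have negative net counts")
print("mean net counts:", mean_net)
```

On average the subtraction recovers something close to the source rate, yet a sizable share of individual pixels end up with negative "source" counts, which no physical source can produce. That is the sense in which a homogeneous background is not the same as a constant to subtract.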

More collaboration among multiple disciplines is desirable in this relatively new field. For me, it is one of the best fields for data and information science in the 21st century, and any progress will be beneficial to humankind.

  1. I must acknowledge him for his kindness and patience. He was my Wikipedia for questions while I was studying the Sun.
space weather
http://hea-www.harvard.edu/AstroStat/slog/2009/space-weather/
Thu, 21 May 2009 22:55:26 +0000, hlee

Among the billions of objects in our Galaxy, outside the Earth, our Sun draws the most attention from astronomers. These astronomers go by the name of solar physicists, and they enjoy the most abundant data, including 400 years of sunspot counts. Their joy originates not only from the fascinating, active, and unpredictable character of the Sun but also from its influence on our daily lives. Related to the latter, studying the conditions on the Sun is sometimes called space weather forecasting.

With my limited knowledge, I cannot lay out all the important aspects of solar physics, of climate changes due to solar activity (not limited to our lower atmosphere but covering the space between the Sun and the Earth), and of the most important recent issues related to space weather. I can only emphasize that, compared to Earth climate/atmosphere science or meteorology, the contribution from statisticians to space weather is almost nonexistent. I’ve frequently witnessed crude eyeballing, instead of statistics, being used in solar physics to analyze data and quantify images. Luckily, I found a few articles that discuss statistics, and my discussion focuses on these papers while leaving room for solar physicists to chip in on how space weather is handled statistically, with a view to collaborating with statisticians.

By the way, I have no intention of degrading “eyeballing” in data analysis by astronomers. Statistical methods under EDA (exploratory data analysis, whose counterpart is CDA, confirmatory data analysis, or statistical inference) are basically “eyeballing” with technical jargon and basics from probability theory. EDA is important for questioning every step in astronomers’ chi-square methods. Without those diagnostics and visualizations, choosing the right statistical strategy is almost impossible with real data sets. I used “crude” because, instead of using “edge detection” algorithms, edges are drawn by hand via eyeballing. Another disclaimer of mine is that there are brilliant image processing/computer vision strategies developed by astronomers, which I’m not going to present. I’m focusing on small areas of statistics related to space weather and its forecasting.
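For readers who have not met an “edge detection” algorithm, the following is a minimal sketch of one of the simplest, the Sobel gradient filter, written in plain Python on an invented 6x8 toy image. It is only meant to show what replaces the hand-drawn edge. Real solar images would call for an image processing library and more careful, possibly multiscale, methods.

```python
def sobel_magnitude(img):
    """Gradient magnitude from the 3x3 Sobel kernels; border pixels are left at 0."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[r - 1][c + 1] + 2 * img[r][c + 1] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r][c - 1] - img[r + 1][c - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[r + 1][c - 1] + 2 * img[r + 1][c] + img[r + 1][c + 1]
                  - img[r - 1][c - 1] - 2 * img[r - 1][c] - img[r - 1][c + 1])
            out[r][c] = (gx * gx + gy * gy) ** 0.5
    return out

# Toy image (invented): dark left half, bright right half,
# so there is a vertical edge between columns 3 and 4.
image = [[0.0] * 4 + [1.0] * 4 for _ in range(6)]
edges = sobel_magnitude(image)

for row in edges:
    print(" ".join(f"{v:.0f}" for v in row))
```

The filter responds only in the two columns that straddle the brightness jump, so the edge location comes out of the data reproducibly instead of depending on whoever draws it.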

Statistical Assessment of Photospheric Magnetic Features in Imminent Solar Flare Predictions by Song et al. (2009) SoPh. v. 254, p.101.

Their forte is “logistic regression,” a statistical model that is not often used in astronomy. It is used when modeling binary responses (or categorical responses like head or tail; agree, neutral, or disagree) with a bunch of predictors, i.e. classification with multiple features or variables (astronomers might like to replace these terms with parameters). Also, the issue of variable selection is discussed, for example finding L_{gnl} to be the most powerful predictor. Their training set was carefully discussed from the solar physics perspective. Contrary to their claim that they used “logistic regression” to predict solar flares for the first time, there was another paper a few years earlier discussing “logistic regression” to predict geomagnetic storms or coronal mass ejections. My statement may be wrong if flares and CMEs are mutually exclusive events.
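As a rough illustration of what a logistic regression does, here is a minimal sketch with a single predictor, fitted by plain gradient ascent on the log-likelihood. Nothing here comes from Song et al.: the ten data points are invented, `x` merely stands in for some standardized magnetic feature, and `y` marks whether a flare followed. In practice one would use an established statistics package and several predictors.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            resid = y - sigmoid(b0 + b1 * x)   # observed minus predicted probability
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Invented toy data: x is a hypothetical standardized predictor,
# y = 1 if a flare followed. The classes overlap on purpose.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_logistic(xs, ys)
p_low = sigmoid(b0 + b1 * -2.0)
p_high = sigmoid(b0 + b1 * 2.5)
print("intercept, slope:", b0, b1)
print("P(flare | x=-2.0):", p_low, " P(flare | x=2.5):", p_high)
```

The output of the model is a probability, not a yes/no label, which is why it suits forecasting: a threshold on that probability can be tuned afterwards to trade missed events against false alarms.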

The Challenge of Predicting the Occurrence of Intense Storms by Srivastava (2006) J.Astrophys. Astr. v.27, pp.237-242

The probability of storm occurrence is the response in the logistic regression model, whose predictors are CME-related variables, including the latitude and longitude of the origin of the CME, and interplanetary inputs like shock speeds, ram pressure, and solar-wind-related measures. Cross-validation was performed. A comment is given that the initial speed of a CME might be the most reliable predictor, but there is no extensive discussion of variable selection/model selection.

Personally speaking, both publications[1] could be more statistically rigorous in discussing the various challenges of logistic regression, from the statistical learning/classification perspective and from the model/variable selection aspect, in order to define better-behaved and statistically rigorous classifiers.

Oftentimes we plan our days according to the weather forecast (although we grumble that weather forecasts are not right, almost everyone relies on numbers and predictions from weather people). Although they may not be 100% reliable, those forecasts make our lives easier. Also, more reliable models are under development. On the other hand, forecasting space weather with the help of statistics is as yet unthinkable. However, scientists and engineers understand that reliable space weather models help in planning space missions and switching satellites into safety mode. At least I know that with flare or CME forecasting models in place, fewer scientists and engineers need to wake up in the middle of the night because of otherwise unforeseen storms from the Sun.

  1. I thought I had collected more papers under “statistics” and “space weather,” not just these two. A few more are probably buried somewhere. It’s hard to believe such a rich field is untouched by statisticians. I’d very much appreciate your kindly forwarding relevant papers. I’ll gradually add them.
[ArXiv] 4th week, Apr. 2008
http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-4th-week-apr-2008/
Sun, 27 Apr 2008 15:29:48 +0000, hlee

The last paper in the list discusses MCMC for time series analysis, applied to sunspot data. There are six additional papers about statistics and data analysis from the week.

  • [astro-ph:0804.2904] M. Cruz et al.
    The CMB cold spot: texture, cluster or void?

  • [astro-ph:0804.2917] Z. Zhu, M. Sereno
    Testing the DGP model with gravitational lensing statistics

  • [astro-ph:0804.3390] Valkenburg, Krauss, & Hamann
    Effects of Prior Assumptions on Bayesian Estimates of Inflation Parameters, and the expected Gravitational Waves Signal from Inflation

  • [astro-ph:0804.3413] N. Ball et al.
    Robust Machine Learning Applied to Astronomical Datasets III: Probabilistic Photometric Redshifts for Galaxies and Quasars in the SDSS and GALEX (Another related publication [astro-ph:0804.3417])

  • [astro-ph:0804.3471] M. Cirasuolo et al.
    A new measurement of the evolving near-infrared galaxy luminosity function out to z~4: a continuing challenge to theoretical models of galaxy formation

  • [astro-ph:0804.3475] A.D. Mackey et al.
    Multiple stellar populations in three rich Large Magellanic Cloud star clusters

  • [stat.ME:0804.3853] C. Röver, R. Meyer, N. Christensen
    Modelling coloured noise (MCMC & sunspot data)
[ArXiv] 3rd week, Apr. 2008
http://hea-www.harvard.edu/AstroStat/slog/2008/arxiv-3rd-week-apr-2008/
Mon, 21 Apr 2008 01:05:55 +0000, hlee

The dichotomy of outliers: detecting outliers to be discarded or to be investigated; statistics that are robust enough not to be influenced by outliers or sensitive enough to flag an anomaly in the data distribution. Although not related, one paper about outliers made me dwell on what outliers are. This week’s topics are diverse.

  • [astro-ph:0804.1809] H. Khiabanian, I.P. Dell’Antonio
    A Multi-Resolution Weak Lensing Mass Reconstruction Method (Maximum likelihood approach; my naive eyes sensed a certain degree of relationship to the GREAT08 CHALLENGE)

  • [astro-ph:0804.1909] A. Leccardi and S. Molendi
    Radial temperature profiles for a large sample of galaxy clusters observed with XMM-Newton

  • [astro-ph:0804.1964] C. Young & P. Gallagher
    Multiscale Edge Detection in the Corona

  • [astro-ph:0804.2387] C. Destri, H. J. de Vega, N. G. Sanchez
    The CMB Quadrupole depression produced by early fast-roll inflation: MCMC analysis of WMAP and SDSS data

  • [astro-ph:0804.2437] P. Bielewicz, A. Riazuelo
    The study of topology of the universe using multipole vectors

  • [astro-ph:0804.2494] S. Bhattacharya, A. Kosowsky
    Systematic Errors in Sunyaev-Zeldovich Surveys of Galaxy Cluster Velocities

  • [astro-ph:0804.2631] M. J. Mortonson, W. Hu
    Reionization constraints from five-year WMAP data

  • [astro-ph:0804.2645] R. Stompor et al.
    Maximum Likelihood algorithm for parametric component separation in CMB experiments (separate section for calibration errors)

  • [astro-ph:0804.2671] Peeples, Pogge, and Stanek
    Outliers from the Mass–Metallicity Relation I: A Sample of Metal-Rich Dwarf Galaxies from SDSS

  • [astro-ph:0804.2716] H. Moradi, P.S. Cally
    Time-Distance Modelling In A Simulated Sunspot Atmosphere (discusses systematic uncertainty)

  • [astro-ph:0804.2761] S. Iguchi, T. Okuda
    The FFX Correlator

  • [astro-ph:0804.2742] M. Bazarghan
    Automated Classification of ELODIE Stellar Spectral Library Using Probabilistic Artificial Neural Networks

  • [astro-ph:0804.2827] S.H. Suyu et al.
    Dissecting the Gravitational Lens B1608+656: Lens Potential Reconstruction (Bayesian)
[ArXiv] 4th week, Nov. 2007
http://hea-www.harvard.edu/AstroStat/slog/2007/arxiv-4th-week-nov-2007/
Sat, 24 Nov 2007 13:26:40 +0000, hlee

A thought from my stay in Korea: just as not many statisticians are interested in modern astronomy even though they look for data-driven problems, not many astronomers are learning up-to-date statistics even though they borrow statistics for their data analysis. Astronomers cite statistical journals about as rarely as statisticians introduce astronomical data-driven problems. I wonder how other fields lowered such barriers decades ago.

No matter what, there are preprints from this week that may help to narrow the chasm.

  • [stat.ME:0711.3236]
    Confidence intervals in regression utilizing prior information P. Kabaila and K. Giri
  • [stat.ME:0711.3271]
    Computer model validation with functional output M. J. Bayarri et al.
  • [astro-ph:0711.3266]
    Umbral Fine Structures in Sunspots Observed with Hinode Solar Optical Telescope R. Kitai et al.
  • [astro-ph:0711.2720]
    Magnification Probability Distribution Functions of Standard Candles in a Clumpy Universe C. Yoo et al.
  • [astro-ph:0711.3196]
    Upper Limits from HESS AGN Observations in 2005-2007 HESS Collaboration: F. Aharonian et al.
  • [astro-ph:0711.2509]
    Shrinkage Estimation of the Power Spectrum Covariance Matrix A. C. Pope and I. Szapudi
  • [astro-ph:0711.2631]
    Statistical properties of extragalactic sources in the New Extragalactic WMAP Point Source (NEWPS) catalogue J. González-Nuevo et al.