[ArXiv] Kernel Regression, June 20, 2007

A recent paper on arXiv/astro-ph, astro-ph/0706.2704, discusses kernel regression and model selection for determining photometric redshifts. The authors choose the kernel bandwidth by 10-fold cross-validation, select among models built from various combinations of input parameters by estimating the root mean square error (RMSE) and AIC, and compare their kernel regression with other regression and classification methods using RMSE values gathered from a literature survey. They conclude that kernel regression is flexible, particularly for data at high z.
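To make the bandwidth selection step concrete, here is a minimal sketch of 10-fold cross-validation for a kernel regression. It assumes a Nadaraya-Watson estimator with a Gaussian kernel and a single scalar bandwidth, which may differ from the estimator used in the paper. The data are synthetic, and the function names (`nw_predict`, `cv_rmse`, `select_bandwidth`) and the candidate grid are made up for illustration.

```python
import numpy as np

def nw_predict(x_train, y_train, x_query, h):
    """Nadaraya-Watson estimate with a Gaussian kernel of bandwidth h (assumed form)."""
    # Squared Euclidean distances, shape (n_query, n_train)
    d2 = ((x_query[:, None, :] - x_train[None, :, :]) ** 2).sum(axis=2)
    # Subtract the row minimum before exponentiating so that at least one
    # weight per query equals 1; this avoids division by zero for small h.
    d2 = d2 - d2.min(axis=1, keepdims=True)
    w = np.exp(-0.5 * d2 / h**2)
    return (w @ y_train) / w.sum(axis=1)

def cv_rmse(x, y, h, n_folds=10, seed=0):
    """Root mean square prediction error of bandwidth h under n_folds-fold CV."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), n_folds)
    sq_err = 0.0
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        pred = nw_predict(x[train], y[train], x[test], h)
        sq_err += ((pred - y[test]) ** 2).sum()
    return float(np.sqrt(sq_err / len(x)))

def select_bandwidth(x, y, candidates, n_folds=10):
    """Return the candidate bandwidth with the smallest cross-validated RMSE."""
    scores = [cv_rmse(x, y, h, n_folds) for h in candidates]
    return candidates[int(np.argmin(scores))], scores

# Synthetic stand-in for (colors -> redshift); not the SDSS data.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 1.0, size=(400, 1))
y = np.sin(2 * np.pi * x[:, 0]) + rng.normal(0.0, 0.2, size=400)

candidates = [0.01, 0.03, 0.1, 0.3, 1.0]  # hypothetical grid
best_h, scores = select_bandwidth(x, y, candidates)
print("best bandwidth:", best_h)
```

The same folds are reused for every candidate bandwidth (fixed `seed`), so the RMSE values are compared on identical splits.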

Off topic, but worth noting:
1. They used AIC for model comparison. Although BIC has many advocates, AIC is probably the better choice for analyzing catalog data of this size (399,929 galaxies): the BIC penalty grows as ln n per parameter (about 12.9 here, against a fixed 2 for AIC), so with such a huge sample it tends to push the selection toward the most parsimonious model.

2. Although a more detailed discussion has not been posted yet, I’d like to point out that photometric redshift studies are more or less regression problems. Whether they use sophisticated, up-to-date classification schemes such as support vector machines (SVM) and artificial neural networks (ANN) or classical regression methods, the goal of a photometric redshift study is to find the predictors that give the right classification and the model built from those predictors. I hope to see some studies using quantile regression, which has recently received a lot of attention in economics.

3. Adaptive kernels were mentioned, and I look forward to seeing the results of adaptive kernel regression.

4. Comparing root mean square errors from various classification and regression models that were fitted to different Sloan Digital Sky Survey (SDSS) releases, from the EDR (Early Data Release) to DR5 (Data Release 5), might lead to a misleading conclusion about the best regression/classification method, because the sample sizes differ from EDR to DR5. Further formulation, especially of the asymptotic properties of these root mean square errors, would be very useful for making a legitimate comparison among different regression/classification strategies.
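On the AIC/BIC remark in point 1, the difference between the two criteria comes down to the penalty per parameter. The sketch below uses the textbook definitions, AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L; the log-likelihood value and parameter counts are made-up numbers for illustration only, not results from the paper.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return k * math.log(n) - 2 * log_lik

n = 399_929                      # catalog size quoted in the post
bic_penalty_per_param = math.log(n)
aic_penalty_per_param = 2.0

# Hypothetical fit: the same log-likelihood scored with 4 and with 5 parameters.
log_lik = -1000.0                # made-up value
extra_cost_aic = aic(log_lik, 5) - aic(log_lik, 4)
extra_cost_bic = bic(log_lik, 5, n) - bic(log_lik, 4, n)

print("BIC penalty per parameter:", round(bic_penalty_per_param, 2))
print("AIC penalty per parameter:", aic_penalty_per_param)
```

Under these definitions, an extra parameter has to raise 2 ln L by more than ln n to be preferred by BIC, but only by more than 2 to be preferred by AIC.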
