[ArXiv] Decision Tree, Aug. 31, 2007

From arxiv/astro-ph:0708.4274v1
Comparison of decision tree methods for finding active objects by Y. Zhao and Y. Zhang

The authors (astronomers) introduced and summarized various decision tree methods (REPTree, Random Tree, Decision Stump, Random Forest, J48, NBTree, and AdTree) to the astronomical community.

The goal of applying decision tree methods is to discriminate active objects (quasars, BL Lac objects, and active galaxies) from non-active objects (stars and galaxies), and to overcome the drawbacks of the neural networks (NNs) and support vector machines (SVMs) that are popular in the astronomical community. Decision trees are suited to this thanks to the following properties: non-parametric modeling (no strong model assumptions are required), identifiability of important independent variables, and a relatively short training period suitable for huge data sets. Shortcomings of decision tree methods were also described. Nonetheless, the fact that a decision tree provides clear and easily interpretable classification rules is hard to ignore.
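To make the interpretability point concrete, here is a minimal sketch of a decision stump (a one-split tree) in plain Python. It is not the authors' WEKA setup: the data values and the feature names `color_a` and `color_b` are made up for illustration, and the helpers `fit_stump` and `predict_stump` are hypothetical. The fitted model is a single readable rule, and the feature it picks is the one that best separates the classes.

```python
# Toy data, invented for illustration: each row is (color_a, color_b).
X = [
    (0.10, 1.9), (0.20, 0.4), (0.15, 1.2), (0.30, 0.8),  # label 1 ("active")
    (1.10, 1.5), (1.30, 0.5), (0.90, 1.0), (1.20, 1.8),  # label 0 ("non-active")
]
y = [1, 1, 1, 1, 0, 0, 0, 0]
FEATURE_NAMES = ["color_a", "color_b"]  # hypothetical feature names


def fit_stump(X, y):
    """Return (feature index, threshold, label below threshold, training accuracy)."""
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            for below in (0, 1):
                preds = [below if row[j] <= t else 1 - below for row in X]
                acc = sum(p == label for p, label in zip(preds, y)) / n
                if best is None or acc > best[3]:
                    best = (j, t, below, acc)
    return best


def predict_stump(stump, row):
    """Apply the single split to one row."""
    j, t, below, _ = stump
    return below if row[j] <= t else 1 - below


stump = fit_stump(X, y)
j, t, below, acc = stump
# The whole model is one human-readable rule.
print(f"if {FEATURE_NAMES[j]} <= {t:.2f}: class {below}, else: class {1 - below}")
print(f"training accuracy: {acc:.2f}")
```

A full decision tree repeats this split search recursively on each resulting subset, so the final classifier remains a nested set of such rules.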

The performance of the listed methods was compared in terms of accuracy and computing time. For separating active galactic nuclei (AGNs) from non-AGNs, AdTree achieved the best accuracy and Decision Stump produced the fastest result. The software used for this study was WEKA (Waikato Environment for Knowledge Analysis).
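The kind of comparison described above (accuracy versus computing time) can be sketched with a small timing harness. This is only an illustration under stated assumptions: the data are synthetic, the two classifiers (`fit_majority` and `fit_threshold`) are simple stand-ins rather than the WEKA methods from the paper, and all function names are hypothetical.

```python
import random
import time

rng = random.Random(0)  # fixed seed so the run is repeatable


def make_row():
    """Synthetic point: class 1 is centred at 0.0, class 0 at 2.0, unit Gaussian noise."""
    label = rng.randint(0, 1)
    return (rng.gauss(0.0 if label else 2.0, 1.0), label)


data = [make_row() for _ in range(2000)]
train, test = data[:1500], data[1500:]  # simple hold-out split


def fit_majority(rows):
    """Baseline: always predict the most common training label."""
    ones = sum(label for _, label in rows)
    majority = 1 if ones * 2 >= len(rows) else 0
    return lambda x: majority


def fit_threshold(rows):
    """One-split classifier: pick the cut on x with the best training accuracy."""
    best_t, best_acc = None, -1.0
    for t in [i / 10 for i in range(-20, 41)]:  # coarse grid of candidate cuts
        acc = sum((1 if x <= t else 0) == label for x, label in rows) / len(rows)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return lambda x: 1 if x <= best_t else 0


def evaluate(fit, train, test):
    """Return (held-out accuracy, training time in seconds)."""
    start = time.perf_counter()
    model = fit(train)
    elapsed = time.perf_counter() - start
    acc = sum(model(x) == label for x, label in test) / len(test)
    return acc, elapsed


results = {name: evaluate(fit, train, test)
           for name, fit in [("majority", fit_majority), ("threshold", fit_threshold)]}
for name, (acc, secs) in results.items():
    print(f"{name:10s} accuracy={acc:.3f} train_time={secs * 1e3:.2f} ms")
```

Measuring accuracy on held-out data rather than on the training set keeps the comparison honest, and timing only the fitting step isolates the training cost that matters for huge data sets.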

One Comment
  1. hlee:

    The fast growth of data archives challenges the established pattern of data mining in astronomy. Classification methods that can be adapted for sequential analysis, in addition to having a short training period, are highly recommended. Yet accuracy and interpretability should not be sacrificed.

    09-04-2007, 11:07 pm