Of Genes and Machines: Application of a Combination of Machine Learning Tools to Astronomy Data Sets

Kumar, S.; Chambers, K. C.; Flewelling, H.; Magnier, E. A.; Waters, C.; Metcalfe, N.; Burgett, W. S.; Kaiser, N.; Draper, P. W.; Heinis, S.; Gezari, S.

United States, United Kingdom

Abstract

We apply a combination of genetic algorithm (GA) and support vector machine (SVM) machine learning algorithms to two important problems faced by the astronomical community: star-galaxy separation and photometric redshift estimation of galaxies in survey catalogs. In the first step, we use the GA to select the relevant features; in the second step, we optimize the SVM parameters, obtaining an optimal set of parameters for classification or regression while avoiding overfitting. We apply our method to star-galaxy separation in Pan-STARRS1 data. We show that our method correctly classifies 98% of objects down to $I_{\mathrm{P1}} = 24.5$, with a completeness (or true positive rate) of 99% for galaxies and 88% for stars. By combining colors with morphology, our star-galaxy separation method yields better results than the new SExtractor classifier spread_model, in particular at the faint end ($I_{\mathrm{P1}} > 22$). We also use our method to derive photometric redshifts for galaxies in the COSMOS bright multiwavelength data set, reaching an error in (1+z) of σ = 0.013, which compares well with estimates from spectral energy distribution fitting on the same data (σ = 0.007) while making significantly fewer assumptions.
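The first step described in the abstract, GA-driven feature selection scored by a classifier, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy catalog, the function names (`make_toy_catalog`, `centroid_accuracy`, `ga_select`), the GA settings, and above all the nearest-centroid classifier, which stands in for the SVM so that the example needs only the Python standard library. The second step (tuning the SVM parameters) is not shown.

```python
import random


def make_toy_catalog(n, rng):
    """Hypothetical catalog: features 0-1 separate the classes, features 2-4 are noise."""
    rows = []
    for _ in range(n):
        label = rng.choice(["star", "galaxy"])
        shift = 2.0 if label == "galaxy" else 0.0
        x = [rng.gauss(shift, 1.0), rng.gauss(shift, 1.0)]
        x += [rng.gauss(0.0, 5.0) for _ in range(3)]
        rows.append((x, label))
    return rows


def centroid_accuracy(train, valid, mask):
    """Validation accuracy of a nearest-centroid classifier on the masked features.

    This simple classifier is a stand-in for the SVM, not the paper's method.
    """
    idx = [i for i, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0  # an empty feature set cannot classify anything
    sums, counts = {}, {}
    for x, y in train:
        s = sums.setdefault(y, [0.0] * len(idx))
        for k, i in enumerate(idx):
            s[k] += x[i]
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(x):
        def dist2(y):
            return sum((x[i] - c) ** 2 for i, c in zip(idx, centroids[y]))
        return min(centroids, key=dist2)

    return sum(predict(x) == y for x, y in valid) / len(valid)


def ga_select(train, valid, n_features, pop_size=20, generations=15,
              p_mut=0.1, seed=0):
    """Evolve boolean feature masks; needs n_features >= 2 and pop_size >= 4."""
    rng = random.Random(seed)

    def fitness(mask):
        return centroid_accuracy(train, valid, mask)

    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        elite = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each gene with probability p_mut.
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return best, fitness(best), history


rng = random.Random(42)
train = make_toy_catalog(200, rng)
valid = make_toy_catalog(200, rng)
mask, score, history = ga_select(train, valid, n_features=5)
print("selected features:", [i for i, keep in enumerate(mask) if keep])
print("validation accuracy:", score)
```

Scoring each mask on data held out from training is one simple guard against overfitting; a fuller treatment would use cross-validation and a separate test set for the final quoted accuracy.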

2016 The Astrophysical Journal