TY - JOUR
T1 - Estimation of stellar atmospheric parameters from SDSS/SEGUE spectra
AU - Re Fiorentin, P.
AU - Bailer-Jones, C. A. L.
AU - Lee, Y. S.
AU - Beers, T. C.
AU - Sivarani, T.
AU - Wilhelm, R.
AU - Allende Prieto, C.
AU - Norris, J. E.
PY - 2007
Y1 - 2007/6
AB - We present techniques for the estimation of stellar atmospheric parameters (Teff, log g, [Fe/H]) for stars from the SDSS/SEGUE survey. The atmospheric parameters are derived from the observed medium-resolution (R = 2000) stellar spectra using non-linear regression models trained either on (1) pre-classified observed data or (2) synthetic stellar spectra. In the first case we use our models to automate and generalize parametrization produced by a preliminary version of the SDSS/SEGUE Spectroscopic Parameter Pipeline (SSPP). In the second case we directly model the mapping between synthetic spectra (derived from Kurucz model atmospheres) and the atmospheric parameters, independently of any intermediate estimates. After training, we apply our models to various samples of SDSS spectra to derive atmospheric parameters, and compare our results with those obtained previously by the SSPP for the same samples. We obtain consistency between the two approaches, with RMS deviations on the order of 150 K in Teff, 0.35 dex in log g, and 0.22 dex in [Fe/H]. The models are applied to pre-processed spectra, either via Principal Component Analysis (PCA) or a Wavelength Range Selection (WRS) method, which employs a subset of the full 3850-9000 Å spectral range. This is both for computational reasons (robustness and speed), and because it delivers higher accuracy (better generalization of what the models have learned). Broadly speaking, the PCA is demonstrated to deliver more accurate atmospheric parameters when the training data are the actual SDSS spectra with previously estimated parameters, whereas WRS appears superior for the estimation of log g via synthetic templates, especially for lower signal-to-noise spectra. 
From a subsample of some 19000 stars with previous determinations of the atmospheric parameters, the accuracies of our predictions (mean absolute errors) for each parameter are Teff to 170/170 K, log g to 0.36/0.45 dex, and [Fe/H] to 0.19/0.26 dex, for methods (1) and (2), respectively. We measure the intrinsic errors of our models by training on synthetic spectra and evaluating their performance on an independent set of synthetic spectra. This yields RMS accuracies of 50 K, 0.02 dex, and 0.03 dex on Teff, log g, and [Fe/H], respectively. Our approach can be readily deployed in an automated analysis pipeline, and can easily be retrained as improved stellar models and synthetic spectra become available. We nonetheless emphasize that this approach relies on an accurate calibration and pre-processing of the data (to minimize mismatch between the real and synthetic data), as well as sensible choices concerning feature selection. From an analysis of cluster candidates with available SDSS spectroscopy (M 15, M 13, M 2, and NGC 2420), and assuming the age, metallicity, and distances given in the literature are correct, we find evidence for small systematic offsets in Teff and/or log g for the parameter estimates from the model trained on real data with the SSPP. Thus, this model turns out to derive more precise, but less accurate, atmospheric parameters than the model trained on synthetic data.
KW - Methods: data analysis
KW - Methods: statistical
KW - Stars: fundamental parameters
KW - Surveys
UR - http://www.scopus.com/inward/record.url?scp=34249938937&partnerID=8YFLogxK
U2 - 10.1051/0004-6361:20077334
DO - 10.1051/0004-6361:20077334
M3 - Article
SN - 0004-6361
VL - 467
SP - 1373
EP - 1387
JO - Astronomy and Astrophysics
JF - Astronomy and Astrophysics
IS - 3
ER -