TY - JOUR
T1 - Bayesian photometric redshifts with empirical training sets
AU - Wolf, Christian
PY - 2009/7
Y1 - 2009/7
AB - We combine in a single framework the two complementary benefits of χ² template fits and empirical training sets used e.g. in neural nets: χ² is more reliable when its probability density functions (PDFs) are inspected for multiple peaks, while empirical training is more accurate when calibration and priors of query data and training set match. We present a χ² empirical method that derives PDFs from empirical models as a subclass of kernel regression methods, and apply it to the Sloan Digital Sky Survey Data Release 5 sample of >75 000 quasi-stellar objects, which is full of ambiguities. Objects with single-peak PDFs show <1 per cent outliers, rms redshift errors <0.05 and vanishing redshift bias. At z > 2.5, these figures are two times better. Outliers result purely from the discrete nature and limited size of the model, and rms errors are dominated by the intrinsic variety of object colours. PDFs classed as ambiguous provide accurate probabilities for alternative solutions and thus weights for using both solutions and avoiding needless outliers. E.g. the PDFs predict 78.0 per cent of the stronger peaks to be correct, which is true for 77.9 per cent of them. Redshift incompleteness is common in faint spectroscopic surveys and turns into a massive undetectable outlier risk above other performance limitations, but we can quantify residual outlier risks stemming from size and completeness of the model. We propose a matched χ² error scale for noisy data and show that it produces correct error estimates and redshift distributions accurate within Poisson errors. Our method can easily be applied to future large galaxy surveys, which will benefit from the reliability in ambiguity detection and residual risk quantification.
KW - Methods: statistical
KW - Surveys
KW - Techniques: photometric
UR - http://www.scopus.com/inward/record.url?scp=67650410051&partnerID=8YFLogxK
U2 - 10.1111/j.1365-2966.2009.14953.x
DO - 10.1111/j.1365-2966.2009.14953.x
M3 - Article
SN - 0035-8711
VL - 397
SP - 520
EP - 533
JO - Monthly Notices of the Royal Astronomical Society
JF - Monthly Notices of the Royal Astronomical Society
IS - 1
ER -