TY - GEN
T1 - Single image depth estimation from predicted semantic labels
AU - Liu, Beyang
AU - Gould, Stephen
AU - Koller, Daphne
PY - 2010
Y1 - 2010
N2 - We consider the problem of estimating the depth of each pixel in a scene from a single monocular image. Unlike traditional approaches [18, 19], which attempt to map from appearance features to depth directly, we first perform a semantic segmentation of the scene and use the semantic labels to guide the 3D reconstruction. This approach provides several advantages: By knowing the semantic class of a pixel or region, depth and geometry constraints can be easily enforced (e.g., "sky" is far away and "ground" is horizontal). In addition, depth can be more readily predicted by measuring the difference in appearance with respect to a given semantic class. For example, a tree will have more uniform appearance in the distance than it does close up. Finally, the incorporation of semantic features allows us to achieve state-of-the-art results with a significantly simpler model than previous works.
UR - http://www.scopus.com/inward/record.url?scp=77956004112&partnerID=8YFLogxK
U2 - 10.1109/CVPR.2010.5539823
DO - 10.1109/CVPR.2010.5539823
M3 - Conference contribution
SN - 9781424469840
T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP - 1253
EP - 1260
BT - 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010
T2 - 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2010
Y2 - 13 June 2010 through 18 June 2010
ER -