Single Image Depth Estimation From Predicted Semantic Labels
Beyang Liu, Dept. of Computer Science, Stanford University (beyangl@cs.stanford.edu)
Stephen Gould, Dept. of Electrical Engineering, Stanford University (sgould@stanford.edu)
Daphne Koller, Dept. of Computer Science, Stanford University (koller@cs.stanford.edu)
Abstract
We consider the problem of estimating the depth of each pixel in a scene from a single monocular image. Unlike traditional approaches [18, 19], which attempt to map from appearance features to depth directly, we first perform a semantic segmentation of the scene and use the semantic labels to guide the 3D reconstruction. This approach provides several advantages: By knowing the semantic class of a pixel or region, depth and geometry constraints can be easily enforced (e.g., "sky" is far away and "ground" is horizontal). In addition, depth can be more readily predicted by measuring the difference in appearance with respect to a given semantic class. For example, a tree will have more uniform appearance in the distance than it does close up. Finally, the incorporation of semantic features allows us to achieve state-of-the-art results with a significantly simpler model than previous works.

Figure 1. Example output from our model showing how semantic class prediction (center) strongly informs depth perception (right). Semantic classes are shown overlaid on the image. Depth is indicated by colormap (red is more distant). See Figure 6 for color legend.

…mated 3D scene reconstruction [19, 12, 4, 11, 18] has focused on extracting these geometric cues and additional in…
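The abstract describes two key ideas: class-conditioned depth prediction and simple per-class geometric priors. The Python/NumPy sketch below is a minimal illustration of how these could fit together; it is not the paper's actual model. The function depth_from_semantics, the per-class linear regressors, the class ids, and the camera constants are all assumptions introduced for exposition.

    import numpy as np

    # Hypothetical class ids; the paper's label set includes classes
    # such as sky, tree, road, grass, water, building, and mountain.
    SKY, GROUND, TREE = 0, 1, 2

    def depth_from_semantics(labels, features, regressors, horizon_row,
                             max_depth=80.0):
        """Class-conditioned depth estimate for one image (illustrative).

        labels      -- (H, W) integer array of predicted semantic labels
        features    -- (H, W, F) per-pixel appearance features
        regressors  -- dict: class id -> (F,) weights; one linear depth
                       regressor per semantic class (an assumed stand-in
                       for the paper's learned class-specific predictors)
        horizon_row -- image row of the horizon, for the ground prior
        """
        H, W, F = features.shape
        depth = np.zeros((H, W))
        for c, w in regressors.items():
            mask = labels == c
            depth[mask] = features[mask] @ w      # per-class regression
        # Prior 1: "sky" is far away.
        depth[labels == SKY] = max_depth
        # Prior 2: "ground" is horizontal. Under a flat-ground assumption
        # with camera height h and focal length f, a ground pixel v rows
        # below the horizon lies at depth roughly h * f / v.
        cam_height, focal = 1.6, 500.0            # assumed camera geometry
        rows = np.tile(np.arange(H)[:, None], (1, W)).astype(float)
        v = np.maximum(rows - horizon_row, 1.0)
        ground = labels == GROUND
        depth[ground] = np.minimum(cam_height * focal / v[ground], max_depth)
        return np.clip(depth, 0.1, max_depth)

    # Toy usage with random inputs, just to show the calling convention.
    H, W, F = 120, 160, 8
    labels = np.random.randint(0, 3, size=(H, W))
    feats = np.random.rand(H, W, F)
    regs = {c: np.random.rand(F) for c in (SKY, GROUND, TREE)}
    pred = depth_from_semantics(labels, feats, regs, horizon_row=40)

The design point this sketch captures is the one the abstract argues for: once the semantic label of a pixel is known, depth prediction decomposes into easier per-class problems, and hard geometric knowledge (sky at maximum depth, ground following a flat plane) can be imposed directly rather than learned.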