We present a system for 3D reconstruction of large-scale outdoor scenes based on monocular motion stereo. Ours is the first such system to run at interactive frame rates on a mobile device (Google Project Tango Tablet), thus allowing a user to reconstruct scenes "on the go" by simply walking around them. We utilize the device's GPU to compute depth maps using plane sweep stereo. We then fuse the depth maps into a global model of the environment, represented as a truncated signed distance function in a spatially hashed voxel grid. We observe that, in contrast to reconstructing objects within a small volume of interest or fusing the nearly outlier-free data provided by active depth sensors, free-space measurements are less effective at suppressing outliers in unbounded large-scale scenes. Consequently, we propose a set of simple filtering operations to remove unreliable depth estimates, and we experimentally demonstrate the benefit of strongly filtering depth maps. We extensively evaluate the system on both real and synthetic datasets.
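The fusion step described above can be illustrated with a minimal sketch. The class below stores a truncated signed distance function sparsely in a hash map keyed by integer voxel indices (a stand-in for a spatially hashed voxel grid, where only voxels near observed surfaces consume memory) and fuses each depth measurement with the standard weighted running-average update. All names, constants, and the sampling scheme here are illustrative assumptions, not taken from the paper.

```python
import numpy as np

VOXEL_SIZE = 0.05  # hypothetical voxel edge length in meters
TRUNC = 0.15       # hypothetical truncation distance of the SDF in meters

class HashedTSDF:
    """Sparse TSDF keyed by integer voxel indices: a minimal stand-in for
    a spatially hashed voxel grid (only touched voxels consume memory)."""

    def __init__(self):
        self.voxels = {}  # (i, j, k) -> (tsdf_value, weight)

    def integrate(self, surface_point, camera_origin):
        """Fuse one depth measurement by updating the voxels along the
        viewing ray within the truncation band around the surface."""
        ray = surface_point - camera_origin
        depth = np.linalg.norm(ray)
        direction = ray / depth
        # Sample the ray at voxel-size steps inside the truncation band.
        for t in np.arange(depth - TRUNC, depth + TRUNC, VOXEL_SIZE):
            if t <= 0.0:
                continue  # behind the camera
            p = camera_origin + t * direction
            key = tuple(np.floor(p / VOXEL_SIZE).astype(int))
            # Signed distance along the ray, truncated to [-1, 1].
            sdf = float(np.clip((depth - t) / TRUNC, -1.0, 1.0))
            d, w = self.voxels.get(key, (0.0, 0.0))
            # Weighted running average: the standard TSDF fusion update.
            self.voxels[key] = ((d * w + sdf) / (w + 1.0), w + 1.0)
```

A heavily filtered depth map, as the abstract advocates, would simply pass fewer (but more reliable) surface points into `integrate`, trading completeness per frame for fewer outlier voxels in the global model.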
Large-Scale Outdoor 3D Reconstruction on a Mobile Device (Computer Vision and Image Understanding 2016)