Multi-scale Iterative Residuals for Fast and Scalable Stereo Matching
Despite the remarkable progress of deep learning in stereo matching, a gap in accuracy remains between real-time models, which are suitable for practical applications, and slower state-of-the-art models. This paper presents an iterative multi-scale coarse-to-fine refinement (iCFR) framework to bridge this gap. The framework can adopt any stereo matching network and make it faster, more efficient and more scalable while keeping comparable accuracy. To reduce the computational cost of matching, we use multi-scale warped features to estimate disparity residuals, which lets us shrink the disparity search range in the cost volume to a minimum. Finally, we apply a refinement network to recover the loss of precision that is inherent in multi-scale approaches. We test our iCFR framework by adopting the matching networks from the state-of-the-art GANet and AANet. The result is inference that is 49× faster than GANet-deep and uses 4× less memory, with comparable error. Our best-performing network, which we call FRSNet, scales up to an input resolution of 6K on a GTX 1080Ti, with inference time still below one second and accuracy comparable to AANet+. It outperforms all real-time stereo methods and achieves competitive accuracy on the KITTI benchmark.
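The coarse-to-fine residual idea described above can be illustrated with a minimal NumPy sketch. It is not the iCFR implementation: it assumes raw grayscale intensities in place of learned features, a per-pixel absolute-difference cost in place of the GANet or AANet matching networks, and it omits the refinement network. All function names and parameters (`coarse_to_fine_disparity`, `levels`, `max_disp`, `radius`) are hypothetical.

```python
import numpy as np

def downsample(img):
    """Halve resolution by 2x2 average pooling (height and width must be even)."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_disparity(disp):
    """Nearest-neighbour 2x upsampling; disparity values double with the width."""
    return 2.0 * np.repeat(np.repeat(disp, 2, axis=0), 2, axis=1)

def warp_right(right, disp):
    """Sample the right image at x - d so that it aligns with the left view."""
    h, w = right.shape
    xs = np.arange(w)[None, :] - np.rint(disp).astype(int)
    xs = np.clip(xs, 0, w - 1)
    return right[np.arange(h)[:, None], xs]

def best_shift(left, right, candidates):
    """Per-pixel argmin of the absolute difference over integer shifts.

    Shifts that would read outside the image get infinite cost.
    """
    h, w = left.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    costs = []
    for s in candidates:
        xs = cols - s
        valid = (xs >= 0) & (xs < w)
        cost = np.abs(left - right[rows, np.clip(xs, 0, w - 1)])
        costs.append(np.where(valid, cost, np.inf))
    idx = np.argmin(np.stack(costs), axis=0)
    return np.asarray(candidates)[idx].astype(float)

def coarse_to_fine_disparity(left, right, levels=3, max_disp=16, radius=1):
    """Full search at the coarsest scale, then small residual searches."""
    pyramid = [(left, right)]
    for _ in range(levels - 1):
        l, r = pyramid[-1]
        pyramid.append((downsample(l), downsample(r)))

    # Coarsest level: the search range shrinks by 2**(levels - 1).
    l, r = pyramid[-1]
    coarse_range = max_disp // 2 ** (levels - 1)
    disp = best_shift(l, r, list(range(coarse_range + 1)))

    # Finer levels: warp with the upsampled estimate, then search only
    # residuals in [-radius, radius] and add them to the estimate.
    for l, r in reversed(pyramid[:-1]):
        disp = upsample_disparity(disp)
        warped = warp_right(r, disp)
        disp = disp + best_shift(l, warped, list(range(-radius, radius + 1)))
    return disp
```

With these illustrative defaults, each pixel evaluates 5 candidates at the coarsest level and 3 residual candidates at each of the two finer levels, instead of 17 candidates at full resolution. That is the sense in which warping lets the cost volume cover only a small residual range.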