This paper presents a self-supervised method that can be trained on videos without known depth, which simplifies training data collection and improves the generalization of the learned network. Self-supervision is achieved by minimizing a photo-consistency loss between each video frame and its neighboring frames.
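To make the idea of a photo-consistency loss concrete, here is a minimal sketch in plain Python. It is not the paper's formulation: it assumes each neighboring frame has already been warped into the view of the target frame (the warping step is not described here), represents frames as 2-D lists of grayscale intensities, and uses a plain mean absolute difference. The function name `photo_consistency_loss` and the use of `None` to mark pixels without a valid correspondence are illustrative choices.

```python
def photo_consistency_loss(target, warped_neighbors):
    """Mean absolute intensity difference between a target frame and
    each neighbor frame, assumed already warped into the target view.

    Frames are 2-D lists of grayscale intensities. A value of None in a
    warped neighbor marks a pixel with no valid correspondence
    (an assumption of this sketch, not something taken from the paper).
    """
    total = 0.0
    count = 0
    for warped in warped_neighbors:
        for target_row, warped_row in zip(target, warped):
            for t, w in zip(target_row, warped_row):
                if w is None:  # skip pixels without a valid correspondence
                    continue
                total += abs(t - w)
                count += 1
    # Return 0.0 when there is nothing to compare, to avoid dividing by zero.
    return total / count if count else 0.0


# Toy 2x2 frames: one neighbor matches exactly, one is slightly brighter
# and has one invalid pixel.
target = [[0.2, 0.4], [0.6, 0.8]]
perfect = [[0.2, 0.4], [0.6, 0.8]]
shifted = [[0.3, 0.5], [None, 0.9]]

loss_perfect = photo_consistency_loss(target, [perfect])
loss_mixed = photo_consistency_loss(target, [perfect, shifted])
```

A neighbor that matches the target exactly contributes zero loss, so the loss drops as the warped neighbors agree better with the target frame. A training setup would compute the same quantity on tensors so that it can be differentiated with respect to the network's outputs.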
CVPR 2020