We have developed several projects that use deep neural networks to learn to estimate optical flow, with both supervised and unsupervised training.
We introduced the Spatial Pyramid Network (SpyNet) [ ], the smallest neural network in terms of memory for computing optical flow, trained on synthetic data. SpyNet builds on the idea that an image can be decomposed into a spatial pyramid of progressively lower resolutions, and that computing flow iteratively across the pyramid levels is more efficient than estimating it in a single pass. SpyNet runs in real time and was the best-performing optical flow network at the time of its release.
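The pyramid idea can be illustrated with a minimal, dependency-free sketch. It is not the SpyNet implementation: all function names are hypothetical, images are plain lists of rows, warping uses nearest-neighbour sampling, and the learned per-level network is replaced by a placeholder (zero_residual) that predicts no update. The sketch only shows the control flow: start at the coarsest level, then at each finer level upsample the current flow, warp the second image with it, and add a residual update.

```python
# Hypothetical sketch of coarse-to-fine flow estimation over a spatial pyramid.
# An image is a list of rows of floats; a flow field is a list of rows of (u, v).

def downsample(img):
    """Halve the resolution by averaging each 2x2 block (even sizes assumed)."""
    h, w = len(img), len(img[0])
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
             for x in range(0, w, 2)] for y in range(0, h, 2)]

def build_pyramid(img, levels):
    """Return the pyramid ordered from coarsest to finest."""
    pyramid = [img]
    for _ in range(levels - 1):
        pyramid.append(downsample(pyramid[-1]))
    return pyramid[::-1]

def upsample_flow(flow):
    """Double the resolution by repetition and scale the vectors by 2."""
    out = []
    for row in flow:
        new_row = []
        for (u, v) in row:
            new_row += [(2.0 * u, 2.0 * v)] * 2
        out.append(new_row)
        out.append(list(new_row))
    return out

def warp(img, flow):
    """Backward-warp img by flow (nearest neighbour, clamped at the border)."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            u, v = flow[y][x]
            sx = min(max(int(round(x + u)), 0), w - 1)
            sy = min(max(int(round(y + v)), 0), h - 1)
            row.append(img[sy][sx])
        out.append(row)
    return out

def zero_residual(img1, warped_img2, flow):
    """Placeholder for a learned per-level network; predicts no update."""
    return [[(0.0, 0.0) for _ in row] for row in flow]

def coarse_to_fine_flow(img1, img2, levels=3, estimate_residual=zero_residual):
    """Estimate flow from img1 to img2 by refining it level by level."""
    pyr1 = build_pyramid(img1, levels)
    pyr2 = build_pyramid(img2, levels)
    # Start with zero flow at the coarsest level.
    flow = [[(0.0, 0.0) for _ in row] for row in pyr1[0]]
    for level, (a, b) in enumerate(zip(pyr1, pyr2)):
        if level > 0:
            flow = upsample_flow(flow)
        warped = warp(b, flow)
        residual = estimate_residual(a, warped, flow)
        flow = [[(fu + ru, fv + rv) for (fu, fv), (ru, rv) in zip(frow, rrow)]
                for frow, rrow in zip(flow, residual)]
    return flow
```

Because each level only has to predict a small residual on top of the upsampled flow, the per-level estimator can stay small, which is the intuition behind the efficiency argument above.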
The synthetic data used to train SpyNet has a large domain gap with respect to real image sequences. To counter this, we introduced IPFlow [ ], in which we first train a neural network to perform temporal interpolation and then fine-tune it on a small number of labelled examples. This gives better performance than the original SpyNet.
All of the above methods need supervision in the form of labelled examples, which are difficult to obtain. We therefore introduce geometric constraints that make unsupervised training possible. In the Competitive Collaboration framework [ ], we train four different networks that estimate monocular depth, camera pose, optical flow, and motion segmentation. These networks help train one another synergistically, in a completely unsupervised manner.
Furthermore, we model occlusions and exploit multiple frames of a video sequence for unsupervised learning of optical flow [ ].
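To give a feel for why occlusions matter in unsupervised training, here is a generic sketch of a photometric loss that ignores occluded pixels. It is an assumption for illustration only, not the formulation used in the cited work: the function name, the boolean visibility mask, and the absolute-difference penalty are all hypothetical choices. The point is that a pixel of the first frame with no counterpart in the second frame cannot be matched, so it should not contribute to the loss.

```python
# Hypothetical sketch: photometric loss averaged over visible pixels only.
# img1 and warped_img2 are lists of rows of floats; visible is a list of
# rows of booleans, True where the pixel is assumed to be non-occluded.

def masked_photometric_loss(img1, warped_img2, visible):
    """Mean absolute brightness difference over non-occluded pixels."""
    total, count = 0.0, 0
    for row1, row2, row_mask in zip(img1, warped_img2, visible):
        for p1, p2, is_visible in zip(row1, row2, row_mask):
            if is_visible:
                total += abs(p1 - p2)
                count += 1
    # If every pixel is masked out, there is nothing to penalize.
    return total / count if count else 0.0
```

In this sketch the mask is given as input; how the visibility of each pixel is actually estimated is left open here.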