TY - GEN
T1 - EpO-Net: Exploiting Geometric Constraints on Dense Trajectories for Motion Saliency
T2 - 2020 IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2020
AU - Faisal, Muhammad
AU - Akhter, Ijaz
AU - Ali, Mohsen
AU - Hartley, Richard
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/3
Y1 - 2020/3
N2 - The existing approaches for salient motion segmentation are unable to explicitly learn geometric cues and often give false detections on prominent static objects. We exploit multiview geometric constraints to avoid such shortcomings. To handle a nonrigid background, such as a sea, we also propose a robust fusion mechanism between motion- and appearance-based features. We find dense trajectories, covering every pixel in the video, and propose trajectory-based epipolar distances to distinguish between background and foreground regions. Trajectory epipolar distances are data-independent and can be readily computed given a few feature correspondences between the images. We show that by combining epipolar distances with optical flow, a powerful motion network can be learned. To enable the network to leverage both of these features, we propose a simple mechanism we call input-dropout. Among motion-only networks, we outperform the previous state of the art on the DAVIS-2016 dataset by 5.2% in mean IoU score. By robustly fusing our motion network with an appearance network using the input-dropout mechanism, we also outperform the previous methods on the DAVIS-2016, DAVIS-2017, and SegTrack-v2 datasets.
AB - The existing approaches for salient motion segmentation are unable to explicitly learn geometric cues and often give false detections on prominent static objects. We exploit multiview geometric constraints to avoid such shortcomings. To handle a nonrigid background, such as a sea, we also propose a robust fusion mechanism between motion- and appearance-based features. We find dense trajectories, covering every pixel in the video, and propose trajectory-based epipolar distances to distinguish between background and foreground regions. Trajectory epipolar distances are data-independent and can be readily computed given a few feature correspondences between the images. We show that by combining epipolar distances with optical flow, a powerful motion network can be learned. To enable the network to leverage both of these features, we propose a simple mechanism we call input-dropout. Among motion-only networks, we outperform the previous state of the art on the DAVIS-2016 dataset by 5.2% in mean IoU score. By robustly fusing our motion network with an appearance network using the input-dropout mechanism, we also outperform the previous methods on the DAVIS-2016, DAVIS-2017, and SegTrack-v2 datasets.
UR - http://www.scopus.com/inward/record.url?scp=85085504427&partnerID=8YFLogxK
U2 - 10.1109/WACV45572.2020.9093589
DO - 10.1109/WACV45572.2020.9093589
M3 - Conference contribution
T3 - Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
SP - 1873
EP - 1882
BT - Proceedings - 2020 IEEE Winter Conference on Applications of Computer Vision, WACV 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 1 March 2020 through 5 March 2020
ER -