TY - JOUR
T1 - Adversarial spatio-temporal learning for video deblurring
AU - Zhang, Kaihao
AU - Luo, Wenhan
AU - Zhong, Yiran
AU - Ma, Lin
AU - Liu, Wei
AU - Li, Hongdong
N1 - Publisher Copyright: © 1992-2012 IEEE.
PY - 2019/1
Y1 - 2019/1
AB - Camera shake or target movement often leads to undesired blur effects in videos captured by a hand-held camera. Despite significant efforts devoted to video deblurring research, two major challenges remain: 1) how to model the spatio-temporal characteristics across both the spatial domain (i.e., image plane) and the temporal domain (i.e., neighboring frames) and 2) how to restore sharp image details with respect to the conventionally adopted metric of pixel-wise errors. In this paper, to address the first challenge, we propose a deblurring network (DBLRNet) for spatio-temporal learning by applying a 3D convolution to both the spatial and temporal domains. Our DBLRNet is able to jointly capture spatial and temporal information encoded in neighboring frames, which directly contributes to the improved video deblurring performance. To tackle the second challenge, we leverage the developed DBLRNet as a generator in the generative adversarial network (GAN) architecture and employ a content loss in addition to an adversarial loss for efficient adversarial training. The developed network, which we name deblurring GAN, is tested on two standard benchmarks and achieves state-of-the-art performance.
KW - Spatio-temporal learning
KW - adversarial learning
KW - video deblurring
UR - http://www.scopus.com/inward/record.url?scp=85052647255&partnerID=8YFLogxK
U2 - 10.1109/TIP.2018.2867733
DO - 10.1109/TIP.2018.2867733
M3 - Article
SN - 1057-7149
VL - 28
SP - 291
EP - 301
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
IS - 1
M1 - 8449842
ER -