Abstract
Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent blur is a challenging task. Existing blur map-based deblurring methods have demonstrated promising results. In this paper, we propose, to the best of our knowledge, the first framework that introduces the contrastive language-image pre-training framework (CLIP) to accurately estimate the blur map from a DP pair unsu-pervisedly. To achieve this, we first carefully design text prompts to enable CLIP to understand blur-related geo-metric prior knowledge from the DP pair. Then, we pro-pose a format to input a stereo DP pair to CLIP without any fine-tuning, despite the fact that CLIP is pre-trained on monocular images. Given the estimated blur map, we intro-duce a blur-prior attention block, a blur-weighting loss, and a blur-aware loss to recover the all-in-focus image. Our method achieves state-of-the-art performance in extensive experiments (see Fig. 1).
| Original language | English |
|---|---|
| Pages (from-to) | 24078-24087 |
| Number of pages | 10 |
| Journal | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition |
| DOIs | |
| Publication status | Published - 2024 |
| Event | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States Duration: 16 Jun 2024 → 22 Jun 2024 |
Fingerprint
Dive into the research topics of 'LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network'. Together they form a unique fingerprint.Cite this
- APA
- Author
- BIBTEX
- Harvard
- Standard
- RIS
- Vancouver