Learning to Select Views for Efficient Multi-View Understanding

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

3 Citations (Scopus)

Abstract

Multiple camera view (multi-view) setups have proven useful in many computer vision applications. However, the high computational cost associated with multiple views creates a significant challenge for end devices with limited computational resources. In modern CPUs, pipelining breaks a longer job into steps and enables parallelism over sequential steps from multiple jobs. Inspired by this, we study selective view pipelining for efficient multi-view understanding, which breaks the computation of multiple views into steps and computes only the most helpful views/steps in a parallel manner for the best efficiency. To this end, we use reinforcement learning to learn a very light view selection module that analyzes the target object or scenario from initial views and selects the next-best-view for recognition or detection for pipeline computation. Experimental results on multi-view classification and detection tasks show that our approach achieves promising performance while using only 2 or 3 out of N available views, significantly reducing computational costs while maintaining parallelism over GPU through selective view pipelining. Code available at https://github.com/hou-yz/MVSelect.
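
The selection loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that): the function names `select_views` and `toy_score_fn` are hypothetical, and the learned reinforcement-learning policy is replaced here by a hand-written toy scoring rule that simply prefers cameras far from those already used, assuming the cameras sit on a ring. The sketch only shows the control flow: start from an initial view, score the remaining views, mask the ones already processed, and pick the next-best view until a budget of 2 or 3 views is reached.

```python
from typing import Callable, List, Sequence


def select_views(
    num_views: int,
    initial_view: int,
    budget: int,
    score_fn: Callable[[Sequence[int]], List[float]],
) -> List[int]:
    """Greedily pick `budget` views, starting from `initial_view`.

    `score_fn` is a hypothetical stand-in for a learned selection policy:
    it receives the views chosen so far and returns one score per view.
    """
    if not 0 <= initial_view < num_views:
        raise ValueError("initial_view out of range")
    if not 1 <= budget <= num_views:
        raise ValueError("budget must be between 1 and num_views")

    chosen = [initial_view]
    while len(chosen) < budget:
        scores = score_fn(chosen)
        # Mask views that were already processed so they are never re-selected.
        candidates = [v for v in range(num_views) if v not in chosen]
        # Next-best view: the highest-scoring remaining candidate.
        chosen.append(max(candidates, key=lambda v: scores[v]))
    return chosen


def toy_score_fn(num_views: int) -> Callable[[Sequence[int]], List[float]]:
    """Toy policy (assumption): favour views far, on a camera ring, from chosen ones."""

    def ring_distance(a: int, b: int) -> int:
        d = abs(a - b)
        return min(d, num_views - d)

    def score(chosen: Sequence[int]) -> List[float]:
        return [
            float(min(ring_distance(v, c) for c in chosen))
            for v in range(num_views)
        ]

    return score


if __name__ == "__main__":
    # Use 3 of 8 available views, starting from camera 0.
    print(select_views(num_views=8, initial_view=0, budget=3, score_fn=toy_score_fn(8)))
```

In a real system the scoring function would be a small network conditioned on features from the views already processed, and only the selected views would be passed through the expensive backbone.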

Original language: English
Title of host publication: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Publisher: IEEE Computer Society
Pages: 20135-20144
Number of pages: 10
ISBN (Electronic): 9798350353006
DOIs
Publication status: Published - 2024
Event: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 - Seattle, United States
Duration: 16 Jun 2024 to 22 Jun 2024

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
ISSN (Print): 1063-6919

Conference

Conference: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Country/Territory: United States
City: Seattle
Period: 16/06/24 to 22/06/24
