TY - JOUR
T1 - MSNet
T2 - Multi-Scale Network for Object Detection in Remote Sensing Images
AU - Gao, Tao
AU - Xia, Shilin
AU - Liu, Mengkun
AU - Zhang, Jing
AU - Chen, Ting
AU - Li, Ziqi
N1 - DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.
PY - 2025/2
Y1 - 2025/2
N2 - Remote sensing object detection (RSOD) encounters challenges in effectively extracting features of small objects in remote sensing images (RSIs). To alleviate these problems, we propose a Multi-Scale Network for Object Detection in Remote Sensing Images (MSNet) with multi-dimension feature information. Firstly, we design a Partial and Pointwise Convolution Extraction Module (P2CEM) to capture object features in the spatial and channel dimensions simultaneously. Secondly, we design a Local and Global Information Fusion Module (LGIFM), whose local information stack and context modeling module capture texture information and semantic information, respectively, within the multi-scale feature maps. Moreover, the LGIFM enhances feature representation for small objects and objects within complex backgrounds by allocating weights between local and global information. Finally, we introduce the Local and Global Information Fusion Pyramid (LGIFP). With the aid of the LGIFM, the LGIFP enhances the feature representation of small object information, which contributes to dense connections across the multi-scale feature maps. Extensive experiments validate that our proposed method outperforms state-of-the-art methods. Specifically, MSNet achieves mean average precision (mAP) scores of 75.3%, 93.39%, 96.00%, and 95.62% on the DIOR, HRRSD, NWPU VHR-10, and RSOD datasets, respectively.
AB - Remote sensing object detection (RSOD) encounters challenges in effectively extracting features of small objects in remote sensing images (RSIs). To alleviate these problems, we propose a Multi-Scale Network for Object Detection in Remote Sensing Images (MSNet) with multi-dimension feature information. Firstly, we design a Partial and Pointwise Convolution Extraction Module (P2CEM) to capture object features in the spatial and channel dimensions simultaneously. Secondly, we design a Local and Global Information Fusion Module (LGIFM), whose local information stack and context modeling module capture texture information and semantic information, respectively, within the multi-scale feature maps. Moreover, the LGIFM enhances feature representation for small objects and objects within complex backgrounds by allocating weights between local and global information. Finally, we introduce the Local and Global Information Fusion Pyramid (LGIFP). With the aid of the LGIFM, the LGIFP enhances the feature representation of small object information, which contributes to dense connections across the multi-scale feature maps. Extensive experiments validate that our proposed method outperforms state-of-the-art methods. Specifically, MSNet achieves mean average precision (mAP) scores of 75.3%, 93.39%, 96.00%, and 95.62% on the DIOR, HRRSD, NWPU VHR-10, and RSOD datasets, respectively.
KW - Deep feature fusion
KW - Feature representation
KW - Multi-scale object detection
KW - Small object detection
UR - http://www.scopus.com/inward/record.url?scp=85203633759&partnerID=8YFLogxK
U2 - 10.1016/j.patcog.2024.110983
DO - 10.1016/j.patcog.2024.110983
M3 - Article
AN - SCOPUS:85203633759
SN - 0031-3203
VL - 158
SP - 110983
JO - Pattern Recognition
JF - Pattern Recognition
M1 - 110983
ER -