
Co-Fix3D: Enhancing 3D Object Detection With Collaborative Refinement

  • Wenxuan Li
  • Qin Zou
  • Chi Chen
  • Bo Du
  • Long Chen
  • Jian Zhou
  • Hongkai Yu
  • Wuhan University
  • Institute of Automation, Chinese Academy of Sciences
  • Cleveland State University

Research output: Contribution to journal › Article › peer-review

Abstract

3D object detection in driving scenarios is challenging because sensor noise, occlusions, and the inherent sparsity of LiDAR point clouds can cause key features to be lost or incomplete, which in turn degrades perception performance. To address these challenges, we propose Co-Fix3D, a detection framework that integrates Local and Global Enhancement (LGE) modules to refine Bird's Eye View (BEV) features. The LGE module employs the Discrete Wavelet Transform (DWT) to refine local features at a fine scale, which helps capture frequency details and subtle variations in the environment, and incorporates an attention mechanism to enhance global feature representations across the entire scene. Moreover, we adopt multi-head LGE modules, each concentrating on targets of a different detection difficulty, further improving overall perception performance. On the nuScenes dataset, Co-Fix3D achieves a new state-of-the-art (SOTA) result among competing methods with 69.4% mAP and 73.5% NDS, while on the multimodal benchmark it achieves 72.3% mAP and 74.7% NDS.
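
As a rough illustration of the DWT step mentioned in the abstract, the sketch below applies a single-level 2D Haar transform to a BEV-shaped feature map and reconstructs it. It is not the authors' LGE module: the Haar wavelet, the function names `haar_dwt2` and `haar_idwt2`, and the (channels, height, width) layout are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np

def haar_dwt2(x):
    """Single-level 2D Haar DWT over the last two axes (both must be even).

    Returns one low-frequency band (ll) and three detail bands, each with
    half the height and width of the input.
    """
    a = x[..., 0::2, 0::2]  # top-left sample of every 2x2 block
    b = x[..., 0::2, 1::2]  # top-right
    c = x[..., 1::2, 0::2]  # bottom-left
    d = x[..., 1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # local average (coarse content)
    lh = (a - b + c - d) / 2.0  # differences along the width axis
    hl = (a + b - c - d) / 2.0  # differences along the height axis
    hh = (a - b - c + d) / 2.0  # diagonal differences
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the full-resolution array."""
    h, w = ll.shape[-2], ll.shape[-1]
    out = np.empty(ll.shape[:-2] + (2 * h, 2 * w), dtype=np.result_type(ll, 1.0))
    out[..., 0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    out[..., 0::2, 1::2] = (ll - lh + hl - hh) / 2.0
    out[..., 1::2, 0::2] = (ll + lh - hl - hh) / 2.0
    out[..., 1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return out

# Hypothetical BEV feature map: 4 channels on an 8x8 grid.
rng = np.random.default_rng(0)
bev = rng.normal(size=(4, 8, 8))

ll, lh, hl, hh = haar_dwt2(bev)
restored = haar_idwt2(ll, lh, hl, hh)
```

A refinement module would typically process the four sub-bands (for example with small convolutions) before the inverse transform; that stage is omitted here because the abstract does not describe it.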
Original language: English
Pages (from-to): 4970-4977
Number of pages: 8
Journal: IEEE Robotics and Automation Letters
Volume: 10
Issue number: 5
State: Published - Jan 1 2025

Keywords

  • deep learning methods, sensor fusion
  • Object detection, segmentation and categorization
