Abstract
Because real multi-agent data are scarce and labeling is time-consuming, existing multi-agent cooperative perception algorithms are usually trained and validated on simulated sensor data. However, perception performance degrades when these simulation-trained models are deployed in the real world, owing to the significant domain gap between simulated and real data. In this paper, we propose the first Simulation-to-Reality transfer learning framework for multi-agent cooperative perception using a novel Vision Transformer, named S2R-ViT, which considers both the Deployment Gap and the Feature Gap between simulated and real data. We investigate the effects of these two types of domain gap and propose a novel uncertainty-aware vision transformer to effectively relieve the Deployment Gap, together with an agent-based feature adaptation module with inter-agent and ego-agent discriminators to reduce the Feature Gap. Our extensive experiments on the public multi-agent cooperative perception datasets OPV2V and V2V4Real demonstrate that the proposed S2R-ViT can effectively bridge the gap from simulation to reality and significantly outperforms other methods for point cloud-based 3D object detection.
| Original language | English |
|---|---|
| Title of host publication | Proceedings - IEEE International Conference on Robotics and Automation |
| Place of Publication | USA |
| Publisher | Institute of Electrical and Electronics Engineers Inc. |
| Pages | 16374-16380 |
| Number of pages | 7 |
| ISBN (Electronic) | 9798350384574 |
| DOIs | |
| State | Published - Jan 1 2024 |
| Event | 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 - Yokohama, Japan. Duration: May 13 2024 → May 17 2024 |
Conference
| Conference | 2024 IEEE International Conference on Robotics and Automation, ICRA 2024 |
|---|---|
| Country/Territory | Japan |
| City | Yokohama |
| Period | 05/13/24 → 05/17/24 |