Submitted: 27 October 2023
Posted: 30 October 2023
Abstract
Keywords:
1. Introduction
- We propose a key-point self-feature enhancement module. It applies multiple attention mechanisms to the raw features and the distance features, so that each set abstraction (SA) layer retains as much of the key points' semantic information as possible.
- We propose an initial feature fusion module that extracts distance features from the point cloud and fuses them into the raw features of the point sets. This makes the features of distant points more salient and thereby improves detection accuracy for distant instances.
- We revise the group aggregation module in set abstraction. After the first selection of points within a fixed distance of a key point, we perform a second selection that takes point features into account, which enhances the sampling effect of S-FPS.
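As a rough illustration of the second and third contributions, the sketch below appends a per-point distance feature to the raw features and then performs a two-stage neighbor selection around one key point. It is a minimal pure-Python sketch and not the authors' implementation: the function names (`fuse_distance_feature`, `two_stage_group`), the use of the Euclidean range from the sensor origin as the distance feature, fusion by concatenation, and the cosine-similarity ranking in the second selection are all assumptions made for illustration.

```python
import math


def _norm(v):
    """Euclidean length of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in v))


def _cosine(a, b, eps=1e-12):
    """Cosine similarity; eps guards against all-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (_norm(a) * _norm(b) + eps)


def fuse_distance_feature(xyz, raw_feats):
    """Append each point's range from the origin to its raw features.

    Assumption: the "distance feature" is the Euclidean distance to the
    sensor origin, and fusion is plain concatenation.
    """
    return [list(f) + [_norm(p)] for p, f in zip(xyz, raw_feats)]


def two_stage_group(xyz, feats, key_idx, radius, k):
    """Pick up to k neighbors of one key point in two steps.

    Step 1 keeps points within `radius` of the key point (ball query).
    Step 2 (assumed) ranks those candidates by cosine similarity between
    their features and the key point's features and keeps the top k.
    """
    center = xyz[key_idx]
    # First selection: spatial ball query around the key point.
    candidates = [
        i for i, p in enumerate(xyz)
        if _norm([a - b for a, b in zip(p, center)]) <= radius
    ]
    # Second selection: feature-aware ranking of the candidates.
    key_f = feats[key_idx]
    ranked = sorted(candidates,
                    key=lambda i: _cosine(feats[i], key_f),
                    reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    xyz = [[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [0.0, 0.6, 0.0], [5.0, 5.0, 0.0]]
    raw = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.0]]
    fused = fuse_distance_feature(xyz, raw)
    print(two_stage_group(xyz, fused, key_idx=0, radius=1.0, k=2))
```

In this toy input the fourth point has the same raw features as the key point but lies outside the radius, so the first selection excludes it before the feature-based ranking is applied.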
2. Related Work
2.1. Grid-based methods

2.2. Point-based methods
3. Proposed Methods
3.1. Initial features fusion module
3.1.1. Distance features
3.1.2. Feature fusion
3.2. Key points features enhancement module
3.2.1. Feature Enhancement Module
3.3. Revised group aggregation module
3.3.1. Group aggregation method
3.4. Prediction Head
3.5. Loss
4. Experiment
4.1. Datasets
4.1.1. KITTI Dataset [9]
4.1.2. nuScenes Dataset [10]
4.1.3. Evaluation indicators
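The KITTI result tables report both AP and AP R40. As a hedged illustration of the latter, the sketch below computes a 40-recall-position average precision from a precision/recall curve: the interpolated precision (the best precision at any recall at or above the threshold) is averaged over the recall levels 1/40, 2/40, ..., 1. The function name `ap_r40`, the input format, and the toy curve are made up for illustration; the official KITTI evaluation additionally handles box matching, difficulty filtering, and IoU thresholds, none of which is modeled here.

```python
def ap_r40(recalls, precisions):
    """Average interpolated precision over 40 recall positions.

    `recalls` and `precisions` are equal-length lists describing the
    points of a precision/recall curve (illustrative input format).
    """
    if len(recalls) != len(precisions):
        raise ValueError("recalls and precisions must have equal length")
    total = 0.0
    for i in range(1, 41):
        r = i / 40
        # Interpolated precision: best precision at recall >= r, or 0
        # if the detector never reaches that recall level.
        reachable = [p for rc, p in zip(recalls, precisions) if rc >= r - 1e-9]
        total += max(reachable, default=0.0)
    return total / 40


if __name__ == "__main__":
    # Toy curve: precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0.
    print(ap_r40([0.5, 1.0], [1.0, 0.5]))
```

Because recall 0 is not among the sampled positions, a detector is not credited for a trivially high precision at zero recall.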
4.2. Experimental Setting
4.2.1. Setting in KITTI dataset
4.2.2. Setting in nuScenes dataset
4.3. Results
4.4. Enhancement Validation
4.5. Ablation Experiment
4.5.1. Initial feature fusion module
4.5.2. Key points self-features enhancement module
4.5.3. Revised group aggregation module
4.6. Detection effect
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Yang, Z.; Sun, Y.; Liu, S.; et al. 3DSSD: Point-based 3D single stage object detector. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020; pp. 11040–11048.
2. Chen, C.; Chen, Z.; Zhang, J.; et al. SASA: Semantics-augmented set abstraction for point-based 3D object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, 2022; Volume 36(1), pp. 221–229.
3. Zhou, Y.; Tuzel, O. VoxelNet: End-to-end learning for point cloud based 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018; pp. 4490–4499.
4. Shi, S.; Wang, X.; Li, H. PointRCNN: 3D object proposal generation and detection from point cloud. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; pp. 770–779.
5. Lang, A.H.; Vora, S.; Caesar, H.; et al. PointPillars: Fast encoders for object detection from point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019; pp. 12697–12705.
6. Shi, G.; Li, R.; Ma, C. PillarNet: High-performance pillar-based 3D object detection. arXiv 2022, arXiv:2205.07403.
7. Qi, C.R.; Su, H.; Mo, K.; et al. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017; pp. 652–660.
8. Qi, C.R.; Yi, L.; Su, H.; et al. PointNet++: Deep hierarchical feature learning on point sets in a metric space. Adv. Neural Inf. Process. Syst. 2017, 30.
9. Geiger, A.; Lenz, P.; Stiller, C.; et al. Vision meets robotics: The KITTI dataset. Int. J. Robot. Res. 2013, 32, 1231–1237.
10. Caesar, H.; Bankiti, V.; Lang, A.H.; et al. nuScenes: A multimodal dataset for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020; pp. 11621–11631.
11. OpenPCDet Development Team. OpenPCDet: An open-source toolbox for 3D object detection from point clouds. 2020.
12. Ding, Z.; Han, X.; Niethammer, M. VoteNet: A deep learning label fusion method for multi-atlas segmentation. In Medical Image Computing and Computer Assisted Intervention – MICCAI 2019: 22nd International Conference, Shenzhen, China, October 13–17, 2019, Proceedings, Part III; Springer International Publishing, 2019; pp. 202–210.
13. He, C.; Li, R.; Li, S.; et al. Voxel set transformer: A set-to-set approach to 3D object detection from point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022; pp. 8417–8427.
14. Yan, Y.; Mao, Y.; Li, B. SECOND: Sparsely embedded convolutional detection. Sensors 2018, 18, 3337.
15. Vaswani, A.; Shazeer, N.; Parmar, N.; et al. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
16. Lee, J.; Lee, Y.; Kim, J.; et al. Set transformer: A framework for attention-based permutation-invariant neural networks. In International Conference on Machine Learning; PMLR, 2019; pp. 3744–3753.
17. Mao, J.; Xue, Y.; Niu, M.; et al. Voxel transformer for 3D object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021; pp. 3164–3173.
18. Simonelli, A.; Bulo, S.R.; Porzi, L.; et al. Disentangling monocular 3D object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019; pp. 1991–1999.
19. Salton, G.; McGill, M.J. Introduction to Modern Information Retrieval; McGraw-Hill, Inc.: New York, NY, USA, 1986.
20. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.





| Methods | Car 3D AP (%), Easy | Car 3D AP (%), Moderate | Car 3D AP (%), Hard |
|---|---|---|---|
| SECOND | 84.656 | 75.966 | 68.712 |
| VoxelNet | 77.478 | 65.119 | 57.736 |
| PointPillars | 82.588 | 74.317 | 68.995 |
| PointRCNN | 89.023 | 78.246 | 77.554 |
| Voxel Set Transformer | 88.869 | 78.766 | 77.576 |
| SASA | 89.108 | 78.847 | 77.588 |
| SAE3D | 89.059 | 79.391 | 78.236 |

| Methods | NDS | mAP | Car | Truck | Bus | Trailer | Construction vehicle | Pedestrian | Motorcycle | Bicycle | Traffic cone | Barrier |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PointPillars | 45.2 | 25.8 | 70.3 | 32.9 | 44.9 | 18.5 | 4.2 | 46.8 | 14.8 | 0.6 | 7.5 | 21.3 |
| 3DSSD | 51.7 | 34.5 | 75.9 | 34.7 | 60.7 | 21.4 | 10.6 | 59.2 | 25.5 | 7.4 | 14.8 | 25.5 |
| SASA | 55.3 | 36.1 | 71.7 | 42.2 | 63.5 | 29.6 | 12.5 | 62.6 | 27.5 | 9.1 | 12.2 | 30.4 |
| SAE3D | 58.6 | 37.8 | 72.4 | 44.1 | 62.7 | 31.2 | 15.9 | 60.4 | 30.1 | 12.8 | 10.1 | 31.6 |

| Methods | Car 3D AP (%), Easy | Car 3D AP (%), Moderate | Car 3D AP (%), Hard |
|---|---|---|---|
| SASA | 89.108 | 78.847 | 77.588 |
| SASA + SAE3D | 89.059 | 79.391 | 78.236 |
| Improvement | -0.049 | +0.544 | +0.648 |
| PointRCNN | 89.023 | 78.246 | 77.554 |
| PointRCNN + SAE3D | 89.160 | 78.839 | 78.439 |
| Improvement | +0.137 | +0.593 | +0.885 |
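
Each Improvement row in the table above is the element-wise difference between the SAE3D-augmented detector and its baseline. The short check below recomputes those differences from the table values; the helper name `improvement` is illustrative only.

```python
def improvement(baseline, enhanced, digits=3):
    """Per-difficulty AP gain (enhanced minus baseline), rounded."""
    return [round(e - b, digits) for b, e in zip(baseline, enhanced)]


# Car 3D AP (%) for Easy, Moderate, Hard, copied from the table.
sasa = [89.108, 78.847, 77.588]
sasa_sae3d = [89.059, 79.391, 78.236]
pointrcnn = [89.023, 78.246, 77.554]
pointrcnn_sae3d = [89.160, 78.839, 78.439]

print(improvement(sasa, sasa_sae3d))
print(improvement(pointrcnn, pointrcnn_sae3d))
```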

| +I | +K | +F | Car 3D AP (%) Easy | Car 3D AP (%) Mod | Car 3D AP (%) Hard | Car BBOX AP (%) Easy | Car BBOX AP (%) Mod | Car BBOX AP (%) Hard | Car BEV AP (%) Easy | Car BEV AP (%) Mod | Car BEV AP (%) Hard | Car AOS AP (%) Easy | Car AOS AP (%) Mod | Car AOS AP (%) Hard |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | 89.108 | 78.847 | 77.588 | 96.742 | 89.855 | 89.036 | 90.199 | 87.855 | 85.993 | 96.71 | 89.75 | 88.88 |
| √ | | | 88.971 | 79.246 | 78.334 | 96.473 | 89.847 | 89.163 | 90.317 | 88.420 | 87.342 | 96.44 | 89.81 | 89.07 |
| | √ | | 89.213 | 79.324 | 78.114 | 96.813 | 90.171 | 89.412 | 89.876 | 89.397 | 86.976 | 96.54 | 90.08 | 89.11 |
| | | √ | 89.167 | 79.398 | 78.399 | 96.668 | 90.041 | 89.287 | 90.149 | 88.112 | 87.041 | 96.64 | 89.98 | 89.12 |
| √ | √ | √ | 89.059 | 79.391 | 78.236 | 96.758 | 90.169 | 89.382 | 89.978 | 88.382 | 86.824 | 96.71 | 90.10 | 89.22 |

| +I | +K | +F | Car 3D AP R40 (%) Easy | Car 3D AP R40 (%) Mod | Car 3D AP R40 (%) Hard | Car BBOX AP R40 (%) Easy | Car BBOX AP R40 (%) Mod | Car BBOX AP R40 (%) Hard | Car BEV AP R40 (%) Easy | Car BEV AP R40 (%) Mod | Car BEV AP R40 (%) Hard | Car AOS AP R40 (%) Easy | Car AOS AP R40 (%) Mod | Car AOS AP R40 (%) Hard |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | 91.592 | 80.705 | 77.902 | 98.289 | 92.972 | 92.104 | 93.277 | 89.128 | 86.465 | 98.26 | 92.85 | 91.92 |
| √ | | | 91.457 | 83.067 | 80.369 | 98.111 | 94.583 | 92.469 | 95.055 | 91.034 | 88.832 | 98.09 | 94.52 | 92.35 |
| | √ | | 91.432 | 82.913 | 78.956 | 98.023 | 95.023 | 92.659 | 93.124 | 89.223 | 88.624 | 98.21 | 94.89 | 92.39 |
| | | √ | 91.555 | 83.254 | 80.484 | 98.097 | 94.948 | 92.637 | 93.197 | 89.423 | 88.722 | 98.08 | 94.85 | 92.44 |
| √ | √ | √ | 91.426 | 83.236 | 80.191 | 98.266 | 95.036 | 92.616 | 93.014 | 90.902 | 88.525 | 98.23 | 94.93 | 92.42 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).