Submitted:
06 August 2024
Posted:
07 August 2024
Abstract
Keywords:
1. Introduction
- We propose a method for constructing graph data centered on the Transformer’s class token: each output token from the Transformer is treated as a node, and the class token serves as the core node. The approach is versatile and can be integrated into other Transformer-based models.
- We propose a multi-scale graph attention-based person re-identification model that improves recognition accuracy by integrating features from different image patches through a multi-scale graph attention module.
- Extensive experiments on occluded person ReID datasets validate the effectiveness of the proposed method, MSGA-ReID, on occluded ReID tasks.
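The class-token-centered graph in the first contribution can be illustrated with a minimal NumPy sketch. It assumes the class token sits at index 0 of the Transformer output and that every patch token is linked to it in both directions (a star graph). The function names `build_star_graph` and `aggregate_to_core`, and the scaled dot-product scoring, are illustrative assumptions rather than the paper's multi-scale graph attention module.

```python
import numpy as np

def build_star_graph(num_patches: int) -> np.ndarray:
    """Return a (2, 2 * num_patches) edge list linking node 0 (class token) to every patch node."""
    patches = np.arange(1, num_patches + 1)
    core = np.zeros(num_patches, dtype=np.int64)
    # Add edges in both directions so information can flow to and from the core node.
    src = np.concatenate([patches, core])
    dst = np.concatenate([core, patches])
    return np.stack([src, dst])

def aggregate_to_core(tokens: np.ndarray, edges: np.ndarray) -> np.ndarray:
    """Softmax-weighted sum of the core node's neighbours.

    The scaled dot-product score is an assumption made for this sketch.
    """
    src, dst = edges
    neighbours = src[dst == 0]                      # nodes with an edge into the core
    scores = tokens[neighbours] @ tokens[0] / np.sqrt(tokens.shape[1])
    weights = np.exp(scores - scores.max())         # numerically stable softmax
    weights /= weights.sum()
    return weights @ tokens[neighbours]

rng = np.random.default_rng(0)
tokens = rng.normal(size=(1 + 4, 8))                # 1 class token + 4 patch tokens, dim 8
edges = build_star_graph(num_patches=4)
core_feature = aggregate_to_core(tokens, edges)
```

Because the graph is defined only by token indices, the same construction could in principle be attached to any Transformer backbone that emits a class token, which is consistent with the versatility claim above.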
2. Related Works
2.1. Occluded Person Re-Identification
2.2. Transformer-Based ReID
2.3. Graph-Based ReID
3. Methods
3.2. Feature Extraction
3.3. Graph Construction
3.4. Multi-Scale Graph Attention Feature Aggregation
3.5. Global Channel Attention
4. Experiments and Results
4.1. Data Sets
4.2. Data Augmentation
4.3. Implementation
4.4. Results of Data Set Verification
4.5. Comparison of Parameter Tuning
4.6. Visual Experiment
4.6.1. Examples of Inference on the Occluded Data Set
4.6.2. Visual Comparison of Feature Attention
5. Discussion/Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Yan, C.; Pang, G.; Jiao, J.; Bai, X.; Feng, X.; Shen, C. Occluded person re-identification with single-scale global representations. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 11875–11884.
- Chen, Y.C.; Zhu, X.; Zheng, W.S.; Lai, J.H. Person re-identification by camera correlation aware feature augmentation. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 40, 392–408.
- Peng, Y.; Wu, J.; Xu, B.; Cao, C.; Liu, X.; Sun, Z.; He, Z. Deep Learning Based Occluded Person Re-Identification: A Survey. ACM Trans. Multimed. Comput. Commun. Appl. 2023, 20, 1–27.
- Ning, E.; Wang, C.; Zhang, H.; Ning, X.; Tiwari, P. Occluded person re-identification with deep learning: A survey and perspectives. Expert Syst. Appl. 2023, 239, 122419.
- Li, W.; Zou, C.; Wang, M.; Xu, F.; Zhao, J.; Zheng, R.; Cheng, Y.; Chu, W. Dc-former: Diverse and compact transformer for person re-identification. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 1415–1423.
- He, S.; Luo, H.; Wang, P.; Wang, F.; Li, H.; Jiang, W. Transreid: Transformer-based object re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 15013–15022.
- Xu, B.; He, L.; Liang, J.; Sun, Z. Learning feature recovery transformer for occluded person re-identification. IEEE Trans. Image Process. 2022, 31, 4651–4662.
- Zhu, K.; Guo, H.; Zhang, S.; Wang, Y.; Liu, J.; Wang, J.; Tang, M. Aaformer: Auto-aligned transformer for person re-identification. In IEEE Transactions on Neural Networks and Learning Systems; IEEE: Piscataway, NJ, USA, 2023.
- Li, Y.; He, J.; Zhang, T.; Liu, X.; Zhang, Y.; Wu, F. Diverse part discovery: Occluded person re-identification with part-aware transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 2898–2907.
- Gao, H.; Hu, C.; Han, G.; Mao, J.; Huang, W.; Guan, Q. Point-level feature learning based on vision transformer for occluded person re-identification. Image Vis. Comput. 2024, 143, 104929.
- Wang, P.; Zhao, Z.; Su, F.; Zu, X.; Boulgouris, N.V. Horeid: Deep high-order mapping enhances pose alignment for person re-identification. IEEE Trans. Image Process. 2021, 30, 2908–2922.
- Zhu, M.; Zhou, H. EcReID: Enhancing Correlations from Skeleton for Occluded Person Re-Identification. Symmetry 2023, 15, 906.
- Gao, S.; Wang, J.; Lu, H.; Liu, Z. Pose-guided visible part matching for occluded person reid. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11744–11752.
- Chen, P.; Liu, W.; Dai, P.; Liu, J.; Ye, Q.; Xu, M.; Chen, Q.; Ji, R. Occlude them all: Occlusion-aware attention network for occluded person re-id. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 11833–11842.
- Yang, J.; Zhang, C.; Tang, Y.; Li, Z. PAFM: Pose-drive attention fusion mechanism for occluded person re-identification. Neural Comput. Appl. 2022, 34, 8241–8252.
- Hou, R.; Ma, B.; Chang, H.; Gu, X.; Shan, S.; Chen, X. Feature completion for occluded person re-identification. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 4894–4912.
- Wang, T.; Liu, H.; Song, P.; Guo, T.; Shi, W. Pose-guided feature disentangling for occluded person re-identification based on transformer. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 22 February–1 March 2022; Volume 36, pp. 2540–2549.
- Miao, J.; Wu, Y.; Yang, Y. Identifying visible parts via pose estimation for occluded person re-identification. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 4624–4634.
- Zhai, Y.; Han, X.; Ma, W.; Gou, X.; Xiao, G. Pgmanet: Pose-guided mixed attention network for occluded person re-identification. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN), Virtual, 18–22 July 2021; pp. 1–8.
- Sun, Y.; Zheng, L.; Yang, Y.; Tian, Q.; Wang, S. Beyond part models: Person retrieval with refined part pooling (and a strong convolutional baseline). In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 480–496.
- Veličković, P.; Cucurull, G.; Casanova, A.; Romero, A.; Lio, P.; Bengio, Y. Graph attention networks. arXiv 2017, arXiv:1710.10903.
- Wang, Z.; Chen, J.; Chen, H. EGAT: Edge-featured graph attention network. In Artificial Neural Networks and Machine Learning–ICANN 2021, Proceedings of the 30th International Conference on Artificial Neural Networks, Bratislava, Slovakia, 14–17 September 2021; Proceedings, Part I 30; Springer International Publishing: Berlin/Heidelberg, Germany, 2021; pp. 253–264.
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
- Han, K.; Wang, Y.; Guo, J.; Tang, Y.; Wu, E. Vision GNN: An image is worth graph of nodes. Adv. Neural Inf. Process. Syst. 2022, 35, 8291–8303.
- Hu, J.; Shen, L.; Sun, G. Squeeze-and-excitation networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7132–7141.
- Kipf, T.N.; Welling, M. Semi-supervised classification with graph convolutional networks. arXiv 2016, arXiv:1609.02907.
- Miao, J.; Wu, Y.; Liu, P.; Ding, Y.; Yang, Y. Pose-guided feature alignment for occluded person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 542–551.
- Zhuo, J.; Chen, Z.; Lai, J.; Wang, G. Occluded person re-identification. In Proceedings of the 2018 IEEE International Conference on Multimedia and Expo (ICME), San Diego, CA, USA, 23–27 July 2018; pp. 1–6.
- Zheng, L.; Shen, L.; Tian, L.; Wang, S.; Wang, J.; Tian, Q. Scalable person re-identification: A benchmark. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 1116–1124.
- He, L.; Liang, J.; Li, H.; Sun, Z. Deep spatial feature reconstruction for partial person re-identification: Alignment-free approach. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–22 June 2018; pp. 7073–7082.
- He, L.; Wang, Y.; Liu, W.; Zhao, H.; Sun, Z.; Feng, J. Foreground-aware pyramid reconstruction for alignment-free occluded person re-identification. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8450–8459.
- Wang, P.; Ding, C.; Shao, Z.; Hong, Z.; Zhang, S.; Tao, D. Quality-aware part models for occluded person re-identification. IEEE Trans. Multimed. 2022, 25, 3154–3165.
- Jia, M.; Cheng, X.; Zhai, Y.; Lu, S.; Ma, S.; Tian, Y.; Zhang, J. Matching on sets: Conquer occluded person re-identification without alignment. In Proceedings of the AAAI Conference on Artificial Intelligence, Virtual, 2–9 February 2021; Volume 35, pp. 1673–1681.
- Wang, Y.; Wang, L.; Zhou, Y. Bi-level deep mutual learning assisted multi-task network for occluded person re-identification. IET Image Process. 2023, 17, 979–987.





| Category | Method | Occ_Duke mAP | Occ_Duke Rank-1 | Occ_ReID mAP | Occ_ReID Rank-1 | Partial_ReID Rank-1 |
|---|---|---|---|---|---|---|
| Multi-scale CNN | DSR [30] | 30.4 | 40.8 | 62.8 | 72.8 | 50.7 |
| Multi-scale CNN | FPR [31] | - | - | 68.0 | 78.3 | 68.1 |
| Transformer | TransReID [6] | 55.7 | 64.2 | 67.3 | 81.6 | 68.6 |
| Transformer | DC-former [5] | 56.6 | 63.3 | 45.7 | 49.0 | 73.0 |
| Transformer | PAT [9] | 53.6 | 64.5 | 72.1 | 70.2 | - |
| Transformer | PVT [10] | 57.6 | 65.5 | 74.0 | 79.1 | 81.0 |
| Spatial key point graph | PVPM [13] | - | - | 61.2 | 70.4 | 78.3 |
| Spatial key point graph | OAMN [14] | 46.1 | 62.6 | - | - | 86.0 |
| Spatial key point graph | PAFM [15] | 42.3 | 55.1 | 68.0 | 76.4 | 82.5 |
| Spatial key point graph | HOReID [11] | 43.8 | 55.1 | 70.2 | 80.3 | 85.3 |
| Spatial key point graph | RFCNet [16] | 54.5 | 63.9 | - | - | - |
| Spatial key point graph | EcReID [12] | 52.7 | 64.8 | 75.1 | 84.5 | 81.0 |
| Others | QPM [32] | 49.7 | 64.4 | - | - | 81.7 |
| Others | MoS [33] | 55.1 | 66.6 | - | - | - |
| Others | BMM [34] | 55.6 | 63.4 | - | - | 73.7 |
| | ours* | 56.9 | 65.5 | 79.3 | 83.0 | 84.9 |
| | ours | 57.1 | 66.8 | 77.2 | 81.3 | 85.3 |
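
The comparison above is reported in mAP and Rank-1. As a reminder of what these retrieval metrics measure, here is a minimal sketch assuming a precomputed query-to-gallery distance matrix. The function name `rank1_and_map` and the toy data are hypothetical, and the same-camera filtering that ReID benchmarks usually apply is left out for brevity, so this does not reproduce the evaluation protocol behind the reported numbers.

```python
import numpy as np

def rank1_and_map(dist: np.ndarray, query_ids: np.ndarray, gallery_ids: np.ndarray):
    """dist[i, j] is the distance between query i and gallery item j."""
    rank1_hits, average_precisions = [], []
    for i in range(dist.shape[0]):
        order = np.argsort(dist[i])                    # closest gallery items first
        matches = gallery_ids[order] == query_ids[i]   # relevance flag at each rank
        if not matches.any():
            continue                                   # no true match in the gallery; skip
        rank1_hits.append(float(matches[0]))           # is the top result correct?
        hit_ranks = np.flatnonzero(matches) + 1        # 1-based ranks of correct matches
        precisions = np.arange(1, len(hit_ranks) + 1) / hit_ranks
        average_precisions.append(precisions.mean())   # average precision for this query
    return float(np.mean(rank1_hits)), float(np.mean(average_precisions))

# Toy example: 2 queries, 3 gallery items (identities are made up).
dist = np.array([[0.1, 0.9, 0.5],
                 [0.2, 0.8, 0.3]])
query_ids = np.array([7, 3])
gallery_ids = np.array([7, 3, 7])
rank1, mean_ap = rank1_and_map(dist, query_ids, gallery_ids)
```

In the toy data the first query retrieves both of its matches at ranks 1 and 2, while the second query finds its only match at rank 3, so Rank-1 rewards only the first query and mAP also credits the late match of the second.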

| Setting | Occ_Duke mAP | Occ_Duke Rank-1 | Occ_ReID mAP | Occ_ReID Rank-1 | Partial_ReID Rank-1 |
|---|---|---|---|---|---|
| ours (no_aug) | 51.5 | 65.0 | 69.8 | 78.9 | 83.0 |
| ours (aug) | 57.1 | 66.8 | 77.2 | 81.3 | 85.3 |

| Num of MSGA Layers | Occ_Duke mAP | Occ_Duke Rank-1 | Occ_ReID mAP | Occ_ReID Rank-1 | Partial_ReID Rank-1 |
|---|---|---|---|---|---|
| 1 | 56.1 | 66.3 | 76.6 | 80.8 | 85.1 |
| 2 | 56.5 | 65.7 | 76.8 | 80.9 | 84.8 |
| 3 | 57.1 | 66.8 | 77.2 | 81.3 | 85.3 |

| Num of Attention Heads | Occ_Duke mAP | Occ_Duke Rank-1 | Occ_ReID mAP | Occ_ReID Rank-1 | Partial_ReID Rank-1 |
|---|---|---|---|---|---|
| 2 | 53.3 | 63.3 | 76.4 | 81.1 | 83.5 |
| 4 | 56.9 | 65.5 | 79.3 | 83.0 | 84.9 |
| 6 | 56.5 | 66.4 | 77.3 | 82.5 | 85.6 |
| 8 | 57.1 | 66.8 | 77.2 | 81.3 | 85.3 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).