Submitted: 26 September 2024
Posted: 26 September 2024
Abstract
Keywords:
1. Introduction
2. Datasets and Algorithms
2.1. Summary of Datasets and Algorithms
2.2. Deep-CAPTCHA and Adaptive-CAPTCHA
2.3. M-CAPTCHA and P-CAPTCHA
2.4. Color Shift Algorithms
2.5. AE
2.6. LSKA
2.7. Lightweight Attention Mechanisms
3. Methods
3.1. VCS
3.2. Sim-VCS
3.3. Dilated-VCS
3.4. AE-LSKA
4. Results and Discussions
4.1. Experimental Analysis of VCS
4.2. Experimental Analysis of AE-LSKA
5. Conclusions
Author Contributions
Data Availability Statement
Conflicts of Interest

| Dataset/ Model | Description |
|---|---|
| Adaptive-CAPTCHA | Strong CAPTCHA recognizer |
| Deep-CAPTCHA | Weak CAPTCHA recognizer |
| M-CAPTCHA | Complex CAPTCHA dataset |
| P-CAPTCHA | Simple CAPTCHA dataset |
| AE-LSKA | New Feature extraction module |
| VCS, Sim-VCS, Dilated-VCS | New color shift methods |

| Type | Input | Kernel | Dilated Rate | Stride | Padding | PARAMs | FLOPs |
|---|---|---|---|---|---|---|---|
| Dilated-VCS | (64,192) | (4,4) | (7,21) | (22,64) | (1,0) | 450 | 5514 |
| Dilated-VCS | (64,192) | (8,8) | (3,9) | (22,64) | (1,0) | 1314 | 21066 |
| Dilated-VCS | (64,192) | (22,22) | (1,3) | (22,64) | (1,0) | 8874 | 156978 |
| VCS | (64,192) | (64,64) | (1,1) | (64,64) | (0,0) | 162 | 73728 |
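
The geometry in the table above can be checked with standard convolution arithmetic. The sketch below is illustrative only and is not taken from the paper's code. It assumes that the value following the input size in each row is the kernel size, that every tuple is ordered (height, width), and that padding is applied symmetrically on both sides. The helper names `effective_extent`, `conv_output_size`, and `summarize` are hypothetical.

```python
# Illustrative check of the table's geometry; not the authors' implementation.
# Assumptions: tuples are (height, width); padding is symmetric on both sides.

def effective_extent(kernel: int, dilation: int) -> int:
    """Number of input pixels spanned by a dilated kernel along one axis."""
    return dilation * (kernel - 1) + 1


def conv_output_size(size: int, kernel: int, dilation: int,
                     stride: int, padding: int) -> int:
    """Output length along one axis for a strided, dilated convolution."""
    return (size + 2 * padding - effective_extent(kernel, dilation)) // stride + 1


INPUT_HW = (64, 192)

# (label, kernel, dilation (h, w), stride (h, w), padding (h, w)),
# transcribed from the table rows.
ROWS = [
    ("Dilated-VCS k=4", 4, (7, 21), (22, 64), (1, 0)),
    ("Dilated-VCS k=8", 8, (3, 9), (22, 64), (1, 0)),
    ("Dilated-VCS k=22", 22, (1, 3), (22, 64), (1, 0)),
    ("VCS", 64, (1, 1), (64, 64), (0, 0)),
]


def summarize(rows=ROWS, input_hw=INPUT_HW):
    """Return (label, effective extent (h, w), output grid (h, w)) per row."""
    results = []
    for label, kernel, dilation, stride, padding in rows:
        extent = tuple(effective_extent(kernel, d) for d in dilation)
        grid = tuple(
            conv_output_size(n, kernel, d, s, p)
            for n, d, s, p in zip(input_hw, dilation, stride, padding)
        )
        results.append((label, extent, grid))
    return results


if __name__ == "__main__":
    for label, extent, grid in summarize():
        print(f"{label}: extent={extent}, output grid={grid}")
```

Under these assumptions, all three Dilated-VCS configurations span the same 22 x 64 window, which equals their stride, so the windows tile the padded input without overlap. The smaller kernels sample that window more sparsely, which is consistent with their lower PARAMs and FLOPs in the table.
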

| / | Conv 5x5 | AE-LSKA5 | AE-LSKA7 | AE-LSKA11 | AE-LSKA15 | AE-LSKA23 |
|---|---|---|---|---|---|---|
| Kernel (LSKA) | / | | | | | |
| Dilation | / | 2 | 2 | 2 | 3 | 3 |
| PARAMs | | | | | | |
| FLOPs | | | | | | |

| Name | Model/Dataset/Specification | Version |
|---|---|---|
| Graphics Processing Unit (GPU) | NVIDIA GeForce RTX 3060 12GB | - |
| Central Processing Unit (CPU) | Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz 1.80 GHz | - |
| PyTorch | - | 2.2 |
| Python | - | 3.11.9 |
| Text CAPTCHA Datasets | M-CAPTCHA (5000 images), P-CAPTCHA (3000 images) | - |
| EPOCHs | 130 | - |
| Text CAPTCHA Recognizers | Deep-CAPTCHA, Adaptive-CAPTCHA | - |
| Metrics | AASR, loss, PARAMs, FLOPs | - |

| Algorithm | Dilated Kernel | Dropout | Model | Train / Test AASR (%), P-CAPTCHA | Train / Test AASR (%), M-CAPTCHA |
|---|---|---|---|---|---|
| Dilated-VCS | 4 | 0.0 | Deep-CAPTCHA | 96.9 / 37.0 | 95.0 / 26.2 |
| Dilated-VCS | 4 | 0.3 | Deep-CAPTCHA | 97.2 / 36.1 | 95.6 / 26.6 |
| Dilated-VCS | 8 | 0.0 | Deep-CAPTCHA | 96.4 / 37.8 | 94.6 / 27.2 |
| Dilated-VCS | 8 | 0.3 | Deep-CAPTCHA | 97.3 / 37.9 | 94.8 / 25.7 |
| Dilated-VCS | 22 | 0.0 | Deep-CAPTCHA | 97.2 / 36.8 | 93.3 / 25.2 |
| Dilated-VCS | 22 | 0.3 | Deep-CAPTCHA | 97.0 / 39.1 | 93.7 / 24.0 |
| Dilated-VCS | 4 | 0.0 | Adaptive-CAPTCHA | 99.9 / 73.2 | 99.9 / 69.2 |
| Dilated-VCS | 4 | 0.3 | Adaptive-CAPTCHA | 99.9 / 73.1 | 99.9 / 72.0 |
| Dilated-VCS | 8 | 0.0 | Adaptive-CAPTCHA | 99.9 / 71.6 | 99.9 / 70.1 |
| Dilated-VCS | 8 | 0.3 | Adaptive-CAPTCHA | 99.9 / 74.0 | 99.9 / 69.0 |
| Dilated-VCS | 22 | 0.0 | Adaptive-CAPTCHA | 99.9 / 67.6 | 99.9 / 57.6 |
| Dilated-VCS | 22 | 0.3 | Adaptive-CAPTCHA | 99.9 / 68.2 | 99.9 / 55.6 |
| VCS | - | - | Deep-CAPTCHA | 94.0 / 35.8 | 93.7 / 26.6 |
| VCS | - | - | Adaptive-CAPTCHA | 99.9 / 74.5 | 99.9 / 78.0 |
| Sim-VCS | - | - | Deep-CAPTCHA | 95.6 / 35.6 | 93.4 / 26.2 |
| Sim-VCS | - | - | Adaptive-CAPTCHA | 99.9 / 72.4 | 99.9 / 76.9 |

| Algorithm | Model | PARAMs | FLOPs | AASR (%) |
|---|---|---|---|---|
| AE + LSKA (k=7) | Deep-CAPTCHA | 6.41M | 176.02M | 28.4 |
| AE + LSKA (k=11) | Deep-CAPTCHA | 6.41M | 178.38M | 32.9 |
| AE + CBAM (ratio=8) | Deep-CAPTCHA | 6.40M | 148.37M | 25.0 |
| AE + CBAM (ratio=16) | Deep-CAPTCHA | 6.40M | 148.37M | 31.2 |
| AE + ECA (ratio=2) | Deep-CAPTCHA | 6.40M | 146.72M | 26.1 |
| AE + ECA (ratio=4) | Deep-CAPTCHA | 6.40M | 146.72M | 21.1 |
| AE + GC (ATT + ADD) | Deep-CAPTCHA | 6.41M | 146.78M | 28.7 |
| AE + GC (AVG + MUL) | Deep-CAPTCHA | 6.41M | 146.73M | 23.8 |
| AE + SA (groups=8) | Deep-CAPTCHA | 6.40M | 146.43M | 25.4 |
| AE + SA (groups=16) | Deep-CAPTCHA | 6.40M | 146.45M | 22.9 |
| AE + SE (ratio=8) | Deep-CAPTCHA | 6.40M | 146.72M | 20.9 |
| AE + SE (ratio=16) | Deep-CAPTCHA | 6.40M | 146.72M | 23.0 |
| AE + PNA | Deep-CAPTCHA | 6.48M | 379.51M | 48.9 |
| AE + LSKA (k=7) | Adaptive-CAPTCHA | 3.39M | 229.59M | 89.8 |
| AE + LSKA (k=11) | Adaptive-CAPTCHA | 3.39M | 232.10M | 88.3 |
| AE + CBAM (ratio=8) | Adaptive-CAPTCHA | 3.31M | 195.32M | 85.4 |
| AE + CBAM (ratio=16) | Adaptive-CAPTCHA | 3.30M | 195.29M | 77.3 |
| AE + ECA (ratio=2) | Adaptive-CAPTCHA | 3.29M | 193.60M | 82.6 |
| AE + ECA (ratio=4) | Adaptive-CAPTCHA | 3.29M | 193.60M | 85.9 |
| AE + GC (ATT + ADD) | Adaptive-CAPTCHA | 3.38M | 193.74M | 66.3 |
| AE + GC (AVG + MUL) | Adaptive-CAPTCHA | 3.38M | 193.69M | 81.8 |
| AE + SA (groups=8) | Adaptive-CAPTCHA | 3.29M | 193.29M | 84.3 |
| AE + SA (groups=16) | Adaptive-CAPTCHA | 3.29M | 193.31M | 84.3 |
| AE + SE (ratio=8) | Adaptive-CAPTCHA | 3.31M | 193.62M | 84.0 |
| AE + SE (ratio=16) | Adaptive-CAPTCHA | 3.30M | 193.61M | 79.7 |
| AE + PNA | Adaptive-CAPTCHA | 4.27M | 489.68M | 24.4 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).