Submitted: 30 August 2024
Posted: 03 September 2024
Abstract
Keywords:
1. Introduction
- We implemented a ResNet model to extract features for each character in the text image. A BiLSTM trained with CTC performs the sequence modeling. Finally, a language model (LM) is applied in the post-processing phase to refine the output predicted by the classification phase.
- We tested the model on two distinct datasets, KHATT and AHTID/MW. Evaluating on more than one dataset indicates how well the model generalizes across different styles of Arabic handwriting.
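
The first contribution pairs a BiLSTM with CTC, which emits one label distribution per time step and leaves it to a decoder to turn those frames into text. The sketch below illustrates the standard greedy (best-path) CTC decoding rule: take the top label per frame, merge consecutive repeats, then drop the blank. It is an illustration only; the function name, the blank index of 0, and the toy alphabet are assumptions, and the paper may decode differently (for example with a beam search combined with the language model).

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label


def ctc_greedy_decode(frame_scores, alphabet, blank=BLANK):
    """Best-path CTC decoding: argmax per frame, merge repeats, drop blanks."""
    # Pick the highest-scoring label index at every time step.
    best_path = [
        max(range(len(scores)), key=scores.__getitem__) for scores in frame_scores
    ]
    # Merge runs of identical labels into a single label.
    collapsed = [label for label, _ in groupby(best_path)]
    # Remove the blank label and map indices to characters.
    return "".join(alphabet[label] for label in collapsed if label != blank)


# Toy example with placeholder characters: index 0 is the blank.
alphabet = ["", "a", "b"]
frames = [
    [0.10, 0.80, 0.10],  # a
    [0.10, 0.70, 0.20],  # a again (merged with the previous frame)
    [0.90, 0.05, 0.05],  # blank
    [0.20, 0.10, 0.70],  # b
    [0.60, 0.20, 0.20],  # blank
    [0.10, 0.10, 0.80],  # b (kept, because a blank separates it from the last b)
]
decoded = ctc_greedy_decode(frames, alphabet)
print(decoded)
```

The blank between the two final frames is what lets CTC output a doubled character; without it the two `b` frames would be merged into one.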
2. Literature Review
3. Materials and Methods
3.1. Datasets Description
3.1.1. KHATT Database
3.1.2. AHTID/MW Database
3.2. Preprocessing
3.3. Feature Extraction
3.4. Classification
3.5. Language Model
4. Results and Discussions
4.1. System Settings and Parameters
4.2. Performance Evaluation
4.3. Experimental Results
5. Conclusions
5.1. Limitations and Future Works
Author Contributions
Funding
Conflicts of Interest
References
1. Eberhard, D.M.; Simons, G.F.; Fennig, C.D. Gujarati. Ethnologue: Languages of the World, 22nd ed.; SIL International: Dallas, 2019.
2. Nashif, M.H.H.; Miah, M.B.A.; Habib, A.; Moulik, A.C.; Islam, M.S.; Zakareya, M.; Ullah, A.; Rahman, M.A.; Al Hasan, M. Handwritten numeric and alphabetic character recognition and signature verification using neural network. Journal of Information Security 2018, 9, 209.
3. El-Dabi, S.S.; Ramsis, R.; Kamel, A. Arabic character recognition system: a statistical approach for recognizing cursive typewritten text. Pattern Recognition 1990, 23, 485–495.
4. Anis, M.; Maalej, R.; Elleuch, M. Recent advances of ML and DL approaches for Arabic handwriting recognition: A review. International Journal of Hybrid Intelligent Systems 2023, 19, 1–18.
5. AlKhateeb, J.H.; Jiang, J.; Ren, J.; Ipson, S. Component-based segmentation of words from handwritten Arabic text. International Journal of Computer Systems Science and Engineering 2009, 5.
6. Nashwan, F.; Rashwan, M.A.; Al-Barhamtoshy, H.M.; Abdou, S.M.; Moussa, A.M. A holistic technique for an Arabic OCR system. Journal of Imaging 2018, 4, 6.
7. Boufenar, C.; Kerboua, A.; Batouche, M. Investigation on deep learning for off-line handwritten Arabic character recognition. Cognitive Systems Research 2018, 50, 180–195.
8. Alrobah, N.; Albahli, S. Arabic Handwritten Recognition Using Deep Learning: A Survey. Arabian Journal for Science and Engineering 2022, 47, 9943–9963.
9. Berriche, L.; Alqahtani, A.; RekikR, S. Hybrid Arabic handwritten character segmentation using CNN and graph theory algorithm. Journal of King Saud University - Computer and Information Sciences 2024, 36, 101872.
10. Mosbah, L.; Moalla, I.; Hamdani, T.M.; Neji, B.; Beyrouthy, T.; Alimi, A.M. ADOCRNet: A Deep Learning OCR for Arabic Documents Recognition. IEEE Access 2024, 1–1.
11. Mahdi, M.G.; Sleem, A.; Elhenawy, I. Deep Learning Algorithms for Arabic Optical Character Recognition: A Survey. Multicriteria Algorithms with Applications 2024, 2, 65–79.
12. Mahmoud, S.A.; Ahmad, I.; Alshayeb, M.; Al-Khatib, W.G.; Parvez, M.T.; Fink, G.A.; Märgner, V.; El Abed, H. KHATT: Arabic offline handwritten text database. In Proceedings of the 2012 International Conference on Frontiers in Handwriting Recognition; 2012; pp. 449–454.
13. Mezghani, A.; Kanoun, S.; Khemakhem, M.; El Abed, H. A database for Arabic handwritten text image recognition and writer identification. In Proceedings of the 2012 International Conference on Frontiers in Handwriting Recognition; 2012; pp. 399–402.
14. Mamouni El, M. An Effective Combination of Convolutional Neural Network and Support Vector Machine Classifier for Arabic Handwritten Recognition. Automatic Control and Computer Sciences 2023, 57, 267–275.
15. Alheraki, M.; Al-Matham, R.; Al-Khalifa, H. Handwritten Arabic Character Recognition for Children Writing Using Convolutional Neural Network and Stroke Identification. Human-Centric Intelligent Systems 2023, 3, 147–159.
16. Elleuch, M.; Maalej, R.; Kherallah, M. A new design based-SVM of the CNN classifier architecture with dropout for offline Arabic handwritten recognition. Procedia Computer Science 2016, 80, 1712–1723.
17. Jemni, S.K.; Kessentini, Y.; Kanoun, S.; Ogier, J.-M. Offline Arabic handwriting recognition using BLSTMs combination. In Proceedings of the 2018 13th IAPR International Workshop on Document Analysis Systems (DAS); 2018; pp. 31–36.
18. BenZeghiba, M.F.; Louradour, J.; Kermorvant, C. Hybrid word/Part-of-Arabic-Word Language Models for Arabic text document recognition. In Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR); 2015; pp. 671–675.
19. Forney, G.D. The Viterbi algorithm. Proceedings of the IEEE 1973, 61, 268–278.
20. Stahlberg, F.; Vogel, S. The QCRI recognition system for handwritten Arabic. In Proceedings of the International Conference on Image Analysis and Processing; 2015; pp. 276–286.
21. Povey, D.; Zhang, X.; Khudanpur, S. Parallel training of deep neural networks with natural gradient and parameter averaging. arXiv preprint 2014, arXiv:1410.7455.
22. Wigington, C.; Stewart, S.; Davis, B.L.; Barrett, W.A.; Price, B.L.; Cohen, S.D. Data Augmentation for Recognition of Handwritten Words and Lines Using a CNN-LSTM Network. 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) 2017, 01, 639–645.
23. Altwaijry, N.; Al-Turaiki, I. Arabic handwriting recognition system using convolutional neural network. Neural Computing and Applications 2021, 33, 2249–2261.
24. El Khayati, M.; Kich, I.; Taouil, Y. CNN-based Methods for Offline Arabic Handwriting Recognition: A Review. Neural Processing Letters 2024, 56, 115.
25. AlShehri, H. DeepAHR: a deep neural network approach for recognizing Arabic handwritten recognition. Neural Computing and Applications 2024, 36, 12103–12115.
26. Alghyaline, S. Optimised CNN Architectures for Handwritten Arabic Character Recognition. Computers, Materials and Continua 2024, 79, 4905–4924.
27. Momeni, S.; BabaAli, B. A transformer-based approach for Arabic offline handwritten text recognition. Signal, Image and Video Processing 2024, 18, 3053–3062.
28. Mahmoud, S.A.; Ahmad, I.; Al-Khatib, W.G.; Alshayeb, M.; Parvez, M.T.; Märgner, V.; Fink, G.A. KHATT: An open Arabic offline handwritten text database. Pattern Recognition 2014, 47, 1096–1112.
29. Ahmad, R.; Naz, S.; Afzal, M.Z.; Rashid, S.F.; Liwicki, M.; Dengel, A. KHATT: A deep learning benchmark on Arabic script. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR); 2017; pp. 10–14.
30. Kaur, S. Noise types and various removal techniques. International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE) 2015, 4, 226–230.
31. Soille, P. Morphological image analysis: principles and applications; Springer Science & Business Media: 2013.
32. Stahlberg, F.; Vogel, S. Detecting dense foreground stripes in Arabic handwriting for accurate baseline positioning. In Proceedings of the 2015 13th International Conference on Document Analysis and Recognition (ICDAR); 2015; pp. 361–365.
33. Tavoli, R.; Keyvanpour, M.; Mozaffari, S. Statistical geometric components of straight lines (SGCSL) feature extraction method for offline Arabic/Persian handwritten words recognition. IET Image Processing 2018, 12, 1606–1616.
34. Mohamad, R.A.-H.; Likforman-Sulem, L.; Mokbel, C. Combining slanted-frame classifiers for improved HMM-based Arabic handwriting recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2008, 31, 1165–1177.
35. Akram, H.; Khalid, S. Using features of local densities, statistics and HMM toolkit (HTK) for offline Arabic handwriting text recognition. Journal of Electrical Systems and Information Technology 2017, 4, 387–396.
36. Jayech, K.; Mahjoub, M.A.; Amara, N.E.B. Arabic handwriting recognition based on synchronous multi-stream HMM without explicit segmentation. In Proceedings of the International Conference on Hybrid Artificial Intelligence Systems; 2015; pp. 136–145.
37. Benouareth, A.; Ennaji, A.; Sellami, M. Semi-continuous HMMs with explicit state duration for unconstrained Arabic word modeling and recognition. Pattern Recognition Letters 2008, 29, 1742–1752.
38. Almodfer, R.; Xiong, S.; Mudhsh, M.; Duan, P. Multi-column deep neural network for offline Arabic handwriting recognition. In Proceedings of the International Conference on Artificial Neural Networks; 2017; pp. 260–267.
39. Zhao, Z.-Q.; Zheng, P.; Xu, S.-t.; Wu, X. Object detection with deep learning: A review. IEEE Transactions on Neural Networks and Learning Systems 2019, 30, 3212–3232.
40. Zeiler, M.D.; Fergus, R. Visualizing and understanding convolutional networks. In Proceedings of the European Conference on Computer Vision; 2014; pp. 818–833.
41. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2016; pp. 770–778.
42. Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 2012, 25, 1097–1105.
43. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv preprint 2014, arXiv:1409.1556.
44. Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; 2015; pp. 1–9.
45. Bengio, Y.; Simard, P.; Frasconi, P. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks 1994, 5, 157–166.
46. Glorot, X.; Bengio, Y. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics; 2010; pp. 249–256.
47. Cheng, Z.; Bai, F.; Xu, Y.; Zheng, G.; Pu, S.; Zhou, S. Focusing attention: Towards accurate text recognition in natural images. In Proceedings of the IEEE International Conference on Computer Vision; 2017; pp. 5076–5084.
48. Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks 2005, 18, 602–610.
49. Graves, A.; Jaitly, N.; Mohamed, A.-r. Hybrid speech recognition with deep bidirectional LSTM. In Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding; 2013; pp. 273–278.
50. Wang, A.; Singh, A.; Michael, J.; Hill, F.; Levy, O.; Bowman, S.R. GLUE: A multi-task benchmark and analysis platform for natural language understanding. arXiv preprint 2018, arXiv:1804.07461.
51. McCann, B.; Bradbury, J.; Xiong, C.; Socher, R. Learned in translation: Contextualized word vectors. arXiv preprint 2017, arXiv:1708.00107.
52. Chen, T.; Xu, R.; He, Y.; Wang, X. Improving sentiment analysis via sentence type classification using BiLSTM-CRF and CNN. Expert Systems with Applications 2017, 72, 221–230.
53. Graves, A.; Mohamed, A.-r.; Hinton, G. Speech recognition with deep recurrent neural networks. In Proceedings of the 2013 IEEE International Conference on Acoustics, Speech and Signal Processing; 2013; pp. 6645–6649.
54. Graves, A.; Fernández, S.; Gomez, F.; Schmidhuber, J. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In Proceedings of the 23rd International Conference on Machine Learning; 2006; pp. 369–376.
55. Shi, B.; Bai, X.; Yao, C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 2016, 39, 2298–2304.
56. Heafield, K. KenLM: Faster and smaller language model queries. In Proceedings of the Sixth Workshop on Statistical Machine Translation; 2011; pp. 187–197.
57. Zeghiba, M.F.B. Arabic word decomposition techniques for offline Arabic text transcription. In Proceedings of the 2017 1st International Workshop on Arabic Script Analysis and Recognition (ASAR); 2017; pp. 31–35.
58. Jemni, S.K.; Kessentini, Y.; Kanoun, S. Out of vocabulary word detection and recovery in Arabic handwritten text recognition. Pattern Recognition 2019, 93, 507–520.

| Layers | Configurations | Output |
|---|---|---|
| Conv1 | 3 × 3, 1 × 1, 1 × 1, 16 | 64 × 1048 |
| | 3 × 3, 1 × 1, 1 × 1, 32 | |
| Conv2 | Pool 1: 2 × 2, 2 × 2, 0 × 0 | 32 × 524 |
| | 3 × 3, 1 × 1, 1 × 1, 64 | |
| Conv3 | Pool 2: 2 × 2, 2 × 2, 0 × 0 | 16 × 262 |
| | 3 × 3, 1 × 1, 1 × 1, 128 | |
| Conv4 | Pool 3: 2 × 2, 1 × 2, 1 × 0 | 8 × 263 |
| | 3 × 3, 1 × 1, 1 × 1, 256 | |
| Conv5 | 2 × 2, 1 × 2, 1 × 0, 256 | 3 × 263 |
| | 2 × 2, 1 × 1, 0 × 0, 256 | |
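
The output sizes in the layer table above can be checked with the usual convolution/pooling size formula, `out = floor((in + 2·pad − kernel) / stride) + 1`. The sketch below rests on several assumptions that the table does not state: each configuration entry is read as kernel, stride, padding, filters; stride and padding are read in width × height order (the only reading that reproduces the listed outputs); the input is a 64 × 1048 (height × width) line image; and the layer names are hypothetical labels for the table rows.

```python
def out_size(size, kernel, stride, pad):
    """Output length along one axis of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, square kernel, (stride_w, stride_h), (pad_w, pad_h)).
# Width-first order and the layer names are assumptions, not from the paper.
LAYERS = [
    ("conv1a", 3, (1, 1), (1, 1)),
    ("conv1b", 3, (1, 1), (1, 1)),
    ("pool1", 2, (2, 2), (0, 0)),
    ("conv2", 3, (1, 1), (1, 1)),
    ("pool2", 2, (2, 2), (0, 0)),
    ("conv3", 3, (1, 1), (1, 1)),
    ("pool3", 2, (1, 2), (1, 0)),
    ("conv4", 3, (1, 1), (1, 1)),
    ("conv5a", 2, (1, 2), (1, 0)),
    ("conv5b", 2, (1, 1), (0, 0)),
]


def trace_shapes(height, width):
    """Return the (height, width) of the feature map after every layer."""
    shapes = {}
    for name, kernel, (stride_w, stride_h), (pad_w, pad_h) in LAYERS:
        height = out_size(height, kernel, stride_h, pad_h)
        width = out_size(width, kernel, stride_w, pad_w)
        shapes[name] = (height, width)
    return shapes


shapes = trace_shapes(64, 1048)  # assumed input: 64 px high, 1048 px wide
for name, (h, w) in shapes.items():
    print(f"{name}: {h} x {w}")
```

Under these assumptions the trace ends at 3 × 263, matching the last row of the table. The stride of 1 along the width in the later stages keeps the horizontal resolution at 263 frames, which is the sequence length the recurrent layers would receive.
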

Results on the KHATT dataset:

| Model | CER (%) | WER (%) |
|---|---|---|
| 2-BiLSTM Layers | 15.8 | 31.6 |
| 3-BiLSTM Layers | 13.2 | 27.31 |

Results on the AHTID/MW dataset:

| Model | CER (%) | WER (%) |
|---|---|---|
| 2-BiLSTM Layers | 7.4 | 22.79 |
| 3-BiLSTM Layers | 6.6 | 17.42 |
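
The tables report character error rate (CER) and word error rate (WER). The definition used in the paper is not given here, so the following is a minimal sketch of the conventional one: total edit distance between reference and hypothesis, divided by total reference length, taken over characters for CER and over whitespace-separated words for WER. The function names and the example strings are illustrative assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, computed row by row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,  # deletion
                    curr[j - 1] + 1,  # insertion
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def error_rate(refs, hyps, tokenize):
    """Corpus-level error rate in percent for a given tokenizer."""
    errors = sum(edit_distance(tokenize(r), tokenize(h)) for r, h in zip(refs, hyps))
    total = sum(len(tokenize(r)) for r in refs)
    return 100.0 * errors / total


def cer(refs, hyps):
    return error_rate(refs, hyps, list)  # tokens are characters


def wer(refs, hyps):
    return error_rate(refs, hyps, str.split)  # tokens are whitespace-separated words


# Made-up example lines, not taken from either dataset.
refs = ["the cat sat", "on the mat"]
hyps = ["the cat sit", "on mat"]
print(f"CER = {cer(refs, hyps):.2f}%  WER = {wer(refs, hyps):.2f}%")
```

Because one wrong character makes the whole word wrong, WER is normally well above CER, which is the pattern seen in the reported results.
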

| Reference | Year | Database | CER | WER |
|---|---|---|---|---|
| BenZeghiba et al. [18] | 2015 | KHATT Dataset | - | 31.3% |
| Stahlberg & Vogel [20] | 2015 | KHATT Dataset | - | 30.5% |
| Zeghiba [57] | 2017 | KHATT Dataset | - | 34.3% |
| Jemni et al. [17] | 2018 | KHATT Dataset | 16.27% | 29.13% |
| Jemni et al. [58] | 2019 | AHTID/MW Dataset | - | 18.13% |
| Momeni & BabaAli [27] | 2024 | KHATT Dataset | 18.45% | - |
| Proposed Model | 2024 | KHATT Dataset | 13.2% | 27.31% |
| Proposed Model | 2024 | AHTID/MW Dataset | 6.6% | 17.42% |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).