Submitted: 09 February 2025
Posted: 10 February 2025
Abstract
Keywords:
1. Introduction
2. Literature Review
2.1. Low-Resource Languages
2.2. Machine Translation System
2.3. Hakka Corpus
3. Hybrid Machine Translation
3.1. Objectives of the Hybrid Machine Translation System
3.2. Preprocessing
3.3. System Development Process and Architecture
3.4. Neural Machine Translation (NMT)
3.5. Phrase-Based Machine Translation (PBMT)
3.6. Hybrid AI-Driven Translation System Development
4. System Evaluation
4.1. Performance of Phrase-Based Machine Translation
4.2. Neural Machine Translation with Transformers
4.3. Hybrid Artificial Intelligence Model
5. Conclusion
Author Contributions
Acknowledgments



Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).