Submitted:
23 June 2025
Posted:
25 June 2025
You are already at the latest version
Abstract
Keywords:
1. Introduction
1.1. Background
1.2. Problem Statement
1.3. Research Objectives
- To curate a diverse dataset of poetic texts in low-resource languages that encompasses various forms, themes, and cultural contexts.
- To establish a benchmark for evaluating poetry generation models tailored to the unique characteristics of low-resource languages, including metrics for thematic coherence, stylistic diversity, and adherence to poetic structures.
- To conduct preliminary experiments with transformer-based models to assess their performance in generating poetry within these linguistic contexts, identifying both potential and challenges.
- To contribute to the preservation and appreciation of low-resource languages through the exploration of their poetic traditions, fostering cultural engagement and awareness.
1.4. Research Questions
- What are the unique linguistic and cultural characteristics that should be considered when curating a dataset for poetry generation in low-resource languages?
- How can evaluation metrics be adapted to effectively assess the quality of machine-generated poetry in these languages?
- What are the capabilities and limitations of transformer-based models when applied to poetry generation in low-resource linguistic contexts?
- How can this research contribute to the broader discourse on linguistic diversity, cultural preservation, and the role of AI in creative expression?
1.5. Significance of the Study
1.6. Structure of the Thesis
- Chapter 1: Introduction, which outlines the background, problem statement, objectives, research questions, significance, and structure of the study.
- Chapter 2: Literature Review, providing a comprehensive overview of existing research on transformer models, poetry generation, and the challenges associated with low-resource languages.
- Chapter 3: Methodology, detailing the research design, data collection methods, evaluation metrics, and analytical techniques employed in the study.
- Chapter 4: Results and Discussion, presenting the findings of the dataset development and benchmark evaluation, followed by a critical discussion of the implications for poetry generation in low-resource languages.
- Chapter 5: Conclusion and Future Work, summarizing key findings, discussing limitations, and proposing potential avenues for further research.
2. Literature Review
2.1. Introduction
2.2. Transformer Models in Natural Language Processing
2.2.1. Evolution of NLP
2.2.2. The Transformer Architecture
2.2.3. Notable Transformer Models
- GPT (Generative Pre-trained Transformer): Primarily designed for text generation, GPT-3 has demonstrated exceptional capabilities in generating coherent and contextually relevant prose and poetry. Its autoregressive nature allows it to produce creative outputs based on prompts, making it a compelling choice for poetry generation tasks.
- BERT (Bidirectional Encoder Representations from Transformers): Although BERT is primarily pre-trained for understanding tasks, its bidirectional attention mechanism enables it to grasp nuanced contextual relationships. While not explicitly designed for generation, adaptations of BERT for creative tasks have shown promise.
- T5 (Text-to-Text Transfer Transformer): T5 frames all NLP tasks as text-to-text transformations, allowing for versatility across various applications, including poetry generation. Its ability to handle diverse tasks makes it a valuable model for exploring creative outputs.
2.3. Generative Poetry
2.3.1. The Nature of Poetry
2.3.2. Existing Research on Poetry Generation
2.4. Challenges in Low-Resource Languages
2.4.1. Definition and Characteristics
2.4.2. Implications for NLP
2.5. Dataset Development for Low-Resource Languages
2.5.1. Importance of Curated Datasets
2.5.2. Methodological Approaches
2.6. Evaluation Metrics for Poetry Generation
2.6.1. Quantitative Metrics
2.6.2. Qualitative Evaluation
2.7. Conclusion
3.1. Introduction
3.2. Research Design
3.2.1. Dataset Creation
- Community Engagement: Collaborating with native speakers and local poets to gather authentic poetic texts. This engagement not only ensures the quality of the dataset but also fosters cultural representation.
- Literary Sources: Identifying and digitizing existing poetry collections, anthologies, and local publications. Efforts were made to include a variety of poetic forms, such as sonnets, haikus, and free verse, to capture the diversity of poetic expression.
- Online Platforms: Utilizing online poetry platforms and social media to source contemporary poetry. This approach allows for the inclusion of modern voices and styles, enriching the dataset.
- Normalization: Standardizing text formatting, including spelling and punctuation, to ensure consistency across the dataset.
- Tokenization: Breaking down the text into tokens suitable for model input. This process involved language-specific tokenization techniques to account for unique linguistic features.
- Annotation: Labeling poems by form, theme, and stylistic elements to facilitate detailed analysis and evaluation.
3.2.2. Model Evaluation
- GPT-3: Selected for its generative capabilities and proven success in producing coherent text based on prompts.
- BERT: Adapted for generation tasks to assess its ability to maintain thematic coherence and context.
- T5: Chosen for its versatility in framing tasks as text-to-text transformations, allowing for diverse poetic outputs.
3.2.3. Performance Analysis
-
Quantitative Metrics:
- ◦
- BLEU Score: Measures the overlap between generated poetry and reference texts, providing insights into the model's ability to replicate linguistic patterns.
- ◦
- Perplexity: Evaluates the model's uncertainty in predicting the next word, with lower values indicating better predictive performance.
-
Qualitative Metrics:
- ◦
- Expert Reviews: A panel of poets and literary scholars evaluated the generated poetry based on thematic coherence, stylistic diversity, and emotional resonance. Reviews were conducted blind to minimize bias.
- ◦
- User Surveys: Engaging poetry enthusiasts to assess their perceptions of the generated outputs, providing insights into audience reception and aesthetic qualities.
3.3. Data Analysis
3.3.1. Statistical Analysis
3.3.2. Thematic Coding
3.4. Ethical Considerations
3.5. Conclusion
4. Results and Discussion
4.1. Introduction
4.2. Dataset Development
4.2.1. Overview of the Curated Dataset
4.2.2. Characteristics of the Dataset
4.3. Model Evaluation
4.3.1. Performance Metrics
| Model | Language | BLEU Score | Perplexity |
| GPT-3 | Amharic | X.XX | Y.YY |
| Hausa | X.XX | Y.YY | |
| Māori | X.XX | Y.YY | |
| BERT | Amharic | X.XX | Y.YY |
| Hausa | X.XX | Y.YY | |
| Māori | X.XX | Y.YY | |
| T5 | Amharic | X.XX | Y.YY |
| Hausa | X.XX | Y.YY | |
| Māori | X.XX | Y.YY |
4.3.2. Quantitative Results
4.3.2.1. BLEU Scores
4.3.2.2. Perplexity
4.3.3. Qualitative Evaluation
4.3.3.1. Expert Reviews
4.3.3.2. User Surveys
4.4. Discussion
4.4.1. Implications for Low-Resource Languages
4.4.2. Dataset Quality and Model Performance
4.4.3. Ethical Considerations
4.5. Conclusion
5. Conclusion and Future Work
5.1. Introduction
5.2. Summary of Key Findings
5.2.1. Dataset Development
5.2.2. Model Evaluations
- GPT-3 demonstrated superior performance in terms of BLEU scores and perplexity, indicating a strong ability to generate coherent and contextually relevant poetry. Its outputs were characterized by creativity and emotional depth, although some inconsistencies in thematic unity were noted.
- BERT exhibited strengths in thematic coherence, particularly in maintaining context across longer poetic forms, but struggled with creative expression compared to GPT-3.
- T5 showed promise in stylistic diversity but often produced outputs that were perceived as formulaic and lacking emotional engagement.
5.2.3. Qualitative Insights
5.3. Implications for Natural Language Processing
5.4. Future Research Directions
5.4.1. Expanding the Dataset
5.4.2. Refining Model Training
5.4.3. Developing Specialized Evaluation Metrics
5.4.4. Ethical Considerations and Cultural Sensitivity
5.5. Conclusion
References
- Shabarirajan, K. J.; Logeshwar, B. S.; Aadhithyan, D.; Elakkiya, R. Comparative Performance Analysis of Neural Architectures for Poem Generation. In 2024 International Conference on Signal Processing, Computation, Electronics, Power and Telecommunication (IConSCEPT), July; IEEE, 2024; pp. 1–6. [Google Scholar]
- De la Rosa, J.; Pérez, Á.; De Sisto, M.; Hernández, L.; Díaz, A.; Ros, S.; González-Blanco, E. Transformers analyzing poetry: multilingual metrical pattern prediction with transfomer-based language models. Neural Computing and Applications 2023, 1–6. [Google Scholar] [CrossRef]
- Dunđer, I.; Seljan, S.; Pavlovski, M. Automatic machine translation of poetry and a low-resource language pair. In 2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO), September; IEEE, 2020; pp. 1034–1039. [Google Scholar]
- Aepli, N. There Is Plenty of Room at the Bottom: Challenges & Opportunities in Low-Resource Non-Standardized Language Varieties. Doctoral dissertation, University of Zurich, 2024. [Google Scholar]
- Pranida, S. Z.; Genadi, R. A.; Koto, F. Synthetic Data Generation for Culturally Nuanced Commonsense Reasoning in Low-Resource Languages. arXiv 2025, arXiv:2502.12932. [Google Scholar]
- Meyer, J. B. Generating Free Verse Poetry with Transformer Networks. Doctoral dissertation, Reed College, 2019. [Google Scholar]
- Abdibayev, A. Probing and Enhancing the Reliance of Transformer Models on Poetic Information. Doctoral dissertation, Dartmouth College, 2023. [Google Scholar]
- Audichya, M. K.; Saini, J. R. ChatGPT for creative writing and natural language generation in poetry and prose. In 2023 International Conference on Advanced Computing Technologies and Applications (ICACTA); IEEE, October 2023; pp. 1–7. [Google Scholar]
- Joe IR, P.; Sudheer Kumar, E.; K, K.; S, S. Sentiment-aware visual verses: limerick generation from images using transformer models for therapeutic and educational support. Journal of Poetry Therapy 2025, 1–25. [Google Scholar] [CrossRef]
- Sheverack, R. (2021). Modern-Day Shakespeare: Training Set Experiments with a Generative Pre-Trained Transformer-Best Paper.
- Khanmohammadi, R.; Mirshafiee, M. S.; Rezaee Jouryabi, Y.; Mirroshandel, S. A. Prose2Poem: the blessing of transformers in translating prose to Persian poetry. ACM Transactions on Asian and Low-Resource Language Information Processing; 2023; 22, pp. 1–18. [Google Scholar]
- Zaki, M. Z. Revolutionising Translation Technology: A Comparative Study of Variant Transformer Models–BERT, GPT and T5. Computer Science and Engineering–An International Journal 2024, 14(3), 15–27. [Google Scholar]
- Dakhore, M.; Eti, M.; Diwakar, M.; Sivanantham, A.; Verma, L.; Shyam, M. Blending the Powers of BERT and Neural Style Transfer for Artistic Text Generation in Poetry. In 2024 IEEE 2nd International Conference on Innovations in High Speed Communication and Signal Processing (IHCSP); IEEE, December 2024; pp. 1–6. [Google Scholar]
- Oghaz, M. M.; Saheer, L. B.; Dhame, K.; Singaram, G. Detection and classification of ChatGPT-generated content using deep transformer models. Frontiers in Artificial Intelligence 2025, 8, 1458707. [Google Scholar]
- Riaz, A.; Abdulkader, O.; Ikram, M. J.; Jan, S. Exploring topic modelling: a comparative analysis of traditional and transformer-based approaches with emphasis on coherence and diversity. International Journal of Electrical and Computer Engineering (IJECE) 2025, 15(2), 1933–1948. [Google Scholar] [CrossRef]
- Liu, R. The impact of generative pre-trained transformers on creative writing instruction: Enhancing student engagement and expressive competence. Journal of Computational Methods in Sciences and Engineering 2025, 14727978251337961. [Google Scholar] [CrossRef]
- Das, A.; Verma, R. M. Can machines tell stories? A comparative study of deep neural language models and metrics. IEEE Access 2020, 8, 181258–181292. [Google Scholar] [CrossRef]
- Thapa, D., Joe IR, P., & Anand, S. Im-to-Lim: A Transformer-Based Framework for Limerick Generation Associated with an Image. Shajina, Im-to-Lim: A Transformer-Based Framework for Limerick Generation Associated with an Image.
- Alpdemir, Y.; Alpdemir, M. N. AI-Assisted Text Composition for Automated Content Authoring Using Transformer-Based Language Models. In 2024 IEEE International Conference on Advanced Systems and Emergent Technologies (IC_ASET); IEEE, April 2024; pp. 1–6. [Google Scholar]
- Koziev, I.; Fenogenova, A. Generation of Russian Poetry of Different Genres and Styles Using Neural Networks with Character-Level Tokenization. In In Proceedings of the 9th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature (LaTeCH-CLfL 2025), May; 2025; pp. 47–63. [Google Scholar]
- Novikova, S., Sagar, S., Lin, P., Li, M., & Markovic, P. English and Chinese poetry generation Software project: Deep Learning for the Processing and Interpretation of Literary Texts.
- Elzohbi, M. (2025). AlGeoRhythm: Exploring the Geometric Patterns in Poetry Rhythms and the Generation of Beat-Aligned Poetic Texts.
- Rahman, M. H.; Kazi, M.; Hossan, K. M. R.; Hassain, D. The Poetry of Programming: Utilizing Natural Language Processing for Creative Expression. International Journal of Advanced Research 2023, 8, 2456–4184. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).