Submitted: 30 March 2025
Posted: 31 March 2025
Abstract
Keywords:
1. Introduction
1.1. Background
1.2. Research Questions
2. RQ1: Opportunities: Applications of Text-to-Image in Education
2.1. Creative Literacy and Visual Storytelling
2.2. Curriculum Development and Instructional Materials
2.3. STEM and Medical Applications
2.4. History and Social Sciences
2.5. Visual Arts and Design Education
2.6. Cross-Linguistic and Multilingual Education
2.7. Special Education and Accessibility
2.8. Informal Learning Environments (Afterschool or Home Education)
2.9. AI Literacy and Critical Thinking
3. RQ2: Challenges: Technical, Ethical, and Pedagogical Limitations
3.1. Technical Challenges and Limitations
3.1.1. Accuracy and Consistency
3.1.2. Prompt Engineering Skills
3.1.3. Resource Constraints
3.1.4. Safety and Content Filtering
3.1.5. Bias as a Technical Limitation
3.2. Ethical Concerns: Bias and Misinformation
3.2.1. Bias in Representation
3.2.2. Misinformation and Authenticity
3.2.3. Harmful or Sensitive Content
3.3. Data Privacy and Intellectual Property
3.3.1. Data Privacy
3.3.2. Intellectual Property (IP) Considerations
3.4. Pedagogical and Practical Challenges
3.4.1. Alignment with Learning Goals
3.4.2. Assessment and Academic Integrity
3.4.3. Classroom Management and Engagement
3.4.4. Teacher Professional Development
3.4.5. Curricular Relevance
4. RQ3: Student Perspectives on Text-to-Image AI
4.1. Heightened Engagement and Visual Discovery
S7: "I never thought I could make pictures with just words! When I typed about the Great Wall of China and saw it appear with all those bricks and mountains, it felt like magic. It made me want to learn about more places so I could make more cool pictures."
S21: "My favorite part was seeing how the pictures changed when I changed my words. Once I wrote about Tokyo at night with all its lights, and it looked amazing! Then I tried Tokyo in daytime, and it was completely different. I could explore the same city in different ways just by changing a few words."
S14: "Before, I only knew Paris had the Eiffel Tower because I saw it in movies. But when I used Copilot, I discovered so many other beautiful buildings and bridges I didn't know about. Now I want to visit Paris someday and see all those places for real."
4.2. Descriptive Language Development
S3: "At first, I just wrote 'Sydney Opera House' and the picture was okay. But then I learned to write 'Sydney Opera House with white sail-shaped roofs glowing in the sunset with harbor waters reflecting orange light.' That picture was amazing! Now I use more describing words in all my writing."
S26: "I had to look up words like 'architecture' and 'metropolitan' to make better pictures. My teacher was surprised when I used 'ornate' to describe buildings in Barcelona. I didn't know these words before, but now I use them all the time because they make my pictures look exactly how I want."
S18: "When the first picture of Rome didn't show the Colosseum how I imagined it, I had to think about what words would make it look ancient and huge. I tried 'massive stone arches' and 'weathered by thousands of years' and it worked! Now I know how to make my stories sound better too."
4.3. Cultural Awareness and Global Perspective
S11: "I chose Kyoto because my favorite cartoon character is from Japan. But I learned so much more! I didn't know about the bamboo forests and old temples there. The pictures showed how Japan keeps its traditional buildings even though it's modern too. That's different from my city where everything looks new."
S29: "When I compared pictures of Cairo and New York, I noticed how buildings look so different because of their history and weather. Cairo has these cool geometric patterns and flat roofs, while New York has super tall glass buildings. It made me think about why cities look the way they do."
S5: "I generated pictures of Venice with all its canals and Rio de Janeiro with mountains right next to the beach. It made me realize how much the land changes how people build cities. Now when I see a new place, I think about how it got shaped by where it is."
4.4. Technological Literacy and Critical Evaluation
S8: "Sometimes Copilot made mistakes, like putting the wrong buildings in a city or making things look weird. Our teacher showed us real photos to compare, and we discussed why AI gets confused. Now I know I should always check if computer-made images are correct."
S17: "I noticed that when I asked for pictures of Dubai, it always showed the really fancy buildings and beaches, not the normal parts where regular people live. It made me think about how pictures can make us believe things that aren't completely true about a place."
S24: "I learned that you have to be really specific or the AI might get confused. Once I asked for a picture of 'Paris' and it showed me Paris, France, but when my friend asked for 'Paris' it showed a different view. We figured out you need to say exactly what parts you want to see, or it just guesses."
4.5. Challenges and Frustrations
S12: "Sometimes it was really annoying when the computer didn't understand what I wanted. I tried to get a picture of Machu Picchu with the right mountains, but it kept looking wrong no matter how I described it. It made me feel like giving up after trying five times."
S19: "The hardest part was when me and my friend used exactly the same words but got different pictures. It didn't seem fair. I spent a long time thinking of good words for my prompt, but his picture still looked better than mine even though he didn't try as hard."
S4: "When we learned about bias, it was confusing. Our teacher showed how asking for 'a typical city street' mostly showed American or European places, not African or Asian ones. It made me wonder if the computer thinks some places are more important than others."
4.6. Collaborative Learning and Peer Inspiration
S2: "When my peer showed his amazing picture of Istanbul with all the mosques and water, I asked him how he did it. He taught me to use words like 'panoramic view' and 'ornate domes' that I didn't know before. We started a competition to see who could make the best city pictures."
S16: "Our group made a collection of night scenes from different cities around the world. Someone would find a cool way to describe lights or reflections, and then we'd all try using that in our own prompts for different cities. It was fun seeing how the same describing words worked differently for Tokyo versus London."
S23: "My partner was really good at thinking of detailed prompts, and I was good at spotting when things looked wrong in the pictures. We made a good team because we could help each other make better city images than we could by ourselves."
4.7. Creativity and Imaginative Exploration
S10: "My favorite part was creating 'what if' cities. I made pictures of 'What if Rio de Janeiro had snowy mountains instead of sunny beaches?' and 'What if New York was built in the desert?' It was fun to imagine how cities would look different if they were in different places."
S27: "I learned how cities change over time, so I tried making pictures of 'Tokyo 100 years ago' and 'Tokyo 100 years in the future.' It was amazing to see how the same place could look so different. The old Tokyo had wooden buildings and the future Tokyo had flying cars and super tall towers."
S13: "After seeing all the normal city pictures, I wanted to try something different. I asked for 'Venice during Carnival with people wearing masks and colorful costumes' and it made an amazing picture! Then everyone started thinking of special events in different cities to make their pictures more interesting."
5. Conclusion
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).