Preprint
Article

This version is not peer-reviewed.

AI-Generated Abstract Expressionism Inspiring Creativity Through Ismail A Mageed's Internal Monologues in Poetic Form

Submitted: 04 January 2025
Posted: 06 January 2025


Abstract

Artificial Intelligence (AI) has revolutionized the creative process, enabling novel modes of artistic expression. This paper focuses on the intersection of Abstract Expressionism and AI-generated imagery, exploring how poetic prompts inspire unique visual interpretations. By utilizing Leonardo AI at medium contrast with the cinematic Kino model/preset, the research demonstrates how simple poetic phrases can yield profound visual artworks. The study evaluates the quality, creativity, and emotional resonance of AI-generated art, offering insights into the synergy between human creativity and machine intelligence within an Abstract Expressionism framework. Leonardo AI is applied to Ismail A Mageed’s internal monologues in poetic form. The paper closes with potential open problems, concluding remarks, and future research pathways.

Keywords: 

1. Introduction

AI [1-24] technologies have transcended traditional computational tasks, making significant strides in creative fields such as art, music, and literature. Tools like Leonardo AI [1-24] have democratized artistic creation, allowing users without technical expertise to participate in generating visually compelling artworks.
This paper explores the potential of AI [1-24] to translate poetic expressions into Abstract Expressionist art styles, emphasizing the cinematic Kino preset. By providing poetic prompts, we examine how effectively AI captures the abstract, emotive qualities of poetry and transforms them into visual art. The aim is to demonstrate how this interaction can inspire creativity, both for artists and non-artists, by bridging the gap between textual and visual mediums.
The advent of AI [1-24] in creative domains has introduced new opportunities for artistic experimentation. Tools like DALL-E, Midjourney, and Leonardo AI have gained prominence for their ability to produce high-quality, imaginative visual content. However [1-24], their potential to emulate specific art styles, such as Abstract Expressionism, remains underexplored.
Abstract Expressionism [1-24], characterized by its emphasis on emotion and abstraction, presents a unique challenge for AI. Poetry, often rich in metaphor and ambiguity, serves as an ideal medium to test AI’s ability to interpret and visualize abstract concepts. This study aims to bridge the gap between poetic abstraction and visual representation through AI.
In three-dimensional (3D) modeling [1], texturing is important because it adds detail and realism to the models, making them look more lifelike. Instead of using traditional methods, many people are now turning to AI tools like Leonardo AI and Meshy, which can create textures more efficiently and accurately. This paper compares these two AI tools by examining their performance, differences, and potential uses, aiming to show how they can improve the process of 3D texturing and contribute to advancements in the field.
The phrase "An Old Wooden Stool" refers to a specific prompt used to generate textures in the AI texture generation tools Leonardo and Meshy. When this prompt is applied[1], the textures produced by both tools are compared, particularly focusing on their realism and artistic quality. In this case, Leonardo's output is noted for its better representation of wood grain and colour accuracy, while Meshy's textures tend to show unwanted blue-green shades, affecting the overall tactile quality, as depicted by Figure 1 (c.f., [1]).
In a realistic style, textures created by Meshy tend to have a lower quality, often showing unwanted blue-green shades [1], which detracts from their realism. In contrast, Leonardo AI produces textures that are more realistic [1], featuring consistent wood grain patterns and natural colours. However [1], when it comes to animated styles, both tools perform well, but Meshy better meets the specific artistic needs typical of animation (Figure 2 (c.f., [1])).
Another study [2] has explored how AI influences urban design and planning, building on a previous study that examined AI's vision of future cities using a specific algorithm. It compares three advanced AI tools—Leonardo AI [2], Midjourney, and Dall-E—by using prompts related to important urban themes like sustainability and smart cities. The findings highlight each tool's strengths and weaknesses, concluding that while AI can enhance urban design, it works best when combined with traditional planning methods.
Leonardo AI [2] creates images of a future city that prominently feature water throughout the urban landscape. In these images (see Figure 3 (c.f., [2])), green spaces are mostly found at ground level, and bridges are included to facilitate movement within the city. The images show a crowded skyline filled with tall buildings, and it's easy to see the difference between those that look realistic and those that are purely computer-generated.
In the images created by Midjourney [2], water is a key feature that stands out and captures attention, much like in the images from Leonardo AI. These images (Figure 4 (c.f., [2])) also show signs of water-based transportation, suggesting a futuristic urban design that incorporates waterways. Additionally [2], the green spaces, such as parks or gardens, are mostly located at ground level, which aligns with previous findings about urban planning in these visualizations.
The images generated by Dall-E showcase a futuristic city[2] with advanced features like flying cars, drones, and holographic elements, which align with the idea of a smart city. These images, as depicted in Figure 5 (c.f., [2]), also include symbols representing smart technologies, such as sensors and data centers, indicating their connection to modern urban planning. The evaluation of these images[2], especially when incorporating sustainability, assesses how well each AI tool represents eco-friendly principles and urban design, with Leonardo AI showing more green spaces in its latest visuals.
DALL-E stands out among AI tools for generating images because it can create a wide range of visuals[2], including complex transportation systems and futuristic buildings, showcasing its versatility. It also effectively incorporates sustainability principles in its designs[2], although the quality may vary from image to image. Overall, DALL-E is considered a strong option among the AI platforms analyzed due to its superior performance in generating detailed and innovative urban scenarios.
On a different note [3], this study has evaluated how Leonardo AI can enhance teaching materials for Islamic Religious Education students at UIN Datokarama Palu. By surveying 50 students who used Leonardo AI [3], the researchers found that 85% felt the AI made learning materials more engaging, and 80% reported a better understanding of the content. However [3], the study also highlighted some technical challenges and the need for more training to effectively use the AI tool, suggesting that while AI can greatly improve education, additional support is necessary to address these issues.
The study[3] suggested several ways to enhance the use of Leonardo AI for creating teaching materials. First, institutions should provide more training and support for both students and teachers to help them effectively use the AI tools. Additionally, improving the technology infrastructure, like upgrading hardware and ensuring reliable internet, is essential for the software to work well. Finally, ongoing evaluations of how Leonardo AI is used can help identify improvements and ensure it continues to benefit the learning experience.
Most importantly [18], the study examines how multimodal AI, like image generators such as DALL-E, changes our understanding of ekphrasis, which is the description of visual art through text. It argued that in the digital world [18], the relationship between text and images should be seen as interactive and dynamic rather than just a simple representation. The study [18] has also showcased that modern AI blurs the lines between text and images by treating them as similar types of information, allowing for a more integrated approach to creating and understanding art and language.
In its discussion of visual poetry [18], a form of art that combines text and images to create meaning, the study specifically references a work by German poet Franz Mon from his 1964 collection "non tot," where the arrangement of typewritten lines forms a diamond or sail shape. The upper part of the piece repeats the word "non," while the lower part repeats "tot," with the lines becoming more compressed towards the center [18], creating a visual effect that enhances the poetic message, as in Figure 6 (c.f., [18]).
Figure 7 (c.f., [18]) presents a visual poem created by Jasmin Meerhoff, a contemporary digital author from Germany. This poem is part of her 2022 collection titled "They Lay," [18] which likely explores themes and ideas through a combination of text and visual elements. Visual poetry blends artistic design with written language, enhancing the reader's experience by engaging both visual and literary senses.
Building on [18], the study identifies how the relationship between text and image has evolved in the digital age, particularly with the rise of AI and machine learning (ML), by introducing the concept of "operative ekphrasis," where texts do not just describe images but actively create them through computational processes. More pointedly [18], it argues that there are two ways to understand this interaction, one focusing on practical use (pragmatics) and the other on meaning (semantics), and suggests that AI can encode a form of meaning, even if it is not as rich as human understanding.
The schematic flow of the current study reads as follows:
[Image: Preprints 145172 i001: schematic flow of the current study]

2. Methodology

Resuming the current exposition, the methodology is introduced below.

2.1. Research Design

The research employs a qualitative approach to explore the interplay between poetry and AI-generated art within the Abstract Expressionism style [25-38]. Poetic prompts were provided to Leonardo AI, and the resulting images were analysed based on their fidelity to the text, emotional resonance, and alignment with Abstract Expressionist principles. In their study [25], researchers conducted two experiments with a total of 693 participants to see if people view paintings created by AI robots as art and whether they consider robots to be artists. They looked at three factors: whether the creator was a robot or a human [25], whether the painting was made intentionally or accidentally, and whether the painting was abstract or representational. The results showed that while people generally accepted both robot and human paintings as art, they were less likely to see robots as artists because they found it harder to believe that robots had artistic intentions.
Many scholars[25] believed that the definition of art relies heavily on the creator's intentions and mental states. For example[25], Jerrold Levinson's definition states that an object is considered art if it was intentionally created to be regarded as such. This perspective raises questions about whether robot-created objects can be considered art, since robots lack intentions and emotions, which are often seen as essential for artistic creation.
The study[25] found that people are more likely to consider a painting created by a human as art compared to one made by a robot, although the difference is small. Participants also viewed humans as artists much more than robots[25], indicating a significant bias towards human creativity. Additionally, people were more willing to attribute mental states like intention and desire to humans than to robots, which suggests that perceived mental qualities influence how we judge artistic creation.
The mean ratings (Figure 8 (c.f., [25])) refer to the average scores given by participants when evaluating different aspects of art, such as the art itself, the artist's intention, and the audience's desire and belief about the artwork. These ratings were compared between two types of agents: artificial intelligence (AI) and human artists[25], as well as between two types of behaviours: intentional (where the artist meant to create something specific) and accidental (where the creation was not planned). This analysis helps researchers understand how people perceive art differently depending on whether it was created by AI or humans and the nature of the artistic behavior.
AI refers to the ability of machines to exhibit intelligence like humans[26], allowing them to perceive their environment and take actions to achieve specific goals. As technology has advanced, AI has become integrated into everyday life and various fields, including art, leading to debates about its impact on creativity and human expression.
The authors [26] have explored how changing ideas about art are connected to rapid advancements in technology, especially with the rise of generative algorithms that create art. The authors [26] aimed to explore the human role in generative art, particularly in music, by examining how these algorithms work and their artistic applications. They presented a specific project called Anastatica, which combines performance and installation using data-driven live coding [26], and analyzed the broader effects of artificial intelligence on artistic expression, discussing how generative and artificial intelligence techniques have been used in art for decades even though their widespread application is still relatively new. In the late 2010s, advancements in technology and algorithms made it easier for artists to use AI in creating visual art and music.
The authors [26] highlighted various artistic practices that showcase how AI can enhance creativity, such as using generative adversarial networks (GANs) in visual arts and different approaches in music, illustrating the evolving relationship between humans and machines in artistic expression. Figure 9 (c.f., [26]) refers to a 2020 performance of "Anastatica," which is an interactive music installation that allows the audience to influence the performance using their smartphones. This performance combines innovative technology and artificial intelligence [26], enabling audience members to interact with the music in real time, either supporting or challenging the musician and the algorithm. The outcome of the performance is unpredictable and changes based on the collective participation and mood of the audience, highlighting the dynamic relationship between humans and machines in music creation.
AI is significantly changing how we research and create visual art; [27] reviewed two main uses of AI in art: first, using AI to analyze and understand existing artworks, and second, using AI to generate new pieces of art. The authors [27] discussed various tasks that AI can perform, such as classifying artworks and detecting objects within them, while also exploring the practical and theoretical implications of AI in the creative process.
Figure 10 (c.f., [27]) shows how the process of digitizing art leads to quantitative analysis, knowledge discovery, and visualization through computational methods. After artworks are digitized, researchers can analyze them using advanced techniques to uncover new insights and patterns. The findings from this analysis can then improve online art collections, making it easier for users to explore and understand the art.
Figure 11 (c.f., [27]) highlighted key technological advancements that have shaped the field of AI Art. One notable method is DeepDream, created by Mordvintsev and others in 2015, which was originally intended to help researchers understand how deep convolutional neural networks (CNNs) work by visualizing the patterns that activate the network's neurons. However, DeepDream gained popularity for its ability to create surreal and psychedelic images, leading to its use as a new form of digital art.

2.2. Internal Monologues in Poetic Form Source

For this current exposition, the internal monologues in poetic form (IMPFs) by Ismail A Mageed were selected as the core poetic text. These IMPFs, rich in vivid nouns and metaphors, serve as an excellent basis for generating AI-driven artistic interpretations.
Here are the selected IMPFs, shedding some sunshine on our thoughts prior to utilizing the AI-driven artistic interpretations.
  • IMPF 1
  • Between Life and No-Life
  • By the pen of Ismail A Mageed
  • Our hearts are like metals… some are precious, and some are not! Some are softer than dewdrops on the cheeks of roses...
  • And some are harder than rocks… yet some rocks soften out of mercy and crack to release the gentleness and magnificence of beauty.
  • Our hearts are within our bodies… we use them to interact with people!
  • But have you ever heard of a heart that contains a person?
  • Do not be surprised, my friend!
  • Each of us needs a heart that contains a person to truly appreciate the beauty around us.
  • To spread mercy to all beings.
  • Ah, heart, I am perplexed about how to describe you!
  • Are you just a small piece between the ribs, beating to tell others we are alive?
  • Why all this silence?
  • Quiet at night, I see you.
  • Speak, O heart! So that the darkness of the night may vanish with the strength of your faith in God,
  • The Most Knowledgeable, the Highest...
  • The complete existence.
  • O heart! With your faith in God, be the light that removes the darkness of the night.
  • Do not fear the darkness of the night; it is merely passing.
  • And if there are stars in your sky, contemplate them.
  • And if your sky is cloudy,
  • It is a river where drops of waterfall...
  • Contemplate that too, without sadness or regret!
  • In the deep stillness that flows through your luminous soul with the taste of paradise,
  • There is light that quenches the thirst of your heart with the fragrance of heaven.
  • IMPF 2
  • Be a Piece of Sugar !
  • By the pen of Ismail A Mageed
  • Doing a favour is like a piece of sugar that gives out all its sweetness to people then disappears.
  • thus fades the bitterness of life by meeting the pieces of sugar in the river of life...
  • Our feet tumble in the road... but we live on hope...
  • who have never dreamt of owning a luxurious palace...?
  • Who have never dreamt of having servants and maids...?
  • Who have never dreamt of living in happiness...?
  • What is happiness ?
  • Is it a smile that hides piles of sorrows and a stubborn dream ?!
  • Happiness is inside us. We have to look for it.
  • happiness is your inside to match your outside.
  • It is to look to yourself and never turn your face away ...
  • It is to live by the innocence of infants...
  • to know that what is meant to be for you will reach you, because it's in the hands of the One who is more Merciful with you than you are to yourself... it's in the hands of Allah
  • IMPF 3
  • The beauty of life
  • By the pen of Ismail A Mageed
  • The beauty of life is a word of love
  • The beauty of life is a heart’s smile
  • The beauty of the life is God's wisdom
  • The beauty of the life is challenging the hardships
  • The beauty of life is a whisper praises the granter of blessings
  • The beauty of life is a mouthful that we eat and thank for the bless
  • The beauty of the world inside me you is a word ...
  • That brings back our human again to live in the city of wisdom
  • A city that is close to us its difficult roadmap eases in one word
  • Oh, my brain !
  • Did you know the word?
  • Oh, my heart !
  • Did you know the word?
  • Allah
  • IMPF 4
  • AZAN, The Heavens Call
  • By the pen of Ismail A Mageed
  • What would you feel when Heavens Call you?
  • When the divine network showers your senses with a stream of light
  • When the link between earth and heaven is built
  • When you feel with all your cells, senses and soul the presence of Allah the Exalted
  • When you prostrate to be connected to the divine network, to taste the true love of Allah
  • When you want to meet with a king, you need cronies to pass a long queue request to see him
  • The appointment could be too short and upon the discretion of that king to be terminated at any time
  • With the divine call, Allah the Exalted, The True King Of All Kings, always wants you at all times and spaces, and you are the only one to end the appointment!!
  • Allah never closes HIS doors at all times and spaces ... HE loves us more than a mother loves her suckling newborn baby…

2.3. Prompt Structure

It is a burning question that needs to be answered when communicating with AI models: what is the best choice of prompt structure [39-51]?
Deep generative models are advanced AI systems that can create high-quality digital content[39], like images or text, but they can be difficult for users to control effectively. A recent approach called prompting [39] allows users to give simple text instructions to the AI, enabling it to perform tasks without needing extensive training, known as zero-shot or few-shot learning. However[39], many users still struggle to write effective prompts, often relying on trial and error, which highlights the need for better user interfaces that can help guide this interaction in creative applications.
Reading through the folds of [39], we can identify the challenges of using prompts in AI systems, noting that designing effective prompts often relies on trial and error due to a lack of systematic research. The study points out that current user interfaces do not adequately support the creation and exploration of complex prompts, making it difficult for users to understand and remember how to use them effectively. Additionally [39], there are concerns about delays in processing, the generalizability of prompts across different AI models, and the potential for biases in the content generated by these systems.
In the graphical user interface (GUI) example (see Figure 12 (c.f., [39])), users can type in their requests using everyday language, and the system will automatically understand and break down their input. The system identifies important details [39], like what the user wants to do and any specific requirements, which can then be adjusted directly by the user. After this, the system generates a more polished text prompt that can be used by a generative model to produce the desired output.
In Figure 13 (c.f., [39]), users can create prompts by choosing from a set of predefined options or "building blocks" that are effective for common tasks. This approach makes it easier for users to generate prompts without having to write everything from the beginning. However, users can still enter their own custom text if they prefer, allowing for flexibility in how they create prompts.
Figure 14 (c.f., [39]) showcases how users are allowed to create prompts that guide a storytelling process, referred to as a "narrative tree." Each prompt generates different possible responses, and users can choose some of these responses to build on for their next prompts, effectively shaping the direction of the story. This interactive approach helps users explore various narrative possibilities and develop their creative writing.
As depicted in Figure 15 (c.f., [39]), users are permitted to work with AI in a way that doesn't require them to wait for the AI to respond. Users can add prompts or questions to their text document at any point, which sends requests to the large language model (LLM) system to generate text. While the AI processes these requests, the user can keep writing or editing other parts of the document, making the collaboration more efficient and fluid.
The Minstrel framework is designed to generate structured prompts using multiple agents that work together in three main groups: the analyze group, the design group, and the test group. The analyze group focuses on understanding user needs and feedback, the design group creates the prompts, and the test group evaluates their effectiveness. In the design group, activated modules are shown in blue, while those not needed for the current task are marked in green, indicating a flexible approach to prompt generation, as in Figure 16 (c.f., [40]).
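As a rough illustration of how such a three-group pipeline could be wired together, the sketch below runs analyze, design, and test stages in a feedback loop. It is a minimal sketch under stated assumptions: the function bodies are placeholders and do not reproduce Minstrel's actual agents or prompt templates.

```python
# Rough sketch of a three-group prompt-generation loop in the spirit of the
# Minstrel framework (analyze -> design -> test). Agent logic is placeholder only.
from typing import Optional, Tuple


def analyze(user_request: str, feedback: Optional[str]) -> dict:
    """Stand-in for the analyze group: capture the user's goal and any test feedback."""
    return {"goal": user_request, "constraints": [], "feedback": feedback}


def design(analysis: dict) -> str:
    """Stand-in for the design group: assemble a structured prompt."""
    parts = [f"Task: {analysis['goal']}"]
    if analysis["feedback"]:
        parts.append(f"Revise according to: {analysis['feedback']}")
    return "\n".join(parts)


def test(prompt: str) -> Tuple[bool, str]:
    """Stand-in for the test group: a trivial check instead of a real evaluation."""
    ok = "Task:" in prompt
    return ok, "" if ok else "prompt is missing a task statement"


def minstrel_style_loop(user_request: str, max_rounds: int = 3) -> str:
    feedback: Optional[str] = None
    prompt = ""
    for _ in range(max_rounds):
        prompt = design(analyze(user_request, feedback))
        ok, feedback = test(prompt)
        if ok:
            return prompt
    return prompt


print(minstrel_style_loop("Explain the water cycle to a 10-year-old"))
```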
Figure 17 (c.f., [40]) gives a visual example in which ChatGPT-3.5 responds to a user using three different prompts about a fictional place called Mingyuan University. The responses show how the AI generates flattering and positive comments about the university [40], even though it is not a real institution. This highlights how AI can produce varying outputs based on the prompts it receives, demonstrating its ability to adapt to different contexts and user inputs.
Prompt engineering is a crucial practice in natural language processing (NLP) that involves creating specific inputs[41], or prompts, to guide language models in generating the desired outputs. The effectiveness of these AI models heavily relies on the quality of the prompts provided[41], as well-crafted prompts can significantly improve the accuracy and relevance of responses, sometimes increasing accuracy rates from 85% to 98%. Techniques in prompt engineering include ensuring clarity[41], specificity, and context, while also addressing potential biases to enhance the overall interaction between users and AI systems.
The following prompt was used as the foundation for each image generation:
Prompt: "Create me a visual representation of this IMPF including every noun in this."

2.4. Image Generation Settings

The AI image-generation settings [52-58] were:
  • Model/Preset: Cinematic Kino
  • Contrast: Medium
  • Batch Testing: Each generation produced 4 images, and this process was repeated in 3 separate batches for a total of 12 images per poem (see the configuration sketch below).
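The following is a minimal configuration sketch of the settings and batch scheme listed above. The field names, the GenerationSettings dataclass, and the request structure are hypothetical placeholders for illustration; they are not the actual Leonardo AI API schema.

```python
# Hypothetical configuration sketch for the batch scheme described above
# (4 images per generation, repeated over 3 batches = 12 images per poem).
# Field names are illustrative placeholders, not Leonardo AI's real API schema.
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    preset: str = "Cinematic Kino"
    contrast: str = "medium"
    images_per_batch: int = 4
    batches: int = 3


def build_requests(prompt: str, settings: GenerationSettings) -> list:
    """Expand one poem prompt into the 3 x 4 batch of generation requests."""
    return [
        {
            "prompt": prompt,
            "preset": settings.preset,
            "contrast": settings.contrast,
            "num_images": settings.images_per_batch,
            "batch_index": b,
        }
        for b in range(settings.batches)
    ]


requests = build_requests(
    "Create me a visual representation of this IMPF including every noun in this.",
    GenerationSettings(),
)
assert sum(r["num_images"] for r in requests) == 12  # 12 images per poem
```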
In broad terms, new advancements in detecting synthetic images are essential for combating disinformation [52], especially as generative AI models create highly realistic images quickly and at large scale. The exposition in [52] identifies the challenge of training detection systems to recognize different types of synthetic images, such as distinguishing between human faces and animal images. The authors propose a method that improves detection by selecting high-quality synthetic images for training, which leads to better performance in identifying fake images across various categories.
Figure 18 (c.f., [52]) visually presents two ways to evaluate how well a synthetic image detection system works. In the "cross-architecture" setting, the system is trained on images created by one type of generative model and tested on images from a different model. In the "cross-concept" setting, the training and testing are done using images of different subjects (like animals and humans) but generated by the same model, allowing researchers to see how well the system can adapt to different types of images.
The proposed method in [52] has two main parts: first, it selects the highest quality synthetic images to create a training set, and second, it trains a convolutional neural network (CNN) to distinguish between real and fake images. The quality of the generated images is measured using a new metric called Quality Calculation (QC), which ranks images based on how closely they resemble real images. This approach helps the network learn important details that improve its ability to detect fake images across different categories.
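To make the two-part pipeline easier to follow, here is a minimal sketch under stated assumptions: a placeholder quality score stands in for the paper's QC metric (whose exact definition is not reproduced here), and a tiny convolutional network is trained on dummy tensors to separate real from synthetic samples. The architecture and the quality_score function are illustrative, not the authors' implementation.

```python
# Illustrative two-stage sketch: (1) keep only the highest-quality synthetic
# images, (2) train a small real-vs-fake CNN on the filtered set.
# quality_score is a placeholder for the paper's QC metric.
import torch
import torch.nn as nn


def quality_score(image: torch.Tensor) -> float:
    # Placeholder: the paper ranks images by closeness to real-image statistics.
    return float(-torch.var(image))  # dummy proxy, illustration only


def select_high_quality(images, keep_fraction: float = 0.5):
    ranked = sorted(images, key=quality_score, reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]


detector = nn.Sequential(  # tiny binary real/fake classifier
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
    nn.Flatten(), nn.Linear(32, 1),
)

# Dummy data stands in for real and curated synthetic images (3x64x64 tensors).
real = [torch.rand(3, 64, 64) for _ in range(8)]
fake = select_high_quality([torch.rand(3, 64, 64) for _ in range(16)])

x = torch.stack(real + fake)
y = torch.cat([torch.ones(len(real)), torch.zeros(len(fake))]).unsqueeze(1)

opt = torch.optim.Adam(detector.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()
for _ in range(5):  # brief illustrative training loop
    opt.zero_grad()
    loss = loss_fn(detector(x), y)
    loss.backward()
    opt.step()
```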
The researchers [52] found that the best-quality generated image has a spectrogram that closely resembles the spectrogram of a real image, indicating that it captures similar features. In contrast, the worst-quality generated image shows a noisier spectrogram, suggesting it lacks clarity and detail. This observation supports the idea that training models with higher-quality images helps them learn finer details and improve their ability to distinguish between real and synthetic images.
In the context of desktop films[58], the "primary image" refers to the main visual area that the viewer sees, which is often framed by elements of the computer's graphical user interface (GUI), like the menu bar or desktop background. The "secondary image" consists of smaller visual elements, such as software windows or applications, that appear within the primary image and can serve as the focus of the film. These secondary images can also contain their own images, leading to the idea of "tertiary" and "quaternary" images, which adds layers to the visual experience, as showcased by Figure 19 (c.f., [58]).
The term "kino-brush" is not suitable for describing desktop films because it doesn't capture their unique qualities[58], leading to the proposal of "kino-software" as a better metaphor. Desktop films exist in a space between traditional photography and digital media[58], focusing on online realities rather than conventional cinematic experiences. This shift highlights how the process of creating desktop films involves a more direct engagement with digital content, akin to a surgeon's precise work[58], which contrasts with the more detached approach of traditional filmmaking.
The phrase "From Kino-brush to Kino-software" suggests a shift from traditional filmmaking techniques, likened to painting (Kino-brush)[58], to modern digital methods (Kino-software). This change highlights that the way we create films has evolved significantly with digital technology, making the old metaphors less relevant. Scholars argue that since digital cinema emerged, the tools and processes used in filmmaking have transformed[58], leading to new ways of producing and understanding visual media.
Manovich's metaphor of the "cinematic image-as-painting" highlights how digital filmmaking techniques[58], like computer-generated imagery (CGI) and virtual cameras, have transformed the filmmaking process. Instead of just capturing a physical scene through the camera lens[58], filmmakers now treat the real world as one of many elements they can manipulate to create their final images. This shift allows for greater creativity and flexibility, like how artists use various materials in painting or animation.
A desktop film is an audiovisual work that showcases the computer interface[58], often created using screen recording software or digital composition. This includes a primary image (the main visual) and secondary images (elements like menus and windows) that interact within the frame[58], creating a layered visual experience. This form of media can encompass various types of content [58], such as video tutorials or live streams, and does not require the creator to have intended it as a film for it to be classified as one.

2.5. Evaluation Criteria

Creative natural language generation [59-84], like poetry writing, is interesting but hard to assess because there aren't clear standards for what makes a good poem. The authors focus on image-inspired poetry generation, where poems are created based on uploaded images, and they explore two main challenges: how to evaluate poems without a definitive answer and how to assess systems that produce different poems from the same image. They develop tools to rate poems and measure how unique they are compared to existing works, as well as strategies to evaluate the variety of poems generated from the same image input.
Evaluating creative AI [59], like poetry generation, is best done by humans, but it requires a well-designed tool and clear guidelines to ensure reliable ratings. Instead of showing an image and poem together each time, which can lead to inconsistent ratings, the authors [59] proposed a method where assessors compare multiple poems side by side after viewing an image. They [59] provided specific criteria for assessors to consider, such as the correctness of language and the poem's relevance to the image, allowing for a more structured and fair evaluation of the generated poetry.
The human assessment tool[59] mentioned in the text is created to evaluate different methods of generating poetry by comparing how they rank against each other and giving them specific scores. In the context of translating Chinese poetry into English[59], each comma or period in the translation marks the end of a line, reflecting the structure of the original Chinese verses. This approach helps maintain the integrity of the poetry's form while allowing for effective evaluation of its quality, as depicted in Figure 20 (c.f., [59]).
The evaluation results mentioned refer to a comparison of eight different methods used to generate content, likely poems (see Figure 21 (c.f., [59])), based on how humans rated them. The results focus on two key aspects: novelty, which measures how unique or original the generated content is, and diversity, which assesses the variety within the generated outputs. By analyzing these ratings [59], researchers can determine which methods are most effective at producing creative and varied results.
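As a concrete illustration of the two measures discussed above, the sketch below scores novelty as dissimilarity from an existing corpus and diversity as average pairwise dissimilarity among poems generated from the same image. The token-overlap similarity used here is a simple stand-in, not the metric of [59].

```python
# Sketch of novelty/diversity scoring for generated poems (illustrative metrics,
# not the exact measures used in [59]).
def jaccard(a: str, b: str) -> float:
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def novelty(poem: str, corpus) -> float:
    """1 minus the highest overlap with any existing poem: higher = more novel."""
    return 1.0 - max((jaccard(poem, c) for c in corpus), default=0.0)


def diversity(poems) -> float:
    """Average pairwise dissimilarity among poems generated from one image."""
    pairs = [(p, q) for i, p in enumerate(poems) for q in poems[i + 1:]]
    if not pairs:
        return 0.0
    return sum(1.0 - jaccard(p, q) for p, q in pairs) / len(pairs)


generated = [
    "light quenches the thirst of the heart",
    "a river where drops of water fall",
    "the darkness of the night is merely passing",
]
print(novelty(generated[0], generated[1:]), diversity(generated))
```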
The release of advanced natural language generation (NLG) algorithms [60], like GPT-2, has generated significant public interest because these algorithms can create text that resembles human writing. In their experiments [60], the researchers found that participants struggled to tell the difference between poems written by the algorithm and those written by humans when the best algorithm-generated poem was selected, but they could identify them when a random one was chosen. Additionally [60], people showed a slight preference against algorithm-generated poetry, regardless of whether they knew it was created by an algorithm or not.
Figure 22 (c.f., [60]) provides violin plots, a type of graph that shows the distribution of data, in this case the preferences of participants for human-written poetry compared to algorithm-generated poetry. The left side of the plot represents the "Transparency treatment," where participants knew the source of the poems [60], while the right side shows the "Opacity treatment," where they did not. The results indicate that participants preferred human-written poems in both situations [60], and the confidence intervals suggest that this preference was consistent regardless of whether they were aware of the poem's origin.
The study highlighted [60] the ethical implications of language-generation algorithms, like GPT-2, which can create text for various purposes beyond poetry, such as online reviews or news articles. While these algorithms show potential in mimicking human writing[60], they lack true creativity and emotional expression.
The authors [61] have explored how to automatically create poetry based on images. They have also addressed challenges like identifying poetic themes from images and ensuring that the generated poems are both relevant to the images and artistically expressive. They [61] developed a method that uses advanced training techniques and created two datasets to improve the quality of poetry generation, showing that their approach outperforms existing methods.
Figure 23 (c.f., [61]) portrays a human-written description of an image alongside a poem inspired by the same image. While the description focuses on straightforward facts [61], like what is happening in the image, the poem goes deeper by using symbols and emotions, such as relating a falcon to a knight and expressing ideas of hunting and waiting. This shows how poetry can convey more complex feelings and meanings than simple descriptions.
Figure 24 (c.f., [61]) visualizes the poetry-generation framework, which uses a deep coupled visual-poetic model that learns from pairs of images and human-written poems. It analyzes [61] images to extract features like objects and emotions using a Convolutional Neural Network (CNN), while the poems are processed to identify their structure and meaning through a skip-thought model. A Recurrent Neural Network (RNN) generates the poems [61], and two discriminators evaluate whether the generated poems match the images and maintain a poetic style, providing feedback to improve the poem-generation process.
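To make the components named above easier to follow, here is a highly simplified PyTorch skeleton of an image-conditioned poem generator with two critics (relevance and poetic style). All dimensions, and the module structure itself, are illustrative assumptions rather than the architecture of [61].

```python
# Simplified skeleton of an image-conditioned poem generator with two critics,
# loosely mirroring the components described in [61]. Dimensions are illustrative.
import torch
import torch.nn as nn


class ImageEncoder(nn.Module):  # stands in for the CNN feature extractor
    def __init__(self, feat_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, feat_dim),
        )

    def forward(self, img):
        return self.net(img)


class PoemGenerator(nn.Module):  # RNN decoder conditioned on image features
    def __init__(self, vocab=5000, feat_dim=256, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, hidden)
        self.rnn = nn.GRU(hidden + feat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, img_feat, tokens):
        emb = self.embed(tokens)                                  # (B, T, H)
        cond = img_feat.unsqueeze(1).expand(-1, emb.size(1), -1)  # (B, T, F)
        h, _ = self.rnn(torch.cat([emb, cond], dim=-1))
        return self.out(h)                                        # token logits


class RelevanceDiscriminator(nn.Module):  # does the poem match the image?
    def __init__(self, feat_dim=256, hidden=256):
        super().__init__()
        self.score = nn.Linear(feat_dim + hidden, 1)

    def forward(self, img_feat, poem_feat):
        return self.score(torch.cat([img_feat, poem_feat], dim=-1))


class StyleDiscriminator(nn.Module):  # does the text read as a poem?
    def __init__(self, hidden=256):
        super().__init__()
        self.score = nn.Linear(hidden, 1)

    def forward(self, poem_feat):
        return self.score(poem_feat)
```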
Figure 25 (c.f., [61]) refers to poems created using eight different methods based on images, showcasing how each method generates unique poetic expressions. The words highlighted in red indicate the level of "poeticness," [61] which suggests how artistic or expressive the generated poems are. This comparison helps evaluate the effectiveness of each method in producing meaningful and creative poetry from visual inputs.
The AI-generated images were evaluated based on:
(i) Artistic Quality [85-95]: Alignment with Abstract Expressionist aesthetics, including the use of abstraction and emotion.
Another study [85] has explored how digital artists in the entertainment industry view the rise of AI-generated art. Through interviews[86], artists expressed concerns about AI art being unethical, lacking human intention and expression, and not allowing for the creative process they value. While they acknowledged that AI could produce visually impressive images, they felt that these artworks often lack the emotional depth and originality that come from human creativity.
On another note, [86] discussed the negative effects of image generators on artists, including financial losses and damage to their reputations. It suggests regulations to protect artists, such as requiring consent before using their work to train these generators and emphasizes that art is a human activity that should not be replaced by automation. Most importantly, [86] highlighted Anna Ridler's project, where she created her own training data from tulip photos, as an example of how image generators can be used to enhance creativity rather than exploit artists' work.
Artists [87] increasingly feel that using generative AI systems to create images is like collaborating with another artist. The author [87] examined different types of collaboration from philosophical perspectives—like collective authorship and co-creatorship—to see if human-AI interactions fit these definitions. Ultimately[87], the author suggests that while AI can assist in creating art, it doesn't fully qualify as a collaborator in the traditional sense, and we should refer to these interactions as "AI-assisted production" instead.
Looking at the other end of the spectrum, [88] investigated how people view artwork created by artificial intelligence (AI) compared to that made by human artists. They [88] conducted a survey where participants evaluated six artworks, some made by AI and some by humans, and found that people generally did not see AI-created art as having the same artistic value as human-created art. Interestingly [88], the participants' beliefs about AI's ability to create art influenced their evaluations, suggesting that preconceived notions about AI affect how we judge its artistic contributions.
(ii) Emotional Resonance: The ability to evoke feelings associated with the poetic prompt [95-126].
AI models like ChatGPT-3 can effectively imitate the writing styles and vocabulary of different professions, raising the question of whether machines can truly create art. Thomas Carlyle's observation [96] highlighted a concern that our understanding of life has become overly analytical, stripping away the sense of wonder. To better grasp what it means to be human and creative[96], we should focus on the emotions and fears that drive us to express our experiences, as these elements are central to our understanding of creativity.
Another study [97] has explored how postgraduate English Literature students at the Lebanese University respond to Shakespeare's "Sonnet 18" and a sonnet created by ChatGPT, both focusing on the theme of timeless beauty. By using quantitative methods[97], the study measures students' appreciation, emotional depth, and language complexity of the two poems. The results showed that students preferred Shakespeare's sonnet because of its richer language and deeper emotional impact, while also discussing the implications of AI in creative writing and areas for improvement in AI-generated poetry.
Meanwhile, [98] showed how artificial intelligence (AI), particularly ChatGPT, is changing the way poetry is created and understood. It [98] has also highlighted that while AI can generate poems that mimic human expression and use literary devices, its creativity is limited by the data it has been trained on and the prompts given by users. Ultimately, the research suggests that AI can be a useful tool in poetry, but it lacks the deep emotional insight and originality that human poets bring to their work.
Figure 26 (c.f., [99]) illustrates how emotions can be represented and processed in generative AI models. The authors [99] introduced two tools: EmotionPrompt, which enhances the AI's ability to recognize and generate emotional content, and EmotionAttack, which aims to disrupt or weaken this emotional performance. Additionally[99], EmotionDecode provides insights into how emotional triggers influence the behavior of these AI models.
Figure 27 visualizes the key results of a study called EmotionDecode[99], which evaluates how well different layers of the Llama2-13b model perform on various tasks when using specific emotional prompts. Each cell in the results table shows the performance level[99], with red indicating better performance and blue indicating weaker performance. Additionally[99], the results for GPT-4 are derived by applying the same prompts from Llama2 to see how it compares in terms of effectiveness.
Figure 28 (c.f., [99]) showcases two methods, EmotionPrompt and EmotionAttack, used to evoke emotions in generative AI. In the first method (a and c), emotional cues are added directly to the text prompts to influence the AI's response. In the second method (b and d), images that convey similar emotional meanings are created and used as prompts for multi-modal models, which can process both text and visual information to generate responses.
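To illustrate the basic text-based mechanism described above in a much-simplified form, a prompt can be augmented with an emotional cue as shown below; the cue text and helper function are illustrative and do not reproduce the actual stimuli or models of [99].

```python
# Minimal illustration of an EmotionPrompt-style augmentation: an emotional cue
# is appended to the task prompt (cue text is illustrative, not taken from [99]).
def add_emotional_cue(task_prompt: str, cue: str) -> str:
    return f"{task_prompt} {cue}"


base = "Classify the sentiment of this review: 'The film was a quiet marvel.'"
augmented = add_emotional_cue(base, "This is very important to my career.")
print(augmented)
```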
Generative AI tools [107] are changing how we think about and create poetry and art by introducing new methods and materials that blend technology with creativity. As traditional boundaries between different art forms fade[106], artists are increasingly collaborating with AI, robots, and virtual beings to produce innovative works. While AI can generate poetry from images and text[106], it still struggles to fully replicate the depth of human creativity, raising questions about how these technologies might serve as a measure of human artistic expression in an AI-driven world.
Figure 29 (c.f., [107]) refers to a specific artwork from "Poèmes et Lithographies," a collection created by Pablo Picasso in 1949 and published in 1954. This collection combines Picasso's visual art with poetry, showcasing his innovative approach to integrating different forms of artistic expression. The mention of the Museum of Modern Art (MoMA) indicates that this work is part of a significant collection that highlights Picasso's contributions to both art and literature.
Figure 30 (c.f., [107]) shows the justification leaf for copy 46 of Picasso's "Poèmes et Lithographies," a page that explains the reasons for the book's creation and its significance and is personally signed by Picasso himself. This document adds value to the artwork by providing context about the collaboration between visual art and poetry, showcasing Picasso's involvement in the artistic community of his time. Such signed items are often sought after by collectors because they connect directly to the artist and their work, enhancing the historical and artistic importance of the piece.
Figure 31 refers to an image created for the poem "The Digital Abyss" by Tula Giannini using a generative art tool called the Pixray app. This image was generated based on a specific text prompt taken from the poem, which reflects the themes and ideas expressed in the poem itself. The use of generative art in this context highlights how technology can enhance and visualize literary works, creating a unique blend of poetry and visual art.
Here is the poem:
[Image: Preprints 145172 i002: the poem "The Digital Abyss"]
(iii) Interpretative Creativity: The originality and depth of the AI’s visual interpretation [127-146].
This study [127] compared the creative abilities of artificial intelligence (AI), specifically ChatGPT-4, with human creativity using a tool called the Figural Interpretation Quest (FIQ). Participants[127], including both AI and humans, were asked to interpret abstract figures creatively, and the results showed that while AI was more flexible in generating ideas, humans were rated as more creatively impressive. Overall[127], the findings suggest that AI can produce diverse interpretations, but it still falls short of matching the depth and complexity of human creativity.
Figure 32 (c.f., [127]) illustrates the comparison of creativity and flexibility scores between interpretations generated by GPT-4, an AI model, and those created by human participants across various trials. Creativity scores measure how original and imaginative the interpretations are[127], while flexibility scores assess the diversity of ideas presented. By analyzing these scores, researchers can evaluate how well GPT-4 performs in generating creative responses compared to humans.
Combinational creativity is a type of creativity that involves mixing familiar ideas to create something new[128], which is important for innovation in design. This study [128] explored how to use artificial intelligence to analyze and understand these creative designs by identifying their basic and additional components. The authors [128] developed a special algorithm that combines computer vision and natural language processing, achieving high accuracy in interpreting these components, while also examining the strengths and weaknesses of their approach.
Figure 33 (c.f., [128]) visualizes "three combinational creativity driven approaches," i.e., methods for creatively combining different design elements, specifically 'base' and 'additive' components, to solve complex design problems. These approaches include the problem-driven approach [128], which focuses on addressing specific challenges; the similarity-driven approach, which looks for connections between similar concepts; and the inspiration-driven approach, which draws from broader ideas or inspirations. The study [128] built on previous research to explore how these approaches can be applied in practical design tasks, enhancing the understanding of how creativity can be systematically harnessed in design processes.
The architecture of the context-aware RE (Relation Extraction) model is designed to analyze and connect text and images by transforming them into a shared format that allows for comparison, as depicted in Figure 34 (c.f., [128]). In this model [128], both the text (like noun entities) and images are converted into high-dimensional vectors using a neural network called CLIP, which helps determine how similar they are to each other. This approach enables the model to effectively identify relationships between the text and images based on their compatibility scores.
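For readers who want to see what such a shared embedding looks like in practice, the following sketch uses the public CLIP model from Hugging Face Transformers to score how well candidate noun entities match an image. It is a generic illustration of CLIP-based matching, not the context-aware RE model of [128]; the image path "design_example.jpg" is a hypothetical stand-in.

```python
# Generic CLIP matching sketch: embed noun entities and an image in the same
# space and rank them by compatibility (an illustration of the idea only).
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("design_example.jpg")  # hypothetical input image
entities = ["bionic arm", "wooden stool", "future city", "tulip"]

inputs = processor(text=entities, images=image, return_tensors="pt", padding=True)
outputs = model(**inputs)
scores = outputs.logits_per_image.softmax(dim=-1)[0]  # compatibility per entity

for entity, score in sorted(zip(entities, scores.tolist()), key=lambda x: -x[1]):
    print(f"{entity}: {score:.3f}")
```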
Figure 35 (c.f., [128]) offers a visual representation of the architecture of the MEGA model, i.e., the structure and components it uses to process data, particularly in tasks that involve both images and text. It includes learnable features like weights and biases that help the model improve its performance over time. The model combines visual information extracted from images using a pre-trained Faster R-CNN with text features from a pre-trained BERT, allowing it to analyze and generate outputs based on the relationships between the visual and textual data.
On another note, Figure 36 (c.f., [128]) visualizes relation extraction, a process used in natural language processing and computer vision to identify and understand the relationships between different entities in images or text. In the context provided [128], part (a) shows an image related to the term "Bionic," while part (b) visualizes the results of the relation extraction, illustrating how the system interprets and connects different elements from the image. This helps improve the performance of models by allowing them to better understand the context and relationships present in visual data.
Ambiguity in language is a significant challenge for natural language processing (NLP) [145], the field focused on how computers understand human language. Despite advancements in AI tools like ChatGPT [145], researchers find that it struggles with complex language situations, such as new words, mixed languages, and sentences that have multiple meanings. The authors [145] examined these limitations and suggest that improving ChatGPT's ability to handle such linguistic challenges is essential for enhancing its overall performance.
ChatGPT's response to wordplay refers to how the AI interprets and generates language that plays with words[145], such as puns or jokes that rely on double meanings. This ability is important because it shows whether the AI can understand and engage with creative uses of language[145], which often involve ambiguity and cleverness. The examination [145] of its responses helps researchers assess how well the AI grasps linguistic creativity and can handle complex language situations, as depicted in Figure 37 (c.f., [145]).
The interplay between poetics and AI-generated art [147-160] is intriguing and mind-blowing, forming a unified linkage between the triad: poetics, AI, and the arts.
In the last five years[147], the use of artificial intelligence (AI) in art has grown significantly, with artists exploring both the technology itself and its social implications. This research [147] focused on how artists can use new AI tools and techniques, even if they lack technical backgrounds, to create innovative works that blend body movements, text, and AI-generated language. By applying a bricolage methodology[147], which emphasizes creativity through chance and context, the study reveals how AI can serve as a powerful tool for artistic expression and remixing ideas.
The term "poetics" [147] refers to a complex set of ideas related to words like "poetic" and "poiesis," which means "to make" or "to produce" in Greek. While many definitions of "poetic" focus on poetry and poets, one broader definition highlights its connection to artistic creation and composition. This understanding emphasizes the role of imagination and creativity in the process of making art, showing how poetics encompasses more than just traditional poetry.
An alternative way to understand artificial intelligence (AI) focuses on the different techniques and methods used in the field[147], which can be particularly useful for artists. Francesco Corea created a draft called the AI Knowledge Map (AKIM) to organize and visualize the various approaches to AI developed over the past 60 years, addressing the issue that many existing categorizations are incomplete or fail to show important connections within this evolving area. This map aims to provide a clearer overview of the diverse problems AI researchers are tackling, as in Figure 38 (c.f., [147]).
Figure 39(c.f., [147]) illustrates how computer vision, artificial intelligence (AI), and machine learning (ML) are interconnected, forming a link between AI/ML-enabled art and interactivity. In this context[147], computer vision acts as a crucial part of the interface that allows for interactive experiences in art created with AI and ML technologies, describing computer vision as a new way of social interaction that can create uncertainty about what we can see and what remains hidden from our perception.
From a different viewpoint, the overview of the Design Thinking methods employed [148] highlights how these methods are interconnected and work together throughout the project. Design Thinking is a creative approach to problem-solving that involves stages like discovery, interpretation, and prototyping, allowing researchers to iteratively develop and refine their ideas. In [148], these methods help guide the exploration of poetry generation and ensure that the process remains user-focused and adaptable to feedback, as in Figure 40 (c.f., [148]).
From a mathematical perspective, another study explores how teachers[149], both in training and already working, use artificial intelligence (AI) to create mathematical poetry and songs, which helps them experience math in a more artistic way. Conducted in Brazil[149], the research involved seven participants in an online course where they collaborated to produce these creative works. The findings showed that AI tools like ChatGPT were important in sparking creativity, but the teachers' expertise was crucial in shaping the AI-generated ideas into effective teaching materials.
The data collected in the study [149], including the poems created by ChatGPT, were initially written in Portuguese. To analyze and present the findings in the paper, the authors translated this content into English, but they noted that important features like rhythm and rhyme were changed during the translation. For instance, a stanza about the golden rectangle was translated, but the poetic structure may not have been preserved in the English version, as shown below:
[Image: Preprints 145172 i003: the translated stanza about the golden rectangle]
In the initial interactions [149] between PST1 and ChatGPT, PST1 asks for surprising mathematical topics, prompting ChatGPT to list several intriguing concepts. These include Gödel's Theorem, which deals with the limits of provability in mathematics; Prime Numbers, which are fundamental in number theory; and the Fibonacci Sequence, known for its appearance in nature and art. This exchange highlights the diverse and fascinating areas within mathematics that can spark curiosity and further exploration.
The Fibonacci Sequence [149] is a well-known mathematical series that begins with the numbers 1 and 1, where each following number is the sum of the two preceding ones, resulting in a pattern like 1, 1, 2, 3, 5, 8, and so on. This sequence was popularized in the West by the Italian mathematician Leonardo Fibonacci in the 13th century, although it had been recognized in India long before that. The Fibonacci Sequence has fascinating properties and appears in various fields, including nature, art, and science, showcasing its broad relevance and significance.
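For completeness, the sequence described above can be generated in a few lines; this small snippet is only a worked illustration of the definition.

```python
# The Fibonacci sequence: start with 1, 1; each term is the sum of the previous two.
def fibonacci(n: int) -> list:
    seq = [1, 1]
    while len(seq) < n:
        seq.append(seq[-1] + seq[-2])
    return seq[:n]


print(fibonacci(8))  # [1, 1, 2, 3, 5, 8, 13, 21]
```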
The request is to create two stanzas of poetry: the first should highlight the intriguing and surprising aspects of mathematics, emphasizing its vastness and the secrets it holds for exploration. The second stanza should specifically focus on the Fibonacci sequence, illustrating its unique charm and its connections to nature and art, showcasing how this mathematical concept manifests in various forms around us. This exercise combines creativity with mathematical concepts, demonstrating how poetry can express complex ideas in an engaging way, as shown below:
[Image: Preprints 145172 i004: the two requested stanzas]
In Group 2's musical composition activities[149], the focus was on the vocal recording process, where two participants, PS5 and T2, recorded their parts for a song under the guidance of a lead researcher who also acted as their teacher. They used Logic Pro X, an audio production software, to create their vocal tracks. However, during the review, it became clear that some of the pitches did not match the intended musical tone, which highlighted the need for further editing to correct these discrepancies, as shown by Figure 41 (c.f., [149]).
To fix the issues with the vocal recordings, the lead researcher used the audio editing software Logic Pro X, which has advanced tools for manipulating sound. This careful process involved adjusting the pitches that were off-key to ensure they matched the intended harmony of the song. The adjustments made were visually represented in Figure 42 (c.f., [149]), showing how each incorrect pitch was corrected to create a more cohesive musical piece.
The study [149] highlighted how integrating AI, like ChatGPT, in education can transform the way mathematics is taught, especially through creative forms like poetry and song. In Group 1's exploration of the Fibonacci sequence, teachers worked alongside AI to create a richer learning experience that reveals the beauty of mathematical concepts. This approach helps students see mathematics as more than just numbers; it becomes a creative and engaging subject that connects with their experiences.
Group 2's exploration of the golden ratio through poetry shows how different forms of media can enhance educational content[149], making complex mathematical ideas easier for students to understand and enjoy. This approach emphasizes the importance of media as a creative partner in teaching, helping to create lessons that are not only informative but also memorable and engaging. The study suggests that combining technology, art, and education can lead to more effective learning experiences in mathematics.
Indeed, mathematical elements [149] in poems can surprise and engage readers by influencing their understanding and emotions in unexpected ways. The authors emphasize that these mathematical concepts are dynamic and can lead to new insights when presented creatively, rather than being static information found in textbooks. Additionally, the study[149] highlighted the collaboration between teachers and AI tools like ChatGPT, where teachers refine AI-generated content to create engaging and meaningful educational materials for teaching mathematics in K-12 classrooms.
Most importantly, the study [149] highlighted how pre-service teachers and AI tools like ChatGPT can work together to create engaging educational content for teaching mathematics. While ChatGPT helps generate creative ideas, the teachers play a crucial role in refining and enhancing this content to ensure it is both mathematically sound and pedagogically effective. This collaboration aims to make learning math a more enjoyable and aesthetically rich experience for students in K-12 classrooms.
  (iv). Fidelity to Prompt: Faithfulness to the themes and imagery described in the poetry[161,162,163,164,165,166,167,168,169,170,171,172,173,174,175] (a brief scoring sketch is given below).
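One hedged way to operationalize this fidelity criterion, not used in the cited works but common in text-to-image evaluation, is a CLIP-based similarity score between the poetic prompt and the generated image. The model checkpoint and image path below are illustrative assumptions:

```python
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

# Publicly available CLIP checkpoint used only as an example scorer.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def fidelity_score(prompt: str, image_path: str) -> float:
    """Cosine similarity between the prompt text and the image in CLIP space."""
    image = Image.open(image_path)
    inputs = processor(text=[prompt], images=image, return_tensors="pt", padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    text_emb = outputs.text_embeds / outputs.text_embeds.norm(dim=-1, keepdim=True)
    image_emb = outputs.image_embeds / outputs.image_embeds.norm(dim=-1, keepdim=True)
    return float((text_emb @ image_emb.T).item())

# Hypothetical file name for one generated batch image:
# print(fidelity_score("a glass heart suspended in water, abstract expressionism", "batch1_img1.png"))
```

Higher scores would indicate closer alignment between the poem's imagery and the generated visual, though such a score captures only surface-level agreement, not symbolic depth.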
The study [161] explores how Generative AI (GenAI) is becoming increasingly important in industries such as finance and healthcare, where accuracy is crucial. It [161] also highlights the limitations of traditional methods that use fixed prompts, which can lead to incorrect or irrelevant information. To improve GenAI's performance, the authors [161] proposed a new method called adaptive prompt reinforcement learning, which uses human feedback to continuously refine the prompts, making the AI more accurate and reliable in complex situations.
Static prompts [161] in Generative AI (GenAI) can lead to several challenges that affect the quality and reliability of its outputs. One major issue is "hallucinations," where the AI generates plausible but incorrect information, which can be particularly harmful in fields like healthcare and law. Additionally, static prompts often struggle with unique situations (edge cases) and can produce repetitive or irrelevant responses, making it essential to develop adaptive prompts that improve accuracy and user engagement.
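The adaptive-prompt idea in [161] can be pictured, in a highly simplified form, as a feedback loop that re-weights candidate prompts according to human ratings. The toy sketch below only illustrates such a loop (an epsilon-greedy bandit over prompt variants); it is not the authors' reinforcement-learning method, and the prompt texts and the 0-1 feedback scale are assumptions:

```python
import random

class AdaptivePromptSelector:
    """Toy epsilon-greedy selector: keep the best-rated prompt variant, explore occasionally."""

    def __init__(self, variants, epsilon=0.2):
        self.scores = {v: 0.0 for v in variants}   # running mean human feedback per variant
        self.counts = {v: 0 for v in variants}
        self.epsilon = epsilon                      # exploration rate

    def select(self) -> str:
        if random.random() < self.epsilon:
            return random.choice(list(self.scores))        # explore
        return max(self.scores, key=self.scores.get)       # exploit best-rated prompt

    def update(self, variant: str, feedback: float) -> None:
        """feedback: e.g. a 0-1 rating supplied by a human reviewer of the output."""
        self.counts[variant] += 1
        n = self.counts[variant]
        self.scores[variant] += (feedback - self.scores[variant]) / n  # incremental mean

selector = AdaptivePromptSelector([
    "Summarise the patient record.",
    "Summarise the patient record; cite only stated facts, no speculation.",
])
prompt = selector.select()
selector.update(prompt, feedback=0.8)  # hypothetical human rating
```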
On the other hand, DreamBooth[164] is an AI tool that can create many different images of a person or object using just a few reference images, usually three to five. By providing a text prompt that describes the desired context or scenario, DreamBooth generates images that show the subject interacting naturally with various environments and under different lighting conditions. This process [164] ensures that the essential visual features of the subject remain consistent, resulting in high-quality and realistic images, as in Figure 43 (c.f., [164]).
Recontextualization[164] refers to the process of generating images of subjects in various environments while maintaining important details about the subjects and ensuring realistic interactions between the subjects and their surroundings. The results show that using a method called Prior Preservation Loss (PPL) allows for greater diversity in the generated images, even if it slightly reduces how accurately the subjects are represented. Additionally, when the model is trained with the correct class name for the subject, it can produce better results, but using incorrect or no class names can lead to errors in the generated images (see Figure 44, c.f. [164]).
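The prior-preservation idea in [164] amounts to a weighted sum of two denoising losses: one computed on the few subject images and one on images generated from the model's own class prior. The sketch below illustrates that combination in PyTorch with random placeholder tensors; it is a schematic of the loss term only, not the official DreamBooth training code:

```python
import torch

def prior_preservation_loss(noise_pred_subject, noise_subject,
                            noise_pred_prior, noise_prior,
                            prior_weight: float = 1.0) -> torch.Tensor:
    """Subject reconstruction term plus a weighted class-prior term."""
    subject_term = torch.nn.functional.mse_loss(noise_pred_subject, noise_subject)
    prior_term = torch.nn.functional.mse_loss(noise_pred_prior, noise_prior)
    return subject_term + prior_weight * prior_term

# Shapes are illustrative stand-ins for a diffusion model's noise predictions
# on subject images and on class-prior images.
pred_s, target_s = torch.randn(2, 4, 64, 64), torch.randn(2, 4, 64, 64)
pred_p, target_p = torch.randn(2, 4, 64, 64), torch.randn(2, 4, 64, 64)
print(prior_preservation_loss(pred_s, target_s, pred_p, target_p))
```

Increasing the prior weight pushes outputs toward the class prior (more diversity, slightly weaker subject fidelity), which matches the trade-off reported above.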
Figure 45 (c.f., [164]) shows some challenges that can occur when using generative AI models to create images based on prompts. For instance, the model might struggle to accurately generate the desired context, leading to incorrect images, or it might mix up the appearance of the subject with the context, resulting in unexpected changes. Additionally, if the prompts are too similar to the original images the model was trained on, it may produce images that look too much like those training examples, rather than creating something new.

3. Results and Analysis

3.1. Visual Interpretation of Poetic Prompts

Leonardo AI demonstrated an impressive ability to capture the essence of poetic prompts through Abstract Expressionism. For "Between Life and Non-Life," the AI-generated images revealed a spectrum of interpretations:
  • Symbolism and Metaphor: The generated visuals showcased significant use of symbolism, with recurring motifs such as hearts, water, and light representing existential themes of love, life, and transition. For example, the heart suspended in water symbolized fragility and resilience, encapsulating the poem's dualities.
  • Depth of Meaning: The images went beyond literal interpretations of the nouns, embodying deeper metaphysical concepts. Waterfalls represented the continuous flow of life, while the glass heart reflected transparency and vulnerability, aligning closely with the poem’s contemplative tone.
  • Batch 1:
Preprints 145172 i005Preprints 145172 i006Preprints 145172 i007
  • Batch 2:
Preprints 145172 i008Preprints 145172 i009Preprints 145172 i010
  • Batch 3:
Preprints 145172 i011Preprints 145172 i012

3.2. Emotional Resonance

The images successfully conveyed the emotional depth of the poem, with the use of colour, form, and abstraction creating a dynamic interplay of existential themes. Medium contrast settings enabled balanced expressions of vibrancy and restraint, reflecting the duality inherent in the poem’s themes.
The cinematic kino preset provided a medium contrast that enhanced the interplay of light and shadow, a critical element in Abstract Expressionism. By using advanced AI rendering techniques, Leonardo AI infused poetic metaphors into the art, leveraging textures, layers, and focal points to emphasize symbolic elements.
For instance, the interaction between light reflections and water surfaces underscored themes of impermanence and duality, while the juxtaposition of natural and artificial imagery mirrored existential contrasts in the poem. These techniques allowed the AI to create nuanced interpretations, illustrating how symbolic abstraction can extend beyond textual constraints.
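Leonardo AI's kino preset and contrast controls are configured through its own interface, so the sketch below uses an open-source diffusion pipeline purely as a stand-in to show how a poetic line, Abstract Expressionist style modifiers, and generation settings combine into a single prompt. The model identifier, modifier wording, and parameter values are illustrative assumptions, not the settings used in this study:

```python
import torch
from diffusers import StableDiffusionPipeline

# Open-source stand-in pipeline (not Leonardo AI); requires a CUDA GPU as written.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

poetic_line = "a glass heart suspended in falling water, between life and non-life"
style = "abstract expressionism, cinematic lighting, medium contrast, layered textures"

image = pipe(
    prompt=f"{poetic_line}, {style}",
    negative_prompt="text, watermark, photorealistic portrait",
    guidance_scale=7.0,        # moderate prompt adherence
    num_inference_steps=30,    # illustrative sampling budget
).images[0]
image.save("between_life_and_non_life_batch1.png")
```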

3.3. Visual Interpretation of Poetic Prompts

The collection of AI-generated images successfully captured the abstract and emotive essence of the poem "Be a Piece of Sugar". The visuals utilized recurring motifs such as water, hearts, light, and abstract textures to explore themes of fragility, duality, and existential uncertainty.
  • Symbolism and Metaphor:
    • Water: A dominant element in the collection, water was depicted in various forms, such as rippling streams, reflective surfaces, and cascading flows. It symbolized life's continuous movement and the transient nature of existence, aligning closely with the poem’s theme of impermanence.
    • Hearts: The heart motif, often placed within water or suspended in abstract forms, represented emotional vulnerability and resilience. Some visuals used fractured or transparent hearts to highlight the fragility of life and love.
    • Light and Shadow: Contrasts of light and shadow played a critical role in conveying existential dualities. Soft glows and sharp beams suggested moments of hope and introspection, while darker regions hinted at themes of uncertainty and non-life.
    • Abstract Expressionist Principles: The collection embraced abstraction through layered textures, distorted shapes, and unconventional color palettes. The lack of rigid structure allowed viewers to engage with the visuals on a deeply emotional and interpretive level, reflecting the open-ended nature of the poem's metaphors.
    • Emotional Depth: The interplay of soft hues with sharp contrasts created a visual narrative that mirrored the poem’s contemplative tone. Viewers were drawn into a reflective state, experiencing the tension between fragility and strength, life and non-life, love and loss.
  • Batch 1:
Preprints 145172 i013Preprints 145172 i014Preprints 145172 i015
  • Batch 2:
Preprints 145172 i016Preprints 145172 i017
  • Batch 3:
Preprints 145172 i018Preprints 145172 i019

3.4. Emotional Resonance

The images elicited strong emotional responses through their use of colour, abstraction, and symbolism:
  • Colour and Tone: The collection featured a blend of warm, muted tones alongside cooler, sombre shades. This balance evoked feelings of both hope and melancholy, mirroring the emotional duality in the poem.
  • Light Effects: The depiction of light as a recurring motif symbolized clarity, hope, and divine intervention, resonating with the poem’s metaphysical themes.
  • Abstract Forms: The abstract and fragmented visuals allowed viewers to project their own emotions onto the imagery, creating a personalized and introspective experience.
The collection succeeded in bridging the gap between textual and visual mediums, enabling viewers to connect with the poem’s existential themes on an intuitive level.

3.5. Visual Interpretation of Poetic Prompts

The AI-generated images for the poem “The Beauty of Life” collectively captured its uplifting themes of joy, love, inner happiness, and resilience. The visuals reflected Abstract Expressionist aesthetics, blending symbolic and abstract representations to express the poem's deep philosophical ideas.
  • Symbolism and Metaphor:
    • Light and Radiance: Light was the most prominent motif in the images, appearing as radiant beams, glowing orbs, or soft illumination. This symbolized divine wisdom, blessings, and the internal search for happiness. In many visuals, the light seemed to emerge from darkness, reflecting the poem’s emphasis on overcoming challenges and finding inner peace.
    • Hearts: Representing love and emotional resilience, hearts were frequently depicted as glowing or transparent, signifying purity, vulnerability, and connection. In some instances, fragmented or layered hearts symbolized the complexity of emotional experiences.
    • Smiles and Joy: While not depicted literally, abstract curves and warm colour gradients suggested happiness, aligning with the poem’s focus on emotional and spiritual well-being.
    • Overcoming Hardships: Darker elements, such as fragmented textures and shadowy areas, represented the challenges and adversities referenced in the poem. These were often juxtaposed with vibrant hues and soft light, illustrating the triumph of hope and resilience.
Abstract Expressionist Principles: The images adhered to Abstract Expressionist techniques, favouring non-representational forms, fluid shapes, and layered compositions. The absence of rigid structures allowed for open-ended interpretations, inviting viewers to connect emotionally with the visuals.
Emotional Depth and Narrative: The collection created a visual journey, transitioning from darker, more somber tones to brighter, more vibrant imagery. This progression mirrored the poem’s narrative arc, which moves from the acknowledgment of life’s hardships to the realization of happiness and divine blessings. Each image, through its use of abstraction, conveyed themes of love, wisdom, and the beauty of existence, inspiring reflection and introspection.
  • Batch 1:
Preprints 145172 i020Preprints 145172 i021
  • Batch 2:
Preprints 145172 i022Preprints 145172 i023
  • Batch 3:
Preprints 145172 i024Preprints 145172 i025

3.6. Emotional Resonance

The emotional impact of the images was profound, with the interplay of colours, textures, and abstract forms evoking a sense of hope, joy, and philosophical contemplation:
  • Colour Palette: The collection predominantly used warm golds, soft blues, and gentle whites to evoke feelings of peace and harmony. Darker tones, such as grays and blacks, represented struggles and adversities but were often balanced by brighter elements, symbolizing hope (one way to quantify such palette observations is sketched after this list).
  • Light as a Guiding Force: The radiant depictions of light were emotionally compelling, symbolizing divine presence and the pursuit of happiness, consistent with the poem’s themes.
  • Abstract Curves and Shapes: By avoiding literal depictions of smiles or emotions, the images encouraged viewers to engage personally with the visuals, creating an emotional and introspective experience that mirrored the poem’s focus on inner happiness.
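Palette observations such as "warm golds" and "soft blues" could, in principle, be checked quantitatively by clustering an image's pixels and reporting its dominant colours. The sketch below is one illustrative approach (k-means over RGB pixels); the file name is a hypothetical placeholder for a generated batch image:

```python
import numpy as np
from PIL import Image
from sklearn.cluster import KMeans

def dominant_colours(image_path: str, k: int = 5):
    """Cluster an image's pixels and return (RGB centre, pixel share) pairs, most dominant first."""
    pixels = np.asarray(Image.open(image_path).convert("RGB")).reshape(-1, 3)
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(pixels)
    counts = np.bincount(kmeans.labels_, minlength=k)
    order = np.argsort(counts)[::-1]
    return [(kmeans.cluster_centers_[i].astype(int), counts[i] / len(pixels)) for i in order]

# Hypothetical file name for one of the generated images:
for colour, share in dominant_colours("beauty_of_life_batch1.png"):
    print(colour, f"{share:.1%}")
```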

3.7. Visual Interpretation of Poetic Prompts

The collection of AI-generated images inspired by “Azan, The Heavens Call” effectively translated the poem’s spiritual, emotional, and contemplative essence into visual form. The visuals were rooted in Abstract Expressionist principles, employing light, abstract shapes, and layered compositions to depict the themes of divine connection, spiritual awakening, and the eternal bond between humanity and the divine.
  • Symbolism and Metaphor:
    • Light as Divine Connection: Light was the most dominant visual element, often depicted as radiant beams, glowing auras, or shimmering reflections. This represented the divine network described in the poem, symbolizing Allah’s guidance, mercy, and ever-present love. In several images, light appeared to emanate from a central point, reflecting the spiritual call to prayer and the connection between the heavens and the earth.
    • Abstract Forms of Prostration and Connection: Curved shapes and layered textures suggested prostration and submission, emphasizing the humble act of connecting with Allah. These forms subtly mirrored human gestures of devotion, such as bowing or reaching upward, creating a deeply spiritual narrative.
    • Interplay of Earth and Heaven: The visuals often juxtaposed elements of light and shadow, as well as fluid, organic shapes with angular, structured forms. This contrast reflected the poem’s themes of bridging the gap between the material world and the divine realm.
Abstract Expressionist Principles: The images adhered to the Abstract Expressionist style through their use of fragmented shapes, flowing textures, and evocative colour palettes. The absence of rigid structure allowed for open-ended interpretations, inviting viewers to experience the visuals on a deeply emotional and spiritual level. The abstraction effectively conveyed the metaphysical themes of the poem, focusing on the universal and intangible nature of divine connection.
Emotional and Spiritual Depth: The collection captured the transformative emotional experience described in the poem. Through the interplay of radiant light, ethereal shapes, and contrasting tones, the images conveyed feelings of awe, devotion, and inner peace. The visuals served as a reminder of the infinite mercy and love of Allah, resonating deeply with the poem’s themes.
  • Batch 1:
Preprints 145172 i026Preprints 145172 i027
  • Batch 2:
Preprints 145172 i028Preprints 145172 i029
  • Batch 3:
Preprints 145172 i030Preprints 145172 i031

3.8. Emotional Resonance

The AI-generated images successfully conveyed the spiritual and emotional intensity of “Azan, The Heavens Call”:
  • Light as a Symbol of Mercy and Guidance: The recurring depiction of light radiating across the visuals evoked a sense of warmth, hope, and divine presence. This symbol resonated strongly with the poem’s focus on Allah’s constant love and accessibility.
  • Abstract Forms Evoking Prostration: Subtle, organic shapes resembling human gestures of worship brought an emotional and relatable aspect to the visuals. These forms encouraged viewers to reflect on their own spiritual connection to Allah.
  • Dynamic Colour Palette: The collection balanced soft, heavenly whites and golds with deeper blues and blacks, representing the duality of human struggles and divine mercy. This interplay of colours created a calming yet profound visual narrative, drawing viewers into contemplation.
  • The overall collection succeeded in amplifying the poem’s emotional and spiritual tone, inspiring viewers to engage with the concepts of devotion, divine love, and the eternal nature of Allah’s mercy.

4. Open Problems

Several emerging open problems were identified:
  • Repetitive Visual Patterns: Many images featured similar compositions, such as hearts and water, leading to a lack of diversity across batches.
  • Surface-Level Metaphor: In some instances, the AI struggled to move beyond direct visual representations of nouns, limiting the exploration of deeper symbolic layers.
  • Stylistic Inconsistencies: While the cinematic kino preset provided a cohesive aesthetic, certain outputs deviated in style.
  • Literal Representations: In some cases, the visualizations leaned toward literal interpretations of light and connection, which constrained the potential for deeper abstraction or more complex symbolic layers.
  • Stylistic Homogeneity: The reliance on the cinematic kino preset provided aesthetic consistency but restricted the diversity of styles that could have enriched the collection.
  • Overly Literal Depictions: In some images, the representation of light and hearts leaned toward more literal interpretations, limiting the abstract depth that could have been achieved.
  • Limited Range of Abstraction: While the images embraced Abstract Expressionist principles, they sometimes lacked the complexity and dynamism characteristic of the style, resulting in simpler visual compositions.
  • Repetition of Motifs: Recurring symbols such as hearts and water, though effective, reduced the overall diversity of the collection. This repetition occasionally limited the exploration of deeper symbolic layers.
  • Surface-Level Representations: Some images adhered too closely to the literal meaning of the poetic nouns, such as a heart or water, instead of delving into more abstract or imaginative interpretations.
  • Stylistic Consistency: While the cinematic kino preset provided coherence, it constrained the variety of artistic styles. This occasionally resulted in outputs that lacked the dynamic variation characteristic of Abstract Expressionism.
  • The study[3] identified several challenges in using Leonardo AI, despite its benefits for developing teaching materials. Students faced technical issues, like difficulties in using the software and problems with it working well with their current systems. Additionally, a lack of proper training made it hard for students to use the technology effectively, suggesting that more support and better technical resources are necessary for successful implementation.
  • The authors[25] acknowledged some limitations in their studies on perceptions of art and artistic agency. They mention that using a single-item approach to measure these perceptions may not capture the complexity of the concepts and suggest that using more detailed scales could improve the research. Additionally, they believe that including insights from experts in the field, like artists or art historians, and using a mixed-methods approach that combines quantitative data with qualitative feedback could provide deeper understanding and valuable insights.
  • The work in [26] examines how artificial intelligence (AI) can be used in music to challenge musicians by acting as an adversarial partner, pushing them to explore new styles and techniques. This raises some potential open problems: for example, by reducing the artist's influence in the creative process, new artistic expressions unfamiliar to humans may be discovered. Additionally, it proposes creating interactive web-based art installations that can engage live audiences, emphasizing the continuous and tireless nature of machine-generated performances.
  • In[40], the authors assessed how well large language models (LLMs) perform using the Open LLM Leaderboard, which is a popular tool for comparing these models. However, they acknowledge that while this leaderboard is commonly used, the results it provides may not be perfect and could have some shortcomings. This means that the evaluations might not capture every aspect of the models' performance accurately.
  • Research in prompt engineering [41] focuses on improving how large language models (LLMs) perform by using techniques like transfer learning and fine-tuning. Models such as BERT have set the stage for these advancements, which aim to make AI responses more controllable and understandable. This research raises several open problems that need to be solved to shape how AI systems communicate with users and handle complicated questions effectively.
  • The authors [59] acknowledged that their evaluation of "novelty" in poetry generation is limited because they only consider the surface-level similarities between the generated poems and existing ones, rather than the deeper meanings or semantics, leaving this as an unresolved open problem that needs addressing.
  • It is important [96] to address the open problem of understanding how AI can create poetry and what this means for our understanding of poetry and human emotions. As AI becomes more integrated into creative fields[96], it is changing the job market by taking on more complex tasks, which raises questions about the nature of creativity and what it means to be human. While some fear that AI will replace creative jobs[96], evidence shows that it often enhances these roles, leading to new ways of working in the arts.
  • The research in [99] has some emerging open problems. First, although AI models can perform many different tasks[99], the authors could not test all of them because of limited computing power and budget, so it is unclear whether emotional influences would affect other tasks. Second, their method, called EmotionDecode, is based on a model of the human brain's reward system[99], but this is just one way to explain the findings, and more research is needed for a deeper understanding.
  • The study[127] has some limitations that should be kept in mind when looking at the results. First, it only analyzed four figures from the Figural Interpretation Quest (FIQ), which means it didn't cover all possible examples. Also[127], while the AI chatbot claims it hasn't seen any training data after September 2021, it's hard to confirm this, and there might be a chance it had some exposure to FIQ-related information, but this is unlikely to have greatly affected its performance. Lastly[127], the results varied depending on the different figures used in the study.
  • Between 2018 and 2021[147], the research utilized a human-action-recognition (HAR) algorithm and the OpenAI GPT-2 language model to create AI-enabled artworks. The HAR model was trained on a dataset containing various human actions, which has since been expanded[147], but the researchers chose not to retrain their model despite these updates. This raises a new, still unsolved open problem: whether using newer versions of these models might enhance the creativity and effectiveness of the artworks, particularly by incorporating live human actions instead of pre-recorded materials.
  • This study [149] examined how AI tools, like ChatGPT, can enhance Aesthetic Mathematical Experiences (AME) for teachers, but it leaves some open problems unresolved; for example, some of the claims made are speculative rather than definitive, meaning more research is needed to confirm the findings. Additionally, the collaboration between teachers and ChatGPT raises questions about the originality of the content created, as it may be difficult to distinguish between human creativity and AI-generated responses.
  • Some open problems have emerged from [164]. One issue [164] is that the model sometimes struggles to accurately create the requested context, which can happen if the training data did not include enough examples of that context. Other problems [164] include changes in how the subject appears due to the context, overfitting to the original images, and difficulties with less common subjects, which can lead to inaccuracies or "hallucinations" in the generated images.

5. Conclusion and Future Research Pathways

This study demonstrates the transformative potential of AI in Abstract Expressionist art, particularly in visualizing internal monologues in poetic form, such as Ismail A Mageed’s. Through the integration of advanced techniques and symbolic abstraction, Leonardo AI proved capable of translating complex poetic themes into compelling visual narratives.
The recurring use of symbols such as hearts, water, and light showcased AI's ability to imbue artworks with deeper meaning, reflecting the emotional and existential undertones of the poem. While challenges persist in achieving greater diversity and abstraction, the collaboration between human creativity and machine intelligence reveals exciting possibilities for future artistic endeavours.
Future research could explore enhancing AI's symbolic interpretation capabilities, integrating adaptive models for greater stylistic variation, and examining applications in art therapy and education to further bridge the gap between textual and visual creativity.

References

  1. Jie, P., Shan, X., & Chung, J. (2023). A Comparative Analysis Between <Leonardo.Ai> and <Meshy> as AI Texture Generation Tools. International Journal of Advanced Culture Technology, 11(4), 333-339.
  2. YILDIRIM, E. (2023). COMPARATIVE ANALYSIS OF LEONARDO AI, MIDJOURNEY, AND DALL-E: AI'S PERSPECTIVE ON FUTURE CITIES. URBANIZM: Journal of Urban Planning & Sustainable Development, (28).
  3. Nur, M. D. M., & Hartati, H. (2024, August). Utilization of Leonardo AI in Developing Teaching Materials for Islamic Religious Education Students: Case Study at FTIK UIN Datokarama Palu. In Proceeding of International Conference on Islamic and Interdisciplinary Studies (Vol. 3, pp. 302–306).
  4. Mezei, P. From Leonardo to the Next Rembrandt – The Need for AI-Pessimism in the Age of Algorithms. UFITA 2020, 84, 390–429. [Google Scholar] [CrossRef]
  5. Giannakos, M.; Azevedo, R.; Brusilovsky, P.; Cukurova, M.; Dimitriadis, Y.; Hernandez-Leo, D.; Järvelä, S.; Mavrikis, M.; Rienties, B. The promise and challenges of generative AI in education. Behav. Inf. Technol. 2024, 1–27. [Google Scholar] [CrossRef]
  6. Kurdi, G.; Leo, J.; Parsia, B.; Sattler, U.; Al-Emari, S. A Systematic Review of Automatic Question Generation for Educational Purposes. Int. J. Artif. Intell. Educ. 2019, 30, 121–204. [Google Scholar] [CrossRef]
  7. Ruiz-Rojas, L.I.; Acosta-Vargas, P.; De-Moreta-Llovet, J.; Gonzalez-Rodriguez, M. Empowering Education with Generative Artificial Intelligence Tools: Approach with an Instructional Design Matrix. Sustainability 2023, 15, 11524. [Google Scholar] [CrossRef]
  8. Hsu, Y. C. , & Ching, Y. H. (2023). Generative artificial intelligence in education, part one: The dynamic frontier. TechTrends, 67(4), 603-607.
  9. Tabuenca, B.; Uche-Soria, M.; Greller, W.; Hernández-Leo, D.; Balcells-Falgueras, P.; Gloor, P.; Garbajosa, J. Greening smart learning environments with Artificial Intelligence of Things. Internet Things 2023, 25. [Google Scholar] [CrossRef]
  10. Fabila, J. , Campello, V. M., Martín-Isla, C., Obungoloch, J., Leo, K., Ronald, A., & Lekadir, K. (2024). Democratizing AI in Africa: FL for Low-Resource Edge Devices. arXiv:2408.17216.
  11. Ben-Tal, O. , Harris, M. T., & Sturm, B. L. (2021). How music AI is useful: engagements with composers, performers and audiences. Leonardo, 54(5), 510-516.
  12. Leo, L.A.; Viani, G.; Schlossbauer, S.; Bertola, S.; Valotta, A.; Crosio, S.; Pasini, M.; Caretta, A. Mitral Regurgitation Evaluation in Modern Echocardiography: Bridging Standard Techniques and Advanced Tools for Enhanced Assessment. Echocardiography 2024, 42, e70052. [Google Scholar] [CrossRef]
  13. Zeilinger, M. The Politics of Visual Indeterminacy in Abstract AI Art. Leonardo 2023, 56, 76–80. [Google Scholar] [CrossRef]
  14. Tatar, K.; Ericson, P.; Cotton, K.; Del Prado, P.T.N.; Batlle-Roca, R.; Cabrero-Daniel, B.; Ljungblad, S.; Diapoulis, G.; Hussain, J. A Shift in Artistic Practices through Artificial Intelligence. Leonardo 2024, 57, 293–297. [Google Scholar] [CrossRef]
  15. Hutson, J.; Schnellmann, A. The Poetry of Prompts: The Collaborative Role of Generative Artificial Intelligence in the Creation of Poetry and the Anxiety of Machine Influence. Glob. J. Comput. Sci. Technol. 2023, 23, 1–14. [Google Scholar] [CrossRef]
  16. Lima, E. AI art and public literacy: the miseducation of Ai-Da the robot. AI Ethic- 2024, 4, 841–854. [Google Scholar] [CrossRef]
  17. Finkley, I. (2024). By AI: Authorship, Literature, & Large Language Models.
  18. Bajohr, H. Operative ekphrasis: the collapse of the text/image distinction in multimodal AI. Word Image 2024, 40, 77–90. [Google Scholar] [CrossRef]
  19. Fontanella, F.; Colace, F.; Molinara, M.; Di Freca, A.S.; Stanco, F. Pattern recognition and artificial intelligence techniques for cultural heritage. Pattern Recognit. Lett. 2020, 138, 23–29. [Google Scholar] [CrossRef]
  20. Giannakos, M.; Azevedo, R.; Brusilovsky, P.; Cukurova, M.; Dimitriadis, Y.; Hernandez-Leo, D.; Järvelä, S.; Mavrikis, M.; Rienties, B. The promise and challenges of generative AI in education. Behav. Inf. Technol. 2024, 1–27. [Google Scholar] [CrossRef]
  21. Raj, M. , Berg, J., & Seamans, R. (2023). Art-ificial intelligence: The effect of AI disclosure on evaluations of creative content. arXiv:2303.06217.
  22. Sawicki, P. , Grzes, M., Góes, L. F., Brown, D., Peeperkorn, M., Khatun, A., & Paraskevopoulou, S. (2023). On the power of special-purpose GPT models to create and evaluate new poetry in old styles.
  23. Tsidylo, I. M. , & Sena, C. E. (2023). Artificial intelligence as a methodological innovation in the training of future designers: Midjourney tools. Information Technologies and Learning Tools, 97(5), 203.
  24. Srinivasan, R. , & Uchino, K. (2021). The role of arts in shaping AI ethics. In AAAI Workshop on reframing diversity in AI: Representation, inclusion, and power.
  25. Mikalonytė, E.S.; Kneer, M. Can Artificial Intelligence Make Art?: Folk Intuitions as to whether AI-driven Robots Can Be Viewed as Artists and Produce Art. ACM Trans. Human-Robot Interact. 2022, 11, 1–19. [Google Scholar] [CrossRef]
  26. Pošćić, A. , & Kreković, G. (2020). On the human role in generative art: a case study of AI-driven live coding. Journal of Science and Technology of the Arts, 12(3), 45-62.
  27. Cetinic, E.; She, J. Understanding and Creating Art with AI: Review and Outlook. ACM Trans. Multimedia Comput. Commun. Appl. 2022, 18, 1–22. [Google Scholar] [CrossRef]
  28. Huo, C.; Choi, D. Exploring Emotional Representation and Interpretation in AI-Generated Art. Asia-pacific J. Converg. Res. Interchang. 2024, 10, 533–546. [Google Scholar] [CrossRef]
  29. Demmer, T.R.; Kühnapfel, C.; Fingerhut, J.; Pelowski, M. Does an emotional connection to art really require a human artist? Emotion and intentionality responses to AI- versus human-created art and impact on aesthetic experience. Comput. Hum. Behav. 2023, 148. [Google Scholar] [CrossRef]
  30. Shen, Y.; Yu, F. The Influence of Artificial Intelligence on Art Design in the Digital Age. Sci. Program. 2021, 2021, 1–10. [Google Scholar] [CrossRef]
  31. Grba, D. Art Notions in the Age of (Mis)anthropic AI. Arts 2024, 13, 137. [Google Scholar] [CrossRef]
  32. Xu, X. A fuzzy control algorithm based on artificial intelligence for the fusion of traditional Chinese painting and AI painting. Sci. Rep. 2024, 14, 1–17. [Google Scholar] [CrossRef]
  33. Lovato, J.; Zimmerman, J.W.; Smith, I.; Dodds, P.; Karson, J.L. Foregrounding Artist Opinions: A Survey Study on Transparency, Ownership, and Fairness in AI Generative Art. Proc. AAAI/ACM Conf. AI, Ethic- Soc. 2024, 7, 905–916. [Google Scholar] [CrossRef]
  34. Chen, L.; Xiao, S.; Chen, Y.; Sun, L.; Childs, P.R.; Han, J. An artificial intelligence approach for interpreting creative combinational designs. J. Eng. Des. 2024, 1–28. [Google Scholar] [CrossRef]
  35. Lehikoinen, K.; Tuittila, S. Arts-based approaches for futures workshops: Creating and interpreting artistic futures images. Futur. Foresight Sci. 2024, 6. [Google Scholar] [CrossRef]
  36. Onyejelem, T. E. , & Aondover, E. M. (2024). Digital Generative Multimedia Tool Theory (DGMTT): A Theoretical Postulation in the Era of Artificial Intelligence. Adv Mach Lear Art Inte, 5(2), 01-09.
  37. Millet, K.; Buehler, F.; Du, G.; Kokkoris, M.D. Defending humankind: Anthropocentric bias in the appreciation of AI art. Comput. Hum. Behav. 2023, 143. [Google Scholar] [CrossRef]
  38. Ichien, N.; Stamenković, D.; Holyoak, K.J. Large Language Model Displays Emergent Ability to Interpret Novel Literary Metaphors. Metaphor. Symb. 2024, 39, 296–309. [Google Scholar] [CrossRef]
  39. Dang, H. , Mecke, L., Lehmann, F., Goller, S., & Buschek, D. (2022). How to prompt? Opportunities and challenges of zero-and few-shot learning for human-AI interaction in creative applications of generative models. arXiv:2209.01390.
  40. Wang, M. , Liu, Y., Liang, X., Huang, Y., Wang, D., Yang, X.,... & Zhang, Y. (2024). Minstrel: Structural Prompt Generation with Multi-Agents Coordination for Non-AI Experts. arXiv:2409.13449.
  41. Kulkarni, N. D. , & Tupsakhare, P. (2024). Crafting Effective Prompts: Enhancing AI Performance through Structured Input Design. JOURNAL OF RECENT TRENDS IN COMPUTER SCIENCE AND ENGINEERING (JRTCSE), 12(5), 1-10.
  42. Park, D.; An, G.-T.; Kamyod, C.; Kim, C.G. A Study on Performance Improvement of Prompt Engineering for Generative AI with a Large Language Model. J. Web Eng. 2024, 22, 1187–1206. [Google Scholar] [CrossRef]
  43. Muktadir, G. M. (2023). A brief history of prompt: Leveraging language models. arXiv:2310.04438.
  44. Park, J.; Choo, S. Generative AI Prompt Engineering for Educators: Practical Strategies. J. Spéc. Educ. Technol. [CrossRef]
  45. Korzynski, P.; Mazurek, G.; Krzypkowska, P.; Kurasinski, A. Artificial intelligence prompt engineering as a new digital competence: Analysis of generative AI technologies such as ChatGPT. Entrep. Bus. Econ. Rev. 2023, 11, 25–37. [Google Scholar] [CrossRef]
  46. Garg, A.; Rajendran, R. The Impact of Structured Prompt-Driven Generative AI on Learning Data Analysis in Engineering Students. In Proceedings of the 16th International Conference on Computer Supported Education, France; pp. 270–277.
  47. Marvin, G. , Hellen, N., Jjingo, D., & Nakatumba-Nabende, J. (2023, June). Prompt engineering in large language models. In International conference on data intelligence and cognitive informatics (pp. 387-402). Singapore: Springer Nature Singapore.
  48. Wang, J.; Liu, Z.; Zhao, L.; Wu, Z.; Ma, C.; Yu, S.; Dai, H.; Yang, Q.; Liu, Y.; Zhang, S.; et al. Review of large vision models and visual prompt engineering. Meta-Radiology 2023, 1. [Google Scholar] [CrossRef]
  49. Giray, L. Prompt Engineering with ChatGPT: A Guide for Academic Writers. Ann. Biomed. Eng. 2023, 51, 2629–2633. [Google Scholar] [CrossRef] [PubMed]
  50. Denny, P., Leinonen, J., Prather, J., Luxton-Reilly, A., Amarouche, T., Becker, B. A., & Reeves, B. N. (2024, March). Prompt Problems: A new programming exercise for the generative AI era. In Proceedings of the 55th ACM Technical Symposium on Computer Science Education V. 1 (pp. 296-302). [Google Scholar]
  51. Wong, M. , Rios, T., Menzel, S., & Ong, Y. S. (2024). Generative AI-based Prompt Evolution Engineering Design Optimization With Vision-Language Model. arXiv:2406.09143.
  52. Noel, G. P. (2024). Evaluating AI-powered text-to-image generators for anatomical illustration: A comparative study. Anatomical Sciences Education, 17(5), 979-983.
  53. Dogoulis, P., Kordopatis-Zilos, G., Kompatsiaris, I., & Papadopoulos, S. (2023, June). Improving synthetically generated image detection in cross-concept settings. In Proceedings of the 2nd ACM International Workshop on Multimedia AI against Disinformation (pp. 28-35). [Google Scholar]
  54. Du, H.; Niyato, D.; Kang, J.; Xiong, Z.; Zhang, P.; Cui, S.; Shen, X.; Mao, S.; Han, Z.; Jamalipour, A.; et al. The Age of Generative AI and AI-Generated Everything. IEEE Netw. 2024, 38, 501–512. [Google Scholar] [CrossRef]
  55. Cao, Y. , Li, S., Liu, Y., Yan, Z., Dai, Y., Yu, P. S., & Sun, L. (2023). A comprehensive survey of ai-generated content (aigc): A history of generative ai from gan to chatgpt. arXiv:2303.04226.
  56. Ali, O.; Murray, P.A.; Momin, M.; Dwivedi, Y.K.; Malik, T. The effects of artificial intelligence applications in educational settings: Challenges and strategies. Technol. Forecast. Soc. Chang. 2023, 199. [Google Scholar] [CrossRef]
  57. Chen, S. Y. (2023). Generative AI, learning and new literacies. Journal of Educational Technology Development & Exchange, 16(2).
  58. Pocaluyko, J. (2022). Defining Desktop Films: From Spatial Interfaces to Algorithmic Cameras (Master's thesis).
  59. Wu, C. C. , Song, R., Sakai, T., Cheng, W. F., Xie, X., & Lin, S. D. (2019, September). Evaluating image-inspired poetry generation. In CCF international conference on natural language processing and chinese computing (pp. 539-551). Cham: Springer International Publishing.
  60. Köbis, N.; Mossink, L.D. Artificial intelligence versus Maya Angelou: Experimental evidence that people cannot differentiate AI-generated from human-written poetry. Comput. Hum. Behav. 2021, 114, 106553. [Google Scholar] [CrossRef]
  61. Liu, B., Fu, J., Kato, M. P., & Yoshikawa, M. (2018, October). Beyond narrative description: Generating poetry from images by multi-adversarial training. In Proceedings of the 26th ACM International Conference on Multimedia (pp. 783-791). [Google Scholar]
  62. Hutson, J.; Schnellmann, A. The Poetry of Prompts: The Collaborative Role of Generative Artificial Intelligence in the Creation of Poetry and the Anxiety of Machine Influence. Glob. J. Comput. Sci. Technol. 2023, 23, 1–14. [Google Scholar] [CrossRef]
  63. Lyu, Y.; Wang, X.; Lin, R.; Wu, J. Communication in Human–AI Co-Creation: Perceptual Analysis of Paintings Generated by Text-to-Image System. Appl. Sci. 2022, 12, 11312. [Google Scholar] [CrossRef]
  64. Abu Zaid, R.M. Poetry between Human Mindset and Generative Artificial Intelligence: Some Relevant Applications and Implications. Bull. Fac. Lang. Transl. 2024, 27, 303–332. [Google Scholar] [CrossRef]
  65. Veremchuk, E. (2024). Can AI be a Poet? Comparative Analysis of Human-authored and AI-generated Poetry. Acta Neophilologica, 57(2), 112-125.
  66. Raj, M. , Berg, J., & Seamans, R. (2023). Art-ificial intelligence: The effect of AI disclosure on evaluations of creative content. arXiv:2303.06217.
  67. ARTIFICIAL INTELLIGENCE AND CREATIVITY: THE ROLE OF ARTIFICIAL INTELLIGENCE IN THE GENERATION OF MUSIC, ART AND LITERATURE. Наука і техніка сьогодні [Science and Technology Today], (9(23)).
  68. Elzohbi, M. , & Zhao, R. (2023). Creative data generation: A review focusing on text and poetry. arXiv:2305.08493.
  69. Oppenlaender, J. (2022, November). The creativity of text-to-image generation. In Proceedings of the 25th international academic mindtrek conference (pp. 192-202).
  70. Jiang, J. , Ling, Y., Li, B., Li, P., Piao, J., & Zhang, Y. (2024). Poetry2Image: An Iterative Correction Framework for Images Generated from Chinese Classical Poetry. arXiv:2407.06196.
  71. Chen, J.; Huang, K.; Zhu, X.; Qiu, X.; Wang, H.; Qin, X. Poetry4painting: Diversified poetry generation for large-size ancient paintings based on data augmentation. Comput. Graph. 2023, 116, 206–215. [Google Scholar] [CrossRef]
  72. Azuaje, G.; Liew, K.; Buening, R.; She, W.J.; Siriaraya, P.; Wakamiya, S.; Aramaki, E. Exploring the use of AI text-to-image generation to downregulate negative emotions in an expressive writing application. R. Soc. Open Sci. 2023, 10, 220238. [Google Scholar] [CrossRef] [PubMed]
  73. Gao, R.; Lin, Y.; Zhao, N.; Cai, Z.G. Machine translation of Chinese classical poetry: a comparison among ChatGPT, Google Translate, and DeepL Translator. Humanit. Soc. Sci. Commun. 2024, 11, 1–10. [Google Scholar] [CrossRef]
  74. Hongbo, L. , & Wenkai, F. (2024). AI Turn in Ethical Literary Criticism. Interdisciplinary Studies of Literature, 8(3).
  75. Beyan, E. V. P. , & Rossy, A. G. C. (2023). A review of AI image generator: influences, challenges, and future prospects for architectural field. Journal of Artificial Intelligence in Architecture, 2(1), 53-65.
  76. Ting, T.T.; Ling, L.Y.; Azam, A.I.B.A.; Palaniappan, R. Artificial intelligence art: Attitudes and perceptions toward human versus artificial intelligence artworks. Transport, Ecology - Sustainable Development: EKOVarna 2022, Bulgaria.
  77. Di Dio, C.; Ardizzi, M.; Schieppati, S.V.; Massaro, D.; Gilli, G.; Gallese, V.; Marchetti, A. Art made by artificial intelligence: The effect of authorship on aesthetic judgments. Psychol. Aesthetics, Creativity, Arts. [CrossRef]
  78. Lee, Y. K. , Park, Y. H., & Hahn, S. (2023). A portrait of emotion: Empowering self-expression through AI-generated art. arXiv:2304.13324.
  79. Yang, J.; Zhang, H. Development And Challenges of Generative Artificial Intelligence in Education and Art. Highlights Sci. Eng. Technol. 2024, 85, 1334–1347. [Google Scholar] [CrossRef]
  80. Wang, Y. , Pan, Y., Yan, M., Su, Z., & Luan, T. H. (2023). A survey on ChatGPT: AI-generated contents, challenges, and solutions. IEEE Open Journal of the Computer Society.
  81. Agarwal, R. , & Kann, K. (2020). Acrostic poem generation. arXiv:2010.02239.
  82. Goloujeh, A.M.; Sullivan, A.; Magerko, B. Is It AI or Is It Me? Understanding Users’ Prompt Journey with Text-to-Image Generative AI Tools. CHI '24: CHI Conference on Human Factors in Computing Systems, United States; pp. 1–13.
  83. Erskine, T. AI and the future of IR: Disentangling flesh-and-blood, institutional, and synthetic moral agency in world politics. Rev. Int. Stud. 2024, 50, 534–559. [Google Scholar] [CrossRef]
  84. Tsao, J.; Nogues, C. Beyond the author: Artificial intelligence, creative writing and intellectual emancipation. Poetics 2024, 102. [Google Scholar] [CrossRef]
  85. Messer, U. Co-creating art with generative artificial intelligence: Implications for artworks and artists. Comput. Hum. Behav. Artif. Humans 2024, 2. [Google Scholar] [CrossRef]
  86. Jiang, H. H., Brown, L., Cheng, J., Khan, M., Gupta, A., Workman, D., ... & Gebru, T. (2023, August). AI Art and its Impact on Artists. In Proceedings of the 2023 AAAI/ACM Conference on AI, Ethics, and Society (pp. 363-374). [Google Scholar]
  87. Anscomb, C. AI: artistic collaborator? AI Soc. 2024, 1–11. [Google Scholar] [CrossRef]
  88. Hong, J. W. , & Curran, N. M. (2019). Artificial intelligence, artists, and art: attitudes toward artwork produced by humans vs. artificial intelligence. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 15(2s), 1-16.
  89. Lovato, J.; Zimmerman, J.W.; Smith, I.; Dodds, P.; Karson, J.L. Foregrounding Artist Opinions: A Survey Study on Transparency, Ownership, and Fairness in AI Generative Art. Proc. AAAI/ACM Conf. AI, Ethic- Soc. 2024, 7, 905–916. [Google Scholar] [CrossRef]
  90. Li, X. (2024, December). Breaking the shackles-the transformation of artistic creation under the background of AI images. In 2024 6th International Conference on Literature, Art and Human Development (ICLAHD 2024); Atlantis Press; pp. 295–301.
  91. Mikalonytė, E.S.; Kneer, M. Can Artificial Intelligence Make Art?: Folk Intuitions as to whether AI-driven Robots Can Be Viewed as Artists and Produce Art. ACM Trans. Human-Robot Interact. 2022, 11, 1–19. [Google Scholar] [CrossRef]
  92. Arenal, A. , Armuña, C., Aguado Terrón, J. M., Ramos, S., & Feijóo, C. (2025). AI Challenges in the Era of Music Streaming: an analysis from the perspective of creative artists and performers.
  93. Işık, V. EXPLORING ARTISTIC FRONTIERS IN THE ERA OF ARTIFICIAL INTELLIGENCE. 14. [CrossRef]
  94. McCarthy, C. (2023). AI Art and the Artistic Revolution.
  95. Yadav, M. , Kumar, M., Sahoo, A., & Rathnasiri, M. S. H. (2025). AI and the Evolution of Artistic Expression: Impacts on Society and Culture. In Transforming Cinema with Artificial Intelligence (pp. 15-36). IGI Global Scientific Publishing.
  96. Hutson, J.; Schnellmann, A. The Poetry of Prompts: The Collaborative Role of Generative Artificial Intelligence in the Creation of Poetry and the Anxiety of Machine Influence. Glob. J. Comput. Sci. Technol. 2023, 23, 1–14. [Google Scholar] [CrossRef]
  97. Rahmeh, H. Digital Verses Versus Inked Poetry: Exploring Readers’ Response to AI-Generated and Human-Authored Sonnets. Sch. Int. J. Linguistics Lit. 2023, 6, 372–382. [Google Scholar] [CrossRef]
  98. Shalevska, E. The Digital Laureate: Examining AI-Generated Poetry. RATE Issues 2024, 31. [Google Scholar] [CrossRef]
  99. Raj, M. , Berg, J., & Seamans, R. (2023). Art-ificial intelligence: The effect of AI disclosure on evaluations of creative content. arXiv:2303.06217.
  100. Li, C. , Wang, J., Zhang, Y., Zhu, K., Wang, X., Hou, W.,... & Xie, X. (2023). The good, the bad, and why: Unveiling emotions in generative ai. arXiv:2312.11111.
  101. Bechard, D. (2024). The Pen and the Processor: A Turing-like Test to Gauge GPT-Generated Poetry (Doctoral dissertation).
  102. Mukminin, M. S. , Putra, L. B. ( 1(02), 21–31.
  103. Atherton, D. D. (2024). Reading Medieval Literary Time, Emotion, and Intimacy: On Affective Temporality in an Era of Asynchronicity and Artificial Intelligence (Doctoral dissertation, The George Washington University).
  104. Karaban, V. , & Karaban, A. (2024, January). AI-translated poetry: Ivan Franko’s poems in GPT-3.5-driven machine and human-produced translations. In Forum for Linguistic Studies (Vol. 6, No. 1).
  105. Li, C. , Wang, J., Zhang, Y., Zhu, K., Hou, W., Lian, J.,... & Xie, X. (2023). Large language models understand and can be enhanced by emotional stimuli. arXiv:2307.11760.
  106. Giannini, T.; Bowen, J.P. Generative Art and Computational Imagination: Integrating poetry and art. Proceedings of EVA London 2023; pp. 211–219.
  107. Stiles, S. Ars Autopoetica: On Authorial Intelligence, Generative Literature, and the Future of Language. In Choreomata; Chapman and Hall/CRC, 2023; pp. 357–378. [Google Scholar]
  108. Grassini, S.; Koivisto, M. Understanding how personality traits, experiences, and attitudes shape negative bias toward AI-generated artworks. Sci. Rep. 2024, 14, 1–15. [Google Scholar] [CrossRef]
  109. Henrickson, L.; Meroño-Peñuela, A. Prompting meaning: a hermeneutic approach to optimising prompt engineering with ChatGPT. AI Soc. 2023, 39, 2647–2665. [Google Scholar] [CrossRef]
  110. Chesher, C. , & Albarrán-Torres, C. (2023). The emergence of autolography: the ‘magical’invocation of images from text through AI. Media International Australia, 189(1), 57-73.
  111. Amer, S. K. (2023). AI Imagery and the Overton Window. arXiv:2306.00080.
  112. Ervik, A. (2023). Generative AI and the collective imaginary: The technology-guided social imagination in AI-imagenesis.
  113. Carter, R. A. (2023). Machine visions: Mapping depictions of machine vision through critical image synthesis. Open Library of Humanities, 9(2), 10077.
  114. Rettberg, S. , Memmott, T., Rettberg, J. W., Nelson, J., & Lichty, P. (2023). AIwriting: Relations Between Image Generation and Digital Writing. arXiv:2305.10834.
  115. Zeilinger, M. The Politics of Visual Indeterminacy in Abstract AI Art. Leonardo 2023, 56, 76–80. [Google Scholar] [CrossRef]
  116. “Critical peak pricing-San Diego Gas & Electric,” https://www. sdge.com/businesses/savings-center/energy-management-programs/ demand-response/critical-peak-pricing, accessed: 2022-03-04.
  117. Notaro, A. (2020). State-of-the-art: AI through the (artificial) artist’s eye. EVA London 2020: Electronic Visualisation and the Arts, 322-328.
  118. Kalpokas, I. Work of art in the Age of Its AI Reproduction. Philos. Soc. Crit. 2023. [Google Scholar] [CrossRef]
  119. Roose, K. (2022). An AI-Generated Picture Won an Art Prize. Artists Arenʼt Happy.
  120. Park, S. The work of art in the age of generative AI: aura, liberation, and democratization. AI Soc. 2024, 1–10. [Google Scholar] [CrossRef]
  121. Rabinovich, M.; Foley, C. The work of art in the age of AI reproducibility. AI Soc. 2024, 1–3. [Google Scholar] [CrossRef]
  122. Gill, S. P. (2023). AI & society, knowledge, culture and communication. AI & SOCIETY, 38(5), 1809-1811.
  123. Gill, S.P. AI & society, knowledge, culture and communication. AI & SOCIETY 2023, 38, 1809–1811. [Google Scholar]
  124. Berryman, J. Creativity and Style in GAN and AI Art: Some Art-historical Reflections. Philos. Technol. 2024, 37, 1–17. [Google Scholar] [CrossRef]
  125. Oliveira, A.M. Future Imaginings in Art and Artificial Intelligence. J. Aesthet. Phenomenol. 2022, 9, 209–225. [Google Scholar] [CrossRef]
  126. Grba, D. Deep Else: A Critical Framework for AI Art. Digital 2022, 2, 1–32. [Google Scholar] [CrossRef]
  127. Grassini, S.; Koivisto, M. Artificial Creativity? Evaluating AI Against Human Performance in Creative Interpretation of Visual Stimuli. Int. J. Human–Computer Interact. 12. [CrossRef]
  128. Chen, L.; Xiao, S.; Chen, Y.; Sun, L.; Childs, P.R.; Han, J. An artificial intelligence approach for interpreting creative combinational designs. J. Eng. Des. 2024, 1–28. [Google Scholar] [CrossRef]
  129. Grilli, L.; Pedota, M. Creativity and artificial intelligence: A multilevel perspective. Creativity Innov. Manag. 2024, 33, 234–247. [Google Scholar] [CrossRef]
  130. Al-Zahrani, A. M. (2024). Balancing act: Exploring the interplay between human judgment and artificial intelligence in problem-solving, creativity, and decision-making. Igmin Research, 2(3), 145-158.
  131. Deshpande, M.; Park, J.; Pait, S.; Magerko, B. Perceptions of Interaction Dynamics in Co-Creative AI: A Comparative Study of Interaction Modalities in Drawcto. C&C '24: Creativity and Cognition, United States; pp. 102–116.
  132. Chandrasekera, T.; Hosseini, Z.; Perera, U. Can artificial intelligence support creativity in early design processes? Int. J. Arch. Comput. 2024. [Google Scholar] [CrossRef]
  133. Nadeem, S. (2023). The Harmonious Dance of AI and Imagination: Crafting a New Era of Creativity. Journal of AI-Authored Articles and Imaginary Creations, 1(1), 1-4.
  134. De Miranda, L. (2020). Artificial intelligence and philosophical creativity: From analytics to crealectics. Human Affairs, 30(4), 597-607.
  135. Gegra, A. , & Maccagnola, F. (2021). Off to the future of creativity and innovation: what AI can-and cannot-do.
  136. Mukherjee, A. , & Chang, H. H. (2024). AI Knowledge and Reasoning: Emulating Expert Creativity in Scientific Research. arXiv:2404.04436.
  137. Ady, N. M. , & Rice, F. (2023). Interdisciplinary methods in computational creativity: How human variables shape human-inspired AI research. arXiv:2306.17070.
  138. Chen, Q., Ho, Y. J. I., Sun, P., & Wang, D. (2024, March 5). The Philosopher’s Stone for Science – The Catalyst Change of AI for Scientific Creativity.
  139. Moruzzi, C. (2021). On the Relevance of Understanding for Creativity. Philosophy after AI.
  140. Vinchon, F. , Gironnay, V., & Lubart, T. (2023). The Creative AI-Land: Exploring new forms of creativity.
  141. Monte-Serrat, D. M. , & Cattani, C. (2023). Towards ethical AI: Mathematics influences human behavior. Journal of Humanistic Mathematics, 13(2), 469-493.
  142. Zhenyuan, Y. , Zhengliang, L., Jing, Z., Cen, L., Jiaxin, T., Tianyang, Z.,... & Tianming, L. (2024). Analyzing nobel prize literature with large language models. arXiv:2410.18142.
  143. Benford, S., Hazzard, A., Vear, C., Webb, H., Chamberlain, A., Greenhalgh, C., ... & Marshall, J. (2023, July). Five Provocations for a More Creative TAS. In Proceedings of the First International Symposium on Trustworthy Autonomous Systems (pp. 1-10). [Google Scholar]
  144. Natale, S.; Henrickson, L. The Lovelace effect: Perceptions of creativity in machines. New Media Soc. 2022, 26, 1909–1926. [Google Scholar] [CrossRef]
  145. Qamar, M. T. , Yasmeen, J., Pathak, S. K., Sohail, S. S., Madsen, D. Ø., & Rangarajan, M. (2024). Big claims, low outcomes: fact checking ChatGPT’s efficacy in handling linguistic creativity and ambiguity. Cogent Arts & Humanities, 11(1), 2353984.
  146. Schneider, J. , Sinem, K., & Stockhammer, D. (2024). Empowering Clients: Transformation of Design Processes Due to Generative AI. arXiv:2411.15061.
  147. Gilchrist, B. Poetics of Artificial Intelligence in Art Practice: (Mis)apprehended Bodies Remixed as Language. Doctoral dissertation, University of Sunderland, 2022.
  148. Strineholm, P. Exploring Human–Robot Interaction through Explainable AI Poetry Generation. 2021.
  149. da Silva, R.S.R.; de Carvalho, A.C.B. The Creation of Mathematical Poems and Song Lyrics by (Pre-service) Teachers-with-AI as an Aesthetic Experience. Journal of Digital Life and Learning 2024, 4(1), 43–63.
  150. Mikalonytė, E.S.; Kneer, M. Can Artificial Intelligence Make Art? Folk Intuitions as to Whether AI-Driven Robots Can Be Viewed as Artists and Produce Art. ACM Trans. Human-Robot Interact. 2022, 11, 1–19.
  151. Guo, A.; Sathyanarayanan, S.; Wang, L.; Heer, J.; Zhang, A. From Pen to Prompt: How Creative Writers Integrate AI into Their Writing Practice. arXiv:2411.03137, 2024.
  152. Bajohr, H. Algorithmic Empathy: Toward a Critique of Aesthetic AI. Configurations 2022, 30, 203–231.
  153. Arathdar, D. Literature, Narrativity and Composition in the Age of Artificial Intelligence. Trans- 2021, 27.
  154. Bajohr, H. Operative Ekphrasis: The Collapse of the Text/Image Distinction in Multimodal AI. Word & Image 2024, 40, 77–90.
  155. Edmond, C. Poetics of the Machine: Machine Writing and the AI Literature Frontier. Doctoral dissertation, Macquarie University, 2019.
  156. Chen, H.-C.; Chen, Z. Using ChatGPT and Midjourney to Generate Chinese Landscape Painting of Tang Poem ‘The Difficult Road to Shu’. Int. J. Soc. Sci. Artist. Innov. 2023, 3, 1–10.
  157. Dainys, A. Human Creativity Versus Machine Creativity: Will Humans Be Surpassed by AI? 2024.
  158. Barranco, M.C. Artistic Beauty in the Face of Artificial Intelligence Art. In Social and Technological Aspects of Art; 2022; p. 93.
  159. Zhang, R.; Eger, S. LLM-Based Multi-Agent Poetry Generation in Non-Cooperative Environments. arXiv:2409.03659, 2024.
  160. Wiratno, T.A.; Callula, B. Transformation of Beauty in Digital Fine Arts Aesthetics: An Artpreneur Perspective. Aptisi Trans. Technopreneurship (ATT) 2024, 6, 231–241.
  161. Chandrashekar, K.; Jangampet, V.D. Enhancing Generative AI Precision: Adaptive Prompt Reinforcement Learning for High-Fidelity Applications. International Journal of Computer Engineering and Technology (IJCET) 2021, 12(1), 81–90.
  162. Dang, H.; Mecke, L.; Lehmann, F.; Goller, S.; Buschek, D. How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models. arXiv:2209.01390, 2022.
  163. Oppenlaender, J.; Linder, R.; Silvennoinen, J. Prompting AI Art: An Investigation into the Creative Skill of Prompt Engineering. Int. J. Human–Computer Interact. 2023.
  164. Ruiz, N.; Li, Y.; Jampani, V.; Pritch, Y.; Rubinstein, M.; Aberman, K. DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023; pp. 22500–22510.
  165. Kharitonov, E.; Vincent, D.; Borsos, Z.; Marinier, R.; Girgin, S.; Pietquin, O.; Sharifi, M.; Tagliasacchi, M.; Zeghidour, N. Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision. Trans. Assoc. Comput. Linguistics 2023, 11, 1703–1718.
  166. Li, D.; Li, J.; Hoi, S. BLIP-Diffusion: Pre-Trained Subject Representation for Controllable Text-to-Image Generation and Editing. Advances in Neural Information Processing Systems 2024, 36.
  167. Ruiz, N.; Li, Y.; Jampani, V.; Wei, W.; Hou, T.; Pritch, Y.; …; Aberman, K. HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024; pp. 6527–6536.
  168. Po, R.; Yifan, W.; Golyanik, V.; Aberman, K.; Barron, J.T.; Bermano, A.; Chan, E.; Dekel, T.; Holynski, A.; Kanazawa, A.; et al. State of the Art on Diffusion Models for Visual Computing. Comput. Graph. Forum 2024, 43.
  169. Ma, J.; Liang, J.; Chen, C.; Lu, H. Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-Time Fine-Tuning. In Proceedings of SIGGRAPH '24, United States, 2024; pp. 1–12.
  170. Zhang, Y.; Tzun, T.T.; Hern, L.W.; Kawaguchi, K. Enhancing Semantic Fidelity in Text-to-Image Synthesis: Attention Regulation in Diffusion Models. In European Conference on Computer Vision; Springer: Cham, 2025; pp. 70–86.
  171. Fan, F.; Luo, C.; Gao, W.; Zhan, J. AIGCBench: Comprehensive Evaluation of Image-to-Video Content Generated by AI. BenchCouncil Trans. Benchmarks, Stand. Evaluations 2024, 3.
  172. White, J.; Hays, S.; Fu, Q.; Spencer-Smith, J.; Schmidt, D.C. ChatGPT Prompt Patterns for Improving Code Quality, Refactoring, Requirements Elicitation, and Software Design. In Generative AI for Effective Software Development; Springer Nature Switzerland: Cham, 2024; pp. 71–108.
  173. Lee, U.; Jung, H.; Jeon, Y.; Sohn, Y.; Hwang, W.; Moon, J.; Kim, H. Few-Shot Is Enough: Exploring ChatGPT Prompt Engineering Method for Automatic Question Generation in English Education. Educ. Inf. Technol. 2023, 29, 11483–11515.
  174. Megahed, F.M.; Chen, Y.J.; Ferris, J.A.; Knoth, S.; Jones-Farmer, L.A. How Generative AI Models Such as ChatGPT Can Be (Mis)Used in SPC Practice, Education, and Research? An Exploratory Study. Quality Engineering 2024, 36(2), 287–315.
  175. Ma, Z.; Jia, G.; Qi, B.; Zhou, B. Safe-SD: Safe and Traceable Stable Diffusion with Text Prompt Trigger for Invisible Generative Watermarking. In Proceedings of MM '24: The 32nd ACM International Conference on Multimedia, Australia, 2024; pp. 7113–7122.
Figures 1–45: placeholder captions only in this version (image files Preprints 145172 g001–g045).
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.