AI-Driven Cinematic Environment Generation, Procedural Worlds, and Film Production Workflows

Riham Hilal; Mazdak Zamani

doi:10.20944/preprints202606.1422.v1

Submitted: 12 June 2026

Posted: 18 June 2026

Abstract

This manuscript examines the transformative role of artificial intelligence (AI) in cinematic environment generation and film production workflows, with a particular focus on pre-production and visual development processes. Traditionally, production design has relied on manual techniques such as script analysis, sketching, storyboarding, and physical or digital modeling to translate narrative concepts into spatial environments. However, recent advances in machine learning, generative models, and procedural modeling have introduced new paradigms that augment creative workflows. The study explores how AI-driven tools support key stages of filmmaking, including automated script interpretation, generative concept art creation, procedural environment modeling, terrain synthesis, and asset population. Through case studies such as Dune: Part Two and Avatar: The Way of Water, the manuscript highlights the integration of AI within visual effects pipelines, demonstrating its capacity to automate labor-intensive tasks such as rotoscoping, segmentation, and performance capture while preserving artistic control. Furthermore, the paper analyzes the balance between manual modeling and AI-assisted generation, emphasizing the emergence of hybrid workflows that combine computational efficiency with human creativity. Challenges related to visual coherence, narrative consistency, and artistic authorship are also discussed, alongside future directions involving interactive generative systems and real-time virtual production. The findings suggest that AI functions as a collaborative tool that enhances design exploration, accelerates production, and expands the possibilities of cinematic world-building while maintaining the essential role of the production designer in shaping visual storytelling.

Keywords: artificial intelligence; cinematic environments; procedural modeling; generative AI; visual effects; virtual production

Subject: Arts and Humanities - Film, Radio and Television

1. Introduction

Pre-production represents the conceptual and organizational phase of filmmaking in which creative decisions are translated into visual and technical plans. Traditionally, this stage has relied heavily on the expertise of production designers, art directors, illustrators, and concept artists who collaboratively translate the narrative into spatial environments. Through storyboards, sketches, mood boards, and scale models, the visual language of a film begins to emerge long before cameras are deployed, forming the foundation of cinematic spatial storytelling and visual narrative construction [1,2].

In recent years, artificial intelligence (AI) technologies have begun to influence this early stage of production. Machine learning algorithms, generative design systems, and data-driven visualization tools are now capable of assisting designers in developing visual concepts, exploring stylistic variations, and generating rapid iterations of cinematic environments. Advances in generative models and computational creativity have demonstrated how AI systems can support visual ideation processes in creative industries such as digital art, architecture, and film production [3]. Rather than replacing human creativity, these tools operate as collaborative systems that expand the designer’s capacity for exploration and experimentation.

Figure 1. AI-generated cinematic environment created using generative image models. Generative AI tools such as Midjourney and diffusion-based systems allow filmmakers and production designers to rapidly explore visual concepts and stylistic variations during the early stages of film pre-production. These tools can generate atmospheric urban environments and architectural compositions that support visual ideation and world-building. Photo source: Courtesy of OpenArt.

The integration of artificial intelligence into pre-production workflows introduces a new paradigm in which computational systems support tasks traditionally carried out manually by designers and artists. These tasks include automated script analysis, stylistic exploration, environment concept generation, and early-stage scene visualization. Recent research in computational creativity and generative design demonstrates that machine learning models can analyze large visual datasets and learn stylistic patterns that assist designers in producing alternative visual concepts and aesthetic variations. Such systems enable rapid exploration of design possibilities while allowing human creators to retain control over artistic direction and narrative interpretation [6]. Consequently, the contemporary art department increasingly operates within a hybrid workflow that combines human creative expertise with algorithmic assistance (see Figure 2) [3].

As illustrated in Figure 3, AI-assisted cinematic environment generation in modern film production can be described as a conceptual pipeline. Artificial intelligence tools support multiple stages of the production design process, including script analysis, generative concept art creation, procedural environment modeling, automated asset population, and virtual production workflows. These technologies enable filmmakers to rapidly prototype environments and integrate digital worlds into the visual effects pipeline.

Recent developments in artificial intelligence are not limited to concept generation and design exploration but also extend to visual effects production pipelines. Contemporary filmmaking increasingly integrates machine learning tools to automate complex visual effects tasks and accelerate production workflows.

1.1. AI-Assisted Visual Effects and Pre-Production Workflows: The Case of Dune: Part Two

Recent large-scale cinematic productions demonstrate how artificial intelligence is increasingly integrated into film production pipelines, particularly in visual effects development and environment creation. A prominent example is Dune: Part Two, where machine learning techniques were employed to automate complex visual effects tasks and accelerate production workflows. Contemporary visual effects pipelines involve processing vast volumes of image sequences, which traditionally required manual frame-by-frame operations such as rotoscoping, object segmentation, and compositing preparation. These tasks are highly labor-intensive and time-consuming when performed manually by visual effects artists [4].

Machine learning technologies now provide automated solutions for these processes by training neural networks to recognize visual patterns across image sequences. In rotoscoping workflows, AI systems are trained using pairs of input images and corresponding ground-truth masks that define the separation between foreground elements and background environments. Once trained, these neural networks can automatically generate segmentation masks for thousands of frames, enabling rapid isolation of characters, props, and environmental elements from the background. This automation significantly reduces manual processing time while preserving the precision required for high-quality cinematic compositing [4,5].

As illustrated in Figure 4, machine learning pipelines used in visual effects workflows typically consist of three primary components: input frames extracted from film footage, ground-truth annotations created by artists, and neural network training processes that learn to map input images to accurate segmentation outputs. Through iterative training, the neural model gradually improves its ability to distinguish foreground objects from background environments. These AI-driven rotoscoping systems allow compositing artists to process large datasets of film footage more efficiently, enabling faster integration of digital environments and visual effects elements [4].
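The three components described above (input frames, ground-truth masks, and iterative training) can be illustrated with a deliberately tiny sketch. It is not the pipeline used on any production: the synthetic frames, the `make_frame` helper, the mid-grey `CENTER` constant, and the single-neuron (logistic regression) model standing in for a neural network are all assumptions chosen so the example runs with the Python standard library alone.

```python
import math
import random

CENTER = 0.5  # assumed mid-grey value used to centre pixel intensities


def make_frame(rng, size=8, patch=3):
    """Hypothetical synthetic frame: a bright square (foreground) on a dark
    background, returned with its ground-truth mask (1 = foreground)."""
    x0 = rng.randrange(0, size - patch + 1)
    y0 = rng.randrange(0, size - patch + 1)
    frame, mask = [], []
    for y in range(size):
        for x in range(size):
            fg = x0 <= x < x0 + patch and y0 <= y < y0 + patch
            base = 0.8 if fg else 0.2
            frame.append(base + rng.uniform(-0.1, 0.1))
            mask.append(1 if fg else 0)
    return frame, mask


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(pairs, epochs=300, lr=2.0):
    """Fit p(foreground | pixel) = sigmoid(w * (pixel - CENTER) + b)
    by batch gradient descent on the cross-entropy loss."""
    w, b = 0.0, 0.0
    n = sum(len(frame) for frame, _ in pairs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for frame, mask in pairs:
            for value, target in zip(frame, mask):
                error = sigmoid(w * (value - CENTER) + b) - target
                grad_w += error * (value - CENTER)
                grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_mask(frame, w, b):
    """Generate a binary segmentation mask for an unseen frame."""
    return [1 if sigmoid(w * (v - CENTER) + b) >= 0.5 else 0 for v in frame]


rng = random.Random(0)
training_pairs = [make_frame(rng) for _ in range(20)]   # frames + artist masks
w, b = train(training_pairs)                            # iterative training
test_frame, test_mask = make_frame(rng)                 # held-out frame
predicted = predict_mask(test_frame, w, b)
accuracy = sum(p == t for p, t in zip(predicted, test_mask)) / len(test_mask)
```

A production rotoscoping model would replace the single neuron with a deep convolutional network and the synthetic squares with artist-annotated footage, but the loop has the same shape: predict a mask, compare it with the ground truth, and adjust the parameters.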

In the production of Dune: Part Two, such machine learning tools were integrated into the broader visual effects pipeline to handle complex compositing operations across thousands of frames. These systems enabled visual effects teams to automate technical tasks while preserving the artistic control required for cinematic storytelling. By reducing the time required for repetitive manual processes, machine learning technologies allow artists and designers to focus more on creative decisions related to lighting, atmosphere, and environmental design. Consequently, artificial intelligence increasingly functions as a collaborative tool that enhances the efficiency of visual effects production while supporting the creative vision of filmmakers [4,5].

As a result, the workflow of the art department is gradually shifting toward a hybrid model where human creativity and algorithmic processes coexist.

This chapter examines how AI technologies are transforming the early design stages of filmmaking. It explores the role of generative models in visual ideation, the use of machine learning for script interpretation, and the emergence of AI-assisted tools for concept art generation. By analyzing these developments, the chapter aims to demonstrate how artificial intelligence enhances creative workflows while preserving the essential artistic role of production designers.

1.2. AI-Driven Performance Capture and Virtual Production: The Case of Avatar: The Way of Water

Another significant example of the integration of artificial intelligence and advanced computational technologies in contemporary filmmaking is Avatar: The Way of Water. Directed by James Cameron, the film represents one of the most technologically complex cinematic productions, combining performance capture, virtual cinematography, and advanced visual effects pipelines to construct highly immersive digital environments. The production involved thousands of visual effects shots and extensive digital character animation, demonstrating the scale at which modern computational tools are integrated into film production workflows [6].

Artificial intelligence and machine learning techniques were employed within the visual effects pipeline to enhance character animation, facial performance capture, and environmental simulation. In particular, deep learning methods were used to support animation systems that model facial muscle movements and expressions, enabling animators to capture subtle emotional performances from actors and translate them into realistic digital characters (Figure 5). These neural-network-based systems help manage the complex relationship between facial structures and expressions, allowing animators to refine performances more efficiently while maintaining artistic control over character behavior [7].

In addition to character animation, AI-driven techniques contribute to the simulation of complex natural phenomena such as water dynamics, vegetation movement, and environmental lighting. The underwater environments that dominate the narrative of Avatar: The Way of Water required sophisticated simulation systems capable of modeling fluid dynamics, light scattering, and interactions between digital characters and water surfaces. These computational tools enable filmmakers to generate highly realistic natural environments while preserving the flexibility required for artistic experimentation and cinematic staging [6,8].

The production of Avatar: The Way of Water therefore illustrates how artificial intelligence and advanced computational technologies are transforming contemporary filmmaking workflows. By integrating machine learning algorithms, simulation systems, and virtual production tools, filmmakers can construct detailed cinematic worlds while maintaining real-time interaction between directors, actors, and digital environments. This approach reflects the broader transformation of modern film production toward hybrid pipelines in which human creativity is enhanced by intelligent computational systems [7,8].

2. Traditional Pre-Production Design Workflow

Before the introduction of AI-driven tools, the visual development process in filmmaking followed a structured sequence within the art department. The production designer typically collaborates closely with the director to establish the visual tone and stylistic identity of the film. This process begins with script analysis, where key narrative elements such as locations, historical periods, atmosphere, and symbolic motifs are identified in order to guide the spatial and visual interpretation of the story. Production design therefore functions as a critical narrative component, translating written scripts into cinematic environments that communicate mood, symbolism, and thematic meaning within the visual language of film [1,9].

Following script breakdown, concept artists develop sketches and digital illustrations that explore potential visual directions for the film. These concept images communicate architectural forms, lighting conditions, textures, and spatial composition that define the aesthetic identity of cinematic spaces. During this stage, visual references are frequently collected through mood boards and curated imagery to establish stylistic influences derived from architecture, photography, painting, and historical design traditions. Such visual exploration enables the creative team to align artistic intentions and maintain visual coherence across departments during production planning, similar to architectural visualization practices where digital modeling and visualization tools are used to explore spatial qualities such as form, materials, and lighting during the design process [10].

Physical or digital models may then be constructed to visualize the spatial organization of sets. These models enable directors and cinematographers to evaluate camera placement, actor blocking, and lighting strategies before physical construction begins. In contemporary filmmaking workflows, previsualization tools and virtual environment simulations allow filmmakers to explore scene composition, camera movement, and spatial relationships within digital scenes. Such systems support the design of camera viewpoints and trajectories while ensuring that visual constraints such as framing, visibility, and screen composition are satisfied. By simulating these cinematographic parameters within a virtual environment, filmmakers can test staging decisions and anticipate technical constraints prior to principal photography, thereby improving planning and coordination during production [11].

3. Procedural Modeling in Cinematic Environment Design

Procedural modeling has emerged as an important technique for creating complex cinematic environments through algorithmic processes rather than manual modeling. In computer graphics, procedural modeling refers to the generation of three-dimensional geometry and textures using sets of rules, parameters, or mathematical functions that automatically produce visual structures. This approach enables artists and designers to generate large-scale environments—such as cities, landscapes, and architectural structures—while maintaining consistency and scalability across the scene. Because many environmental elements in virtual worlds exhibit repetitive or self-similar patterns, procedural techniques can efficiently reproduce these structures through algorithms such as shape grammar, fractals, and L-systems, allowing complex forms to be generated from relatively simple rule sets [12,13].
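As a minimal illustration of how a simple rule set can produce a complex form, the following sketch implements a basic L-system: a string of symbols is rewritten in parallel according to production rules, and the result is interpreted as drawing commands. The function names, the 25-degree branching angle, and the plant rule are illustrative assumptions rather than parameters taken from the cited systems.

```python
import math


def expand_lsystem(axiom, rules, iterations):
    """Rewrite every symbol in parallel, `iterations` times.
    Symbols without a rule are copied unchanged."""
    current = axiom
    for _ in range(iterations):
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


def to_segments(commands, angle_deg=25.0, step=1.0):
    """Turtle interpretation: F draws a segment, + and - turn,
    [ saves the current state and ] restores it (branching)."""
    x, y, heading = 0.0, 0.0, 90.0  # start at the origin, pointing up
    stack, segments = [], []
    for command in commands:
        if command == "F":
            nx = x + step * math.cos(math.radians(heading))
            ny = y + step * math.sin(math.radians(heading))
            segments.append(((x, y), (nx, ny)))
            x, y = nx, ny
        elif command == "+":
            heading += angle_deg
        elif command == "-":
            heading -= angle_deg
        elif command == "[":
            stack.append((x, y, heading))
        elif command == "]":
            x, y, heading = stack.pop()
    return segments


# Lindenmayer's classic "algae" system: A -> AB, B -> A.
algae = expand_lsystem("A", {"A": "AB", "B": "A"}, 4)

# A single branching rule applied three times yields 125 line segments.
plant = expand_lsystem("F", {"F": "F[+F]F[-F]F"}, 3)
branches = to_segments(plant)
```

Changing only the rule string or the angle produces a different plant silhouette, which is the sense in which adjustable parameters preserve artistic control over the generated geometry.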

In cinematic production pipelines, procedural modeling is particularly valuable when creating expansive digital environments that would otherwise require extensive modeling time and large teams of artists. For example, procedural city-generation systems can automatically produce street networks, building layouts, and architectural variations that resemble realistic urban structures while maintaining visual diversity. Such techniques have been widely explored in computer graphics research, where algorithms generate complex cityscapes by defining rules for streets, building forms, and spatial organization [12,14].

These systems enable rapid generation of believable urban environments while preserving artistic control through adjustable parameters. Consequently, procedural modeling provides filmmakers and production designers with a powerful tool for constructing scalable virtual worlds that maintain visual realism and stylistic coherence in cinematic scenes [13,15].

4. Generative Landscapes and AI-Based Terrain Synthesis

Generative landscape modeling has become an important area of research in computer graphics and artificial intelligence, enabling the automatic creation of realistic terrain and environmental structures through computational methods. Traditional terrain modeling techniques relied on mathematical algorithms such as fractal noise, midpoint displacement, and height-map synthesis to generate mountains, valleys, and other geographical features. These approaches allowed designers to create large digital terrains efficiently while maintaining natural irregularities that resemble real-world landscapes. However, while procedural terrain algorithms can generate visually complex environments, they often lack the contextual realism and environmental coherence that characterize natural geographical formations [16,17].
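The midpoint displacement technique mentioned above can be sketched in one dimension, producing a terrain profile rather than a full height map. This is a simplified illustration under stated assumptions: the `roughness` factor of 0.5, the flat endpoints, and the uniform random offsets are choices made for the example, not values from the cited work.

```python
import random


def midpoint_displacement(levels, roughness=0.5, seed=0):
    """Return a 1D height profile with 2**levels + 1 samples.

    Each pass inserts a midpoint between every pair of neighbours, offsets it
    by a random amount, then shrinks the offset range by `roughness`."""
    rng = random.Random(seed)
    heights = [0.0, 0.0]      # assumed flat endpoints
    amplitude = 1.0
    for _ in range(levels):
        refined = []
        for left, right in zip(heights, heights[1:]):
            midpoint = (left + right) / 2 + rng.uniform(-amplitude, amplitude)
            refined.extend([left, midpoint])
        refined.append(heights[-1])
        heights = refined
        amplitude *= roughness  # finer detail gets smaller displacements
    return heights


terrain_profile = midpoint_displacement(levels=6, seed=42)
```

Lower `roughness` values give smoother, rolling profiles and higher values give jagged ridgelines. The two-dimensional analogue (often called diamond-square) applies the same idea to the cells of a height-map grid.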

Recent developments in artificial intelligence have introduced new methods for terrain synthesis based on machine learning models trained on real-world geographical data. Deep generative models, particularly Generative Adversarial Networks (GANs), have demonstrated the ability to learn statistical patterns of natural landscapes and generate realistic terrain structures from learned distributions. By analyzing satellite imagery and digital elevation maps, these models can synthesize terrain features such as rivers, mountain ranges, and vegetation patterns while preserving spatial continuity and geological plausibility. Research has shown that GAN-based terrain generation techniques can produce highly detailed height maps and environmental textures that closely resemble real geographic formations, offering a powerful alternative to purely procedural terrain generation methods [18,19].
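For reference, the adversarial training principle behind these models can be written as the minimax objective introduced in [18]. Mapping it onto terrain is an interpretive assumption made here: $x$ would be a real elevation tile drawn from the training data, $z$ a random latent vector, $G(z)$ a synthesized height map, and $D(\cdot)$ the discriminator's estimate of the probability that a sample is real.

```latex
\min_{G}\,\max_{D}\; V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator is trained to increase $V$ by telling real tiles from generated ones, while the generator is trained to decrease it, which pushes generated terrain toward the statistics of the real data.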

In cinematic and virtual production workflows, AI-based terrain synthesis allows filmmakers and environment artists to rapidly generate expansive landscapes for digital worlds. These systems can automatically create large natural environments that support cinematic staging, camera movement, and atmospheric lighting conditions while maintaining visual realism. By integrating generative terrain models with real-time rendering engines and virtual production tools, filmmakers can explore and modify landscapes dynamically during the previsualization process. Consequently, AI-driven terrain synthesis plays an increasingly significant role in the creation of scalable and immersive cinematic environments within modern digital production pipelines [17,19].

5. Automated Asset Population and Environmental Detail

Automated asset population plays a critical role in the creation of detailed digital environments by distributing large numbers of objects within virtual scenes according to algorithmic rules. In computer graphics and virtual environment design, asset population refers to the placement of environmental elements such as vegetation, buildings, street furniture, vehicles, and background props that collectively contribute to the realism of a scene. Traditionally, these elements were positioned manually by environmental artists, a process that could be extremely time-consuming when constructing large-scale digital environments. Automated placement systems address this challenge by applying procedural rules and spatial constraints that determine where and how assets should be distributed within the environment, thereby significantly accelerating the world-building process [14,20].

Procedural distribution algorithms often rely on spatial analysis of terrain geometry, surface orientation, environmental conditions, and proximity relationships between objects. For example, vegetation generation systems may distribute trees and plants based on elevation, soil conditions, slope, and sunlight exposure, producing ecosystems that resemble natural growth patterns. Similarly, urban procedural systems can automatically populate cities with buildings, vehicles, and infrastructure elements based on street networks and zoning patterns. Research in procedural modeling has demonstrated how rule-based systems and shape grammars can generate large and visually coherent environments while maintaining variability and structural realism across scenes [12,14].
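A rule-based placement pass of the kind described above might look like the following sketch. It uses only two of the listed criteria (elevation and slope). The function names, thresholds, and the tiny hand-written height map are hypothetical and chosen purely for illustration.

```python
import random


def slope_at(height, x, y):
    """Largest absolute height difference to a 4-connected neighbour."""
    rows, cols = len(height), len(height[0])
    here = height[y][x]
    neighbours = ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
    return max(abs(here - height[ny][nx])
               for nx, ny in neighbours
               if 0 <= nx < cols and 0 <= ny < rows)


def scatter_trees(height, min_elev, max_elev, max_slope, density, seed=0):
    """Place trees in cells that satisfy the elevation and slope rules.

    `density` is the probability that an eligible cell receives a tree;
    positions are jittered inside the cell so placements avoid a visible grid."""
    rng = random.Random(seed)
    placements = []
    for y, row in enumerate(height):
        for x, elevation in enumerate(row):
            eligible = (min_elev <= elevation <= max_elev
                        and slope_at(height, x, y) <= max_slope)
            if eligible and rng.random() < density:
                placements.append((x + rng.random(), y + rng.random(), elevation))
    return placements


# Hypothetical height map: gentle slopes on the left, a cliff on the right.
height_map = [
    [0.0, 0.1, 0.2, 0.9],
    [0.1, 0.2, 0.3, 1.0],
    [0.2, 0.3, 0.4, 1.1],
]
trees = scatter_trees(height_map, min_elev=0.1, max_elev=0.5,
                      max_slope=0.15, density=1.0)
```

With `density=1.0` every eligible cell receives a tree, so the output shows the rules directly: cells on or next to the cliff are rejected, as is the cell below the minimum elevation. Lower densities thin the placement without changing where trees are allowed to grow.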

Recent advances in artificial intelligence further enhance automated asset population through machine learning techniques that learn spatial relationships from real-world datasets. By analyzing images, geographic data, or existing urban layouts, AI models can infer how objects typically appear together within natural or urban environments. These learned patterns allow generative systems to place environmental assets in ways that preserve realism and spatial coherence. Consequently, automated asset population has become an essential component of modern digital production pipelines, enabling filmmakers and virtual environment designers to construct richly detailed worlds efficiently while maintaining visual consistency and cinematic believability [20,21].

6. Manual Modeling Versus AI-Assisted Generation

Traditional digital environment creation in film production and computer graphics has long relied on manual modeling techniques in which artists construct three-dimensional objects and environments using specialized modeling software. In this workflow, designers directly manipulate geometry, textures, and lighting parameters to create detailed virtual environments that reflect the intended artistic vision. Manual modeling provides a high degree of artistic control, allowing designers to carefully craft spatial composition, architectural features, and environmental details that align with narrative requirements and visual storytelling objectives. However, the creation of large-scale environments through manual methods can be time-consuming and labor-intensive, particularly when scenes require extensive urban landscapes, complex natural environments, or highly detailed background elements [22,23].

The introduction of artificial intelligence and generative modeling techniques has significantly transformed digital content creation workflows by enabling automated or semi-automated environment generation. AI-assisted systems use machine learning models to generate geometry, textures, and structural patterns based on large training datasets. These generative approaches allow designers to rapidly explore multiple design alternatives and produce large volumes of visual content that would otherwise require extensive manual effort. Research in generative adversarial networks and neural rendering has demonstrated how machine learning models can synthesize realistic visual content, supporting creative exploration while accelerating production processes [18,24].

Despite these technological advances, AI-assisted generation is typically integrated into production pipelines as a complementary tool rather than a replacement for human designers. Manual modeling remains essential for defining artistic direction, narrative symbolism, and stylistic coherence within cinematic environments. Consequently, contemporary digital production workflows often adopt hybrid approaches in which AI tools generate initial environment structures or variations, while artists refine and adjust the results to meet aesthetic and narrative goals. This collaborative relationship between human creativity and computational systems enables filmmakers and environment designers to balance efficiency with artistic control in the creation of complex cinematic worlds [23,24].

7. Continuity, Visual Coherence, and Aesthetic Control

Maintaining continuity and visual coherence is a fundamental requirement in cinematic environment design, as visual consistency plays a crucial role in preserving narrative immersion and spatial credibility within a film. In production design, continuity ensures that environments remain visually consistent across multiple scenes, camera angles, and shooting sessions, allowing audiences to perceive the cinematic world as a unified and believable space. Production designers and art directors traditionally establish visual guidelines that define architectural styles, color palettes, lighting atmospheres, and material properties to ensure that all elements of the set design align with the film’s narrative tone and aesthetic identity [9,25].

With the increasing integration of digital environments and AI-assisted generation, maintaining visual coherence across procedurally generated or automatically synthesized assets presents additional challenges. Generative systems may produce large volumes of environmental elements, but without appropriate constraints these elements may vary in style, scale, or spatial composition. Research in computer graphics and procedural modeling therefore emphasizes the use of rule-based constraints, style parameters, and hierarchical modeling frameworks that guide automated generation while preserving artistic consistency. Such techniques allow designers to embed stylistic rules within procedural systems so that generated environments adhere to predefined visual languages and production design principles [14,26].

In contemporary virtual production pipelines, aesthetic control is often maintained through hybrid workflows that combine algorithmic generation with human artistic supervision. Designers may use AI tools to generate initial environment layouts or asset variations while retaining the ability to modify materials, lighting, and composition to ensure narrative coherence. By integrating procedural generation systems with real-time rendering platforms and digital art direction tools, filmmakers can maintain visual continuity across scenes while benefiting from the efficiency of automated environment creation. Consequently, effective control of visual coherence remains a central challenge and objective in the development of AI-assisted cinematic environment design workflows [25,26].

8. Artistic Challenges and Future Directions

Despite the significant advantages offered by artificial intelligence in digital environment generation, several artistic and technical challenges remain in integrating these technologies into cinematic production workflows. One of the primary concerns involves maintaining the creative authorship of production designers and artists when automated systems are used to generate visual content. While AI-based tools can rapidly produce large quantities of visual variations, they may lack the intentional narrative symbolism, cultural references, and aesthetic sensitivity that human designers contribute to the filmmaking process. Research in computational creativity suggests that AI systems should be considered collaborative tools that support human creativity rather than replacements for artistic decision-making [27,28].

Another important challenge involves ensuring that generative systems maintain stylistic consistency and narrative coherence within cinematic environments. Machine learning models often generate content based on statistical patterns learned from large datasets, which may not always align with the specific artistic direction of a film. As a result, designers must guide and refine AI-generated outputs through human supervision, parameter control, and iterative refinement processes. Studies in creative AI and generative design highlight the importance of human–AI collaboration, where artists interact with generative systems to steer the design process and ensure that generated environments conform to narrative and aesthetic requirements [3,29].

Future developments in AI-driven environment generation are expected to focus on improving the interaction between designers and generative systems. Emerging research explores interactive generative tools that allow artists to guide procedural or AI-based generation through intuitive interfaces, enabling real-time exploration of environment variations while preserving artistic control. Additionally, advances in real-time rendering, neural scene representation, and virtual production technologies are likely to integrate AI-generated environments directly into filmmaking pipelines. These developments may allow directors and production designers to explore procedurally generated cinematic worlds during live production, expanding the possibilities of visual storytelling and immersive cinematic design [29,30].

9. Conclusion

Artificial intelligence (AI) and procedural modeling techniques are substantially changing the development of cinematic environments, allowing filmmakers to create large digital worlds more efficiently and at greater scale.
Procedural modeling provides powerful tools for developing complex architectural forms, cityscapes, and interior designs through algorithmic rules rather than manual modeling.
AI-driven terrain synthesis and generative landscape methods support the creation of highly realistic natural environments by learning patterns from real-world geographic data and satellite imagery.
Digital environments are further enriched through automated asset population, which distributes items such as plants, buildings, and other props according to spatial rules and learned environmental relationships.
Despite these technological advances, manual modeling by human artists remains necessary to establish the core elements of visual storytelling, narrative symbolism, and the distinctive style of film locations.
Hybrid workflows that combine human creativity with AI-assisted generation are emerging as the most effective way to balance computational efficiency with artistic freedom.
A significant remaining challenge is to preserve narrative continuity, visual coherence, and stylistic consistency when automated generation systems are used in film production.
Future AI toolsets for filmmakers are likely to include interactive generative capabilities, real-time environment building, and tighter integration with virtual production tools.
Ultimately, rather than being feared as a replacement for human artists, AI should be regarded as a collaborative aid that extends the capacities of production designers and filmmakers while preserving the human artist’s enduring role in storytelling.

Funding

This research received no external funding and was conducted without financial support from any funding agency, organization, or institution.

References

S. T. McClean, Digital storytelling: The narrative power of visual effects in film, Cambridge, Massachusetts: Mit Press, 2007.
D. Bordwell, “Intensified continuity visual style in contemporary American film.,” Film quarterly, vol. 55, no. 3, pp. 16-28, 2002.
A. Elgammal, B. Liu, M. Elhoseiny and M. Mazzone, “Can: Creative adversarial networks, generating” art” by learning about styles and deviating from style norms.,” arXiv:1706.07068v1 , Cheongju City, 2017.
A. Cherbetji, “The Untapped Potential of Machine Learning in VFX,” Foundry, 3 September 2025. [Online]. Available: https://www.foundry.com/insights/machine-learning/untapped-potential-ml-vfx. [Accessed 13 March 2026].
Life Designer, “AI Behind the Scenes of Dune 2: How Technology Powers Cinematic Art,” LifeDesigner Blog, 10 June 2025. [Online]. Available: https://www.lifedesigner.io/blog/ai-behind-the-scenes-of-dune-2-how-tech-powers-cinematic-art. [Accessed 13 March 2026].
Wētā FX, “Our Work on Avatar: The Way of Water,” Wētā FX, January 2023. [Online]. Available: https://www.wetafx.co.nz/articles/our-work-on-avatar-the-way-of-water. [Accessed 13 March 2026].
C. McGowan, “Putting the AI in the Animation Industry,” VFX Voice, January 2024. [Online]. Available: https://vfxvoice.com/putting-the-ai-in-the-animation-industry/. [Accessed 13 March 2026].
TAICCA, “Digital Technology Revolution in Film and Television,” TAICCA, January 2024. [Online]. Available: https://culturetech.taicca.tw/en/resources/digital-technology-revolution-2024. [Accessed 13 March 2026].
V. LoBrutto, The filmmaker’s guide to production design, Simon and Schuster, 2002.
C. O’Coill and M. Doughty, “Computer game technology as a tool for participatory design,” in Conference on Education and Research in Computer Aided Architectural Design in Europe, Copenhagen, Denmark, 2004.
C. Lino and M. Christie, “Efficient composition for virtual camera control,” in ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 2012.
Y. I. Parish and P. Müller, “Procedural modeling of cities,” in Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pp. 301-308, 2001.
J. Freiknecht and W. Effelsberg, “A survey on the procedural generation of virtual worlds,” Multimodal Technologies and Interaction, vol. 1, no. 4, p. 27, 2017.
P. Müller, P. Wonka, S. Haegler, A. Ulmer and L. Van Gool, “Procedural modeling of buildings,” ACM SIGGRAPH, pp. 614-623, 2006.
T. Kelly and P. Wonka, “Interactive architectural modeling with procedural extrusions,” ACM Transactions on Graphics (TOG), vol. 30, no. 2, pp. 1-15, 2011.
F. K. Musgrave, C. E. Kolb and R. S. Mace, “The synthesis and rendering of eroded fractal terrains,” ACM SIGGRAPH Computer Graphics, vol. 23, no. 3, pp. 41-50, 1989.
E. Galin, A. Peytavie, N. Maréchal and E. Guérin, “Procedural generation of roads,” Computer Graphics Forum, vol. 29, no. 2, pp. 429-438, 2010.
I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville and Y. Bengio, “Generative adversarial nets,” in Advances in Neural Information Processing Systems, vol. 27, 2014.
J. D. Génevaux, É. Galin, E. Guérin, A. Peytavie and B. Benes, “Terrain generation using procedural models based on hydrology,” ACM Transactions on Graphics (TOG), vol. 32, no. 4, pp. 1-13, 2013.
O. Deussen, P. Hanrahan, B. Lintermann, R. Měch, M. Pharr and P. Prusinkiewicz, “Realistic modeling and rendering of plant ecosystems,” in Proceedings of the 25th annual conference on Computer graphics and interactive techniques, pp. 275-286, 1998.
A. Dosovitskiy, J. Tobias Springenberg and T. Brox, “Learning to generate chairs with convolutional neural networks,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1538-1546, 2015.
T. Akenine-Moller, E. Haines and N. Hoffman, Real-time rendering, AK Peters/CRC Press, 2019.
P. Shirley, M. Ashikhmin and S. Marschner, Fundamentals of computer graphics, AK Peters/CRC Press, 2009.
R. Zhang, P. Isola, A. A. Efros, E. Shechtman and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, Utah, USA, pp. 586-595, 2018.
B. Block, The visual story: Creating the visual structure of film, TV, and digital media, Routledge, 2020.
K. Perlin, “An image synthesizer,” ACM SIGGRAPH Computer Graphics, vol. 19, no. 3, pp. 287-296, 1985.
M. A. Boden, “Creativity and artificial intelligence,” Artificial Intelligence, vol. 103, no. 1-2, pp. 347-356, 1998.
A. Elgammal, B. Liu, M. Elhoseiny and M. Mazzone, “CAN: Creative adversarial networks, generating ‘art’ by learning about styles and deviating from style norms,” arXiv preprint arXiv:1706.07068, 2017.
J. Schmidhuber, “Deep learning in neural networks: An overview,” Neural Networks, vol. 61, pp. 85-117, 2015.
E. R. Chan, C. Lin, M. Chan, K. Nagano, B. Pan, S. De Mello, O. Gallo, L. Guibas, J. Tremblay, S. Khamis and T. Karras, “Efficient geometry-aware 3D generative adversarial networks,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 16123-16133, 2022.

Figure 2. AI-powered pre-production workflow showing the transition from manual design processes to a hybrid model where AI assists in script analysis, concept generation, aesthetic exploration, and early visual development in film production. Source: Authors’ own illustration.

Figure 3. AI-driven film production timeline illustrating the integration of generative AI, procedural modeling, and machine learning across pre-production, production, and post-production workflows. Source: Authors’ own illustration.

Figure 4. Example of a machine-learning-based rotoscoping workflow used in visual effects production. The interface illustrates a neural-network training pipeline in which input frames are paired with ground-truth segmentation masks to train a model capable of isolating foreground elements from background scenes. Such AI-driven systems reduce the manual effort required for frame-by-frame rotoscoping and compositing in large-scale productions such as Dune: Part Two. Source: Warner Bros. Pictures and Legendary Pictures.

Figure 5. AI-assisted digital character production pipeline used in Avatar: The Way of Water. Advanced performance capture and machine learning techniques enable animators to translate human actor performances into realistic digital characters. Such computational systems help model facial expressions, body movements, and environmental interactions in large-scale virtual productions. Source: Wētā FX.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.