Preprint Article Version 1 Preserved in Portico This version is not peer-reviewed

Automated Visual Generation using GAN with Textual Information Feeds

Version 1: Received: 12 April 2024 / Approved: 15 April 2024 / Online: 15 April 2024 (10:46:33 CEST)

How to cite: Mathew, S.; Mani Mathew, M.; D, N.; S, S. Automated Visual Generation using GAN with Textual Information Feeds. Preprints 2024, 2024040940. https://doi.org/10.20944/preprints202404.0940.v1

Abstract

Visualising textual content can help professionals and amateurs across many fields. However, training a text-to-image generator in the mainstream domain requires large amounts of paired text-image data, which are expensive to collect because labelling millions of images and videos is tedious. GANs such as StackGAN and StyleGAN can be considered as solutions for generating images from text, but the generated images may be of low accuracy and resolution, and the overall process can be highly time-consuming. Moreover, image generation is itself still an active research topic, so developing a video generation model requires substantial further research. Despite the need for such a model, existing technology has yet to provide a satisfactory solution. This proposal combines two methods: Text modification for Action Definition (TexAD) and SeQuential Image Generation for Video Synthesis (SQIGen). The proposed solution synthesises a sequence of images from textual information feeds and combines these images to create a video. TexAD uses Natural Language Processing and Deep Learning techniques to process, classify and modify text data. SQIGen is an extension of the VQGAN+CLIP neural network architecture that generates a sequence of images from the modified text data.
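
The two-stage flow described in the abstract (text processing followed by sequential image generation) can be illustrated with a minimal sketch. It is not the authors' implementation: every name in it (`texad_split`, `sqigen_frames`, `placeholder_generator`, `Frame`) is hypothetical, the sentence splitting is a simple stand-in for the NLP and Deep Learning steps attributed to TexAD, and the image generator is a placeholder where a VQGAN+CLIP-style model would be plugged in. Encoding the frames into a video is left out.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Frame:
    """Placeholder for a generated image; a real frame would hold pixel data."""
    prompt: str
    index: int


def texad_split(text: str) -> List[str]:
    """Hypothetical stand-in for TexAD: one prompt per sentence.

    Assumption: the real method also classifies and modifies the text;
    this sketch only splits on sentence-ending punctuation.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    cleaned = [s.rstrip(".!?").strip() for s in sentences]
    return [s for s in cleaned if s]


def placeholder_generator(prompt: str, index: int) -> Frame:
    """Stub generator; a text-to-image model would be called here instead."""
    return Frame(prompt=prompt, index=index)


def sqigen_frames(
    prompts: List[str],
    generate: Callable[[str, int], Frame] = placeholder_generator,
) -> List[Frame]:
    """Hypothetical stand-in for SQIGen: one frame per prompt, in order."""
    return [generate(prompt, i) for i, prompt in enumerate(prompts)]


def text_to_frames(text: str) -> List[Frame]:
    """Run both stages: text -> ordered prompts -> ordered frames."""
    return sqigen_frames(texad_split(text))


if __name__ == "__main__":
    for frame in text_to_frames("A dog runs. It jumps over a fence!"):
        print(frame.index, frame.prompt)
```

Keeping the generator behind a plain callable means the ordering logic can be checked without loading any model; the resulting frame list would then be handed to a separate video-encoding step.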

Keywords

Visualization; Sequential Image Generation; GANs; Natural Language Processing and Deep Learning; TexAD; VQGAN+CLIP

Subject

Computer Science and Mathematics, Computer Vision and Graphics
