Preprint Article · Version 1 · Preserved in Portico · This version is not peer-reviewed

A Study on Generating Webtoons using Multilingual Text-to-Image Models

Version 1 : Received: 25 April 2023 / Approved: 26 April 2023 / Online: 26 April 2023 (03:16:07 CEST)

A peer-reviewed article of this Preprint also exists.

Yu, K.; Kim, H.; Kim, J.; Chun, C.; Kim, P. A Study on Generating Webtoons Using Multilingual Text-to-Image Models. Appl. Sci. 2023, 13, 7278.

Abstract

Recent advances in deep learning have increased interest in text-to-image technology, which enables computers to create images from text by simulating the human process of forming mental images. GAN-based text-to-image methods extract features from the input text, combine them with noise, and feed the result to a GAN, which learns to generate images similar to the originals through competition between the generator and discriminator. Although generating images from English text is a mature area of research, text-to-image technology for other languages, such as Korean, is still in its early stages of development. Webtoon is a digital comic format that allows comics to be viewed online. Webtoon production is divided into story planning, content/sketching, coloring, and background drawing; because each stage requires human intervention, the process is both time-consuming and expensive. Deep learning techniques such as automatic coloring and automatic line drawing are therefore being used to reduce human involvement, but technology that assists authors with story creation is still lacking. This study therefore proposes a multilingual text-to-image model that generates webtoon images from multilingual input text. The proposed model employs Multilingual BERT to extract feature vectors for multiple languages and trains a DCGAN jointly with the images. The experimental results demonstrate that, after training, the model generates images similar to the originals when presented with multilingual input text.
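The pipeline described in the abstract — a multilingual text embedding concatenated with noise and fed to a DCGAN generator — can be sketched as follows. This is a minimal illustrative sketch, not the authors' exact architecture: the layer widths, the 768-dimensional embedding size (standard for Multilingual BERT), the 100-dimensional noise vector, and the 64×64 output resolution are all assumptions, and a random vector stands in for the actual BERT features.

```python
# Hedged sketch of the abstract's pipeline: a multilingual sentence embedding
# (768-dim, as Multilingual BERT would produce) is concatenated with noise and
# fed to a DCGAN-style generator. All layer sizes here are illustrative
# assumptions, not the paper's exact model.
import torch
import torch.nn as nn

TEXT_DIM, NOISE_DIM, IMG_CH = 768, 100, 3  # assumed dimensions

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # project the (text + noise) vector up to 4x4 feature maps
            nn.ConvTranspose2d(TEXT_DIM + NOISE_DIM, 512, 4, 1, 0, bias=False),
            nn.BatchNorm2d(512), nn.ReLU(True),
            nn.ConvTranspose2d(512, 256, 4, 2, 1, bias=False),   # -> 8x8
            nn.BatchNorm2d(256), nn.ReLU(True),
            nn.ConvTranspose2d(256, 128, 4, 2, 1, bias=False),   # -> 16x16
            nn.BatchNorm2d(128), nn.ReLU(True),
            nn.ConvTranspose2d(128, 64, 4, 2, 1, bias=False),    # -> 32x32
            nn.BatchNorm2d(64), nn.ReLU(True),
            nn.ConvTranspose2d(64, IMG_CH, 4, 2, 1, bias=False), # -> 64x64
            nn.Tanh(),  # DCGAN convention: images scaled to [-1, 1]
        )

    def forward(self, text_emb, noise):
        # concatenate text features with noise, reshape to a 1x1 spatial input
        z = torch.cat([text_emb, noise], dim=1).unsqueeze(-1).unsqueeze(-1)
        return self.net(z)

# Stand-in for Multilingual BERT [CLS] vectors of a batch of two prompts.
text_emb = torch.randn(2, TEXT_DIM)
noise = torch.randn(2, NOISE_DIM)
imgs = Generator()(text_emb, noise)
print(imgs.shape)  # torch.Size([2, 3, 64, 64])
```

In training, the discriminator would receive the same text embedding alongside real or generated images, so that the adversarial game conditions image content on the input text rather than on noise alone.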

Keywords

Multilingual BERT; Text-to-image; DCGAN; Webtoon; GAN

Subject

Computer Science and Mathematics, Artificial Intelligence and Machine Learning

