Applied Sciences, Vol. 13, Issue 12 (2023)

A Study on Generating Webtoons Using Multilingual Text-to-Image Models

Kyungho Yu, Hyoungju Kim, Jeongin Kim, Chanjun Chun and Pankoo Kim

Abstract

Text-to-image technology enables computers to create images from text, simulating the human process of forming mental images. GAN-based text-to-image methods extract features from the input text, combine those features with noise, and feed the result to a GAN, which learns to generate images resembling the originals through competition between its generator and discriminator. Although images have been generated extensively from English text, text-to-image technology for other languages, such as Korean, is still at a developmental stage. Webtoons are a digital comic format designed for online viewing. Creating a webtoon involves story planning, content/sketching, coloring, and background drawing, all of which require human work and are therefore time-consuming and expensive. This study therefore proposes a multilingual text-to-image model that generates webtoon images from input text in multiple languages. The proposed model uses multilingual BERT to extract feature vectors for multiple languages and trains a DCGAN on these vectors together with the corresponding images. Experimental results show that, after training, the model generates images similar to the originals when given multilingual input text. The evaluation metrics support this finding: the generated images achieved an Inception score of 4.99 and an FID score of 22.21.
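To put the reported Inception score in context, the following is a minimal sketch of the standard definition, IS = exp(E_x[KL(p(y|x) || p(y))]), computed from per-image class probabilities. The abstract does not say how the authors computed the metric or which classifier they used, so the function name `inception_score` and the toy inputs are illustrative assumptions, not the paper's evaluation code.

```python
import math


def inception_score(class_probs):
    """Standard Inception score from per-image class probabilities.

    class_probs: list of probability lists p(y|x), one per generated image,
    each summing to 1. In practice these would come from a pretrained
    classifier; here they are supplied directly (an assumption of this sketch).
    """
    n_images = len(class_probs)
    n_classes = len(class_probs[0])

    # Marginal class distribution p(y): average of p(y|x) over all images.
    marginal = [
        sum(p[k] for p in class_probs) / n_images for k in range(n_classes)
    ]

    # Mean KL divergence KL(p(y|x) || p(y)); zero-probability terms add 0.
    total_kl = 0.0
    for p in class_probs:
        total_kl += sum(
            pk * math.log(pk / marginal[k]) for k, pk in enumerate(p) if pk > 0.0
        )

    return math.exp(total_kl / n_images)


# Toy inputs (hypothetical): every image gets the same uniform prediction,
# versus each image being confidently assigned to a different class.
uniform = [[0.25, 0.25, 0.25, 0.25] for _ in range(3)]
one_hot = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

print(inception_score(uniform))
print(inception_score(one_hot))
```

Under this definition the score is bounded below by 1 (predictions carry no class information) and above by the number of classes (every image is classified confidently and the classes are evenly covered), so higher values indicate images that are both recognizable and diverse. FID works in the opposite direction: lower values mean the generated and real image feature distributions are closer.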
