|
|
|
Nguyen Trung Tuan, Philip Moore, Dat Ha Vu Thanh and Hai Van Pham
ChatGPT plays significant roles in the third decade of the 21st Century. Smart cities applications can be integrated with ChatGPT in various fields. This research proposes an approach for developing large language models using generative artificial intel...
ver más
|
|
|
|
|
|
|
Fenfang Li, Zhengzhang Zhao, Li Wang and Han Deng
Sentence Boundary Disambiguation (SBD) is crucial for building datasets for tasks such as machine translation, syntactic analysis, and semantic analysis. Currently, most automatic sentence segmentation in Tibetan adopts the methods of rule-based and stat...
ver más
|
|
|
|
|
|
|
Khalil Al-Hussaeni, Mohamed Sameer and Ioannis Karamitsos
Due to the increasing reliance on social network platforms in recent years, hate speech has risen significantly among online users. Government and social media platforms face the challenging responsibility of controlling, detecting, and removing massivel...
ver más
|
|
|
|
|
|
|
Dezhi Cao, Yue Zhao and Licheng Wu
The construction of pronunciation dictionaries relies on high-quality and extensive training data in data-driven way. However, the manual annotation of corpus for this purpose is both costly and time consuming, especially for low-resource languages that ...
ver más
|
|
|
|
|
|
|
Mikel Penagarikano, Amparo Varona, Germán Bordel and Luis Javier Rodriguez-Fuentes
In this paper, a semisupervised speech data extraction method is presented and applied to create a new dataset designed for the development of fully bilingual Automatic Speech Recognition (ASR) systems for Basque and Spanish. The dataset is drawn from an...
ver más
|
|
|
|
|
|
|
Kyungho Yu, Hyoungju Kim, Jeongin Kim, Chanjun Chun and Pankoo Kim
Text-to-image technology enables computers to create images from text by simulating the human process of forming mental images. GAN-based text-to-image technology involves extracting features from input text; subsequently, they are combined with noise an...
ver más
|
|
|
|
|
|
|
Samuel R. Schrader and Eren Gultepe
The evaluation of similarities between natural languages often relies on prior knowledge of the languages being studied. We describe three methods for building phylogenetic trees and clustering languages without the use of language-specific information. ...
ver más
|
|
|
|
|
|
|
Saida Mussakhojayeva, Kaisar Dauletbek, Rustem Yeshpanov and Huseyin Atakan Varol
The primary aim of this study was to contribute to the development of multilingual automatic speech recognition for lower-resourced Turkic languages. Ten languages?Azerbaijani, Bashkir, Chuvash, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Uyghur, and Uzbek?we...
ver más
|
|
|
|
|
|
|
Oihana Mitxelena-Hoyos and José-Lázaro Amaro-Mellado
Toponymy, a transversal discipline for geography, linguistics, and history, finds one of its main supports in cartography. Due to exhaustiveness on the territory, cadastral cartography and its toponymy have the ideal characteristics to develop systematic...
ver más
|
|
|
|
|
|
|
Isabella Gagliardi and Maria Teresa Artese
When integrating data from different sources, there are problems of synonymy, different languages, and concepts of different granularity. This paper proposes a simple yet effective approach to evaluate the semantic similarity of short texts, especially k...
ver más
|
|
|
|