Exploring Prompts in Few-Shot Cross-Linguistic Topic Classification Scenarios

Zhipeng Zhang

Shengquan Liu and Jianming Cheng

Abstract

In recent years, large-scale pretrained language models have become widely used in natural language processing tasks. Building on these models, prompt learning has achieved excellent performance in few-shot classification scenarios. The core idea of prompt learning is to convert a downstream task into a masked language modelling task. However, the choice of prompt template can greatly affect the results, and finding an appropriate template is difficult and time-consuming. To address this, this study proposes a novel hybrid prompt approach, which combines discrete prompts and continuous prompts, to encourage the model to learn more semantic knowledge from a small number of training samples. Comparing hybrid prompts with discrete prompts and continuous prompts, we find that hybrid prompts achieve the best results, reaching an F1 score of 73.82% on the test set. In addition, we analyze the effect of different virtual token lengths in continuous prompts and hybrid prompts in a few-shot cross-language topic classification scenario. The results show that there is a threshold for the number of virtual tokens: too many virtual tokens degrade the performance of the model, and the number should preferably not exceed the average length of the texts in the training set. Finally, this paper designs a method based on vector similarity to explore the actual meanings represented by virtual tokens. The experimental results show that the prompts learned automatically through the virtual tokens are correlated to some degree with the input text.
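The vector-similarity idea mentioned at the end of the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the procedure, so this example assumes cosine similarity between a learned virtual-token embedding and the embeddings of real vocabulary tokens. The function names (`cosine`, `nearest_tokens`) and the toy three-dimensional vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest_tokens(virtual_vec, vocab_embeddings, k=3):
    """Return the k vocabulary tokens whose embeddings are most similar to virtual_vec."""
    scored = [(tok, cosine(virtual_vec, vec)) for tok, vec in vocab_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumed for illustration only; real models use hundreds of dimensions).
vocab = {
    "sports": [1.0, 0.0, 0.0],
    "match": [0.9, 0.1, 0.0],
    "economy": [0.0, 1.0, 0.0],
    "bank": [0.0, 0.9, 0.2],
}
virtual_token = [0.8, 0.2, 0.0]  # stands in for one learned continuous-prompt vector

top = nearest_tokens(virtual_token, vocab, k=2)
for token, score in top:
    print(f"{token}: {score:.3f}")
```

In this toy setting the nearest real tokens give a rough, human-readable reading of what a virtual token encodes; whether the paper ranks tokens in this way is an assumption.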

Keywords

prompts - few-shot learning - cross-language - topic classification

Issue

Volume: 13, Issue: 17 (2023)

Subjects

Civil Engineering and Construction
Technology
