JOURNAL
Information

ARTICLE

TITLE

SRBerta: A Transformer Language Model for Serbian Cyrillic Legal Texts

Miloš Bogdanović, Jelena Kocić and Leonid Stoimenov

Abstract

Language is a unique ability of human beings. Although understanding it is relatively simple for humans, it is a highly complex task for machines. To learn a particular language, a machine must understand not only the words and rules of that language, but also the context of sentences and the meaning that words take on in a particular context. This paper presents the experimental development of SRBerta, a language model designed to understand the formal language of Serbian legal documents. SRBerta is the first model of its kind, since it was trained on Cyrillic legal texts from a dataset created specifically for this purpose. Training was carried out with minimal resources (a single NVIDIA Quadro RTX 5000 GPU) and performed in two phases: base model training and fine-tuning. We present the structure of the model, the structure of the training datasets, the training process, and the evaluation results. We also explain the accuracy metric used in our case and demonstrate that SRBerta achieves a high level of accuracy on the task of masked language modeling in Serbian Cyrillic legal texts. Finally, the SRBerta model and training datasets are publicly available for scientific and commercial purposes.
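
The abstract refers to an accuracy metric for masked language modeling without defining it on this page. As a rough illustration only, and not necessarily the metric the authors use, the sketch below computes top-1 accuracy over masked positions: the share of masked tokens for which the predicted token id equals the original one. The function name `masked_token_accuracy`, the toy token ids, and the `-100` sentinel for unmasked positions are all assumptions made for this example.

```python
from typing import Sequence

# Assumed sentinel marking positions that were NOT masked and are not scored.
IGNORE_INDEX = -100


def masked_token_accuracy(
    predicted_ids: Sequence[int],
    label_ids: Sequence[int],
    ignore_index: int = IGNORE_INDEX,
) -> float:
    """Return the fraction of masked positions predicted exactly right."""
    if len(predicted_ids) != len(label_ids):
        raise ValueError("predictions and labels must have the same length")
    # Keep only positions that were masked (label differs from the sentinel).
    scored = [(p, y) for p, y in zip(predicted_ids, label_ids) if y != ignore_index]
    if not scored:
        raise ValueError("no masked positions to score")
    correct = sum(1 for p, y in scored if p == y)
    return correct / len(scored)


# Toy data (hypothetical token ids): three of the five positions were masked.
labels = [IGNORE_INDEX, 17, IGNORE_INDEX, 42, 8]
predictions = [5, 17, 9, 40, 8]

accuracy = masked_token_accuracy(predictions, labels)
print(f"masked-token accuracy: {accuracy:.3f}")
```

In this toy case two of the three masked positions are predicted correctly. A real evaluation would take the predicted ids from the model's highest-scoring vocabulary entry at each masked position.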

Keywords

large language model - legislation - Serbian - BERT

ISSUE

Volume: 15 Issue: 2 (2024)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Information
Algorithms
Applied System Innovation

DOI

https://doi.org/10.3390/info15020074

Similar articles

From Data to Human-Readable Requirements: Advancing Requirements Elicitation through Language-Transformer-Enhanced Opportunity Mining

Pascal Harth, Orlando Jähde, Sophia Schneider, Nils Horn and Rüdiger Buchkremer

In this research, we present an algorithm that leverages language-transformer technologies to automate the generation of product requirements, utilizing E-Shop consumer reviews as a data source. Our methodology combines classical natural language process...

Journal: Algorithms

A Review of Transformer-Based Approaches for Image Captioning

Oscar Ondeng, Heywood Ouma and Peter Akuon

Visual understanding is a research area that bridges the gap between computer vision and natural language processing. Image captioning is a visual understanding task in which natural language descriptions of images are automatically generated using visio...

Journal: Applied Sciences

A Transformer-Based Approach to Authorship Attribution in Classical Arabic Texts

Fetoun Mansour AlZahrani and Maha Al-Yahya

Authorship attribution (AA) is a field of natural language processing that aims to attribute text to its author. Although the literature includes several studies on Arabic AA in general, applying AA to classical Arabic texts has not gained similar attent...

Journal: Applied Sciences

Enhancing Fake News Detection in Romanian Using Transformer-Based Back Translation Augmentation

Marian Bucos and Bogdan Dragulescu

Misinformation poses a significant challenge in the digital age, requiring robust methods to detect fake news. This study investigates the effectiveness of using Back Translation (BT) augmentation, specifically transformer-based models, to improve fake n...

Journal: Applied Sciences

Contemporary Approaches in Evolving Language Models

Dina Oralbekova, Orken Mamyrbayev, Mohamed Othman, Dinara Kassymova and Kuralai Mukhsina

This article provides a comprehensive survey of contemporary language modeling approaches within the realm of natural language processing (NLP) tasks. This paper conducts an analytical exploration of diverse methodologies employed in the creation of lang...

Journal: Applied Sciences
