
Optimizing Reinforcement Learning Using a Generative Action-Translator Transformer

Jiaming Li, Ning Xie and Tingting Zhao

Abstract

In recent years, with the rapid advancement of Natural Language Processing (NLP) technologies, large models have become widespread. Traditional reinforcement learning algorithms have also begun to incorporate language models to optimize training. However, they still fundamentally rely on the Markov Decision Process (MDP) framework and do not fully exploit the advantages of language models in handling long-sequence problems. The Decision Transformer (DT), introduced in 2021, was the first attempt to recast the reinforcement learning problem entirely as a problem in the NLP domain. It uses text-generation techniques to produce reinforcement learning trajectories, addressing the problem of finding optimal trajectories. However, DT feeds the reinforcement learning trajectory data directly into a basic language model and trains it to predict the entire trajectory, including state and reward information. This deviates from the reinforcement learning objective of finding the optimal action, and it produces redundant information in the output that degrades the final training effectiveness of the agent. This paper proposes a more reasonable network structure, the Action-Translator Transformer (ATT), which predicts only the agent's next action, making the language model more interpretable for the reinforcement learning problem. We test our model in simulated gaming scenarios and compare it with current mainstream methods in offline reinforcement learning. The experimental results show that our model achieves superior performance. We hope that this model will inspire new ideas and solutions for combining language models with reinforcement learning and provide fresh perspectives for offline reinforcement learning research.
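
To make the architectural idea concrete, below is a minimal, illustrative sketch (not the authors' released code) of a trajectory transformer that, like the ATT described above, consumes interleaved return-to-go, state, and action tokens but decodes only the next action, rather than reconstructing states and rewards as well. The choice of PyTorch, the class name ActionOnlyTransformer, and all dimensions and hyperparameters are assumptions made for illustration only.

# Illustrative sketch, assuming PyTorch; predicts only the next action from a
# causally masked sequence of (return-to-go, state, action) tokens.
import torch
import torch.nn as nn


class ActionOnlyTransformer(nn.Module):
    def __init__(self, state_dim, act_dim, d_model=128, n_heads=4, n_layers=3, max_len=64):
        super().__init__()
        # Separate linear embeddings for returns-to-go, states, and actions,
        # plus a learned timestep embedding (all hypothetical choices).
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        self.embed_timestep = nn.Embedding(max_len, d_model)

        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=n_layers)

        # Single output head: predict only the next action,
        # with no state or reward reconstruction.
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions, timesteps):
        # rtg: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim),
        # timesteps: (B, T) integer indices.
        B, T = states.shape[0], states.shape[1]
        t_emb = self.embed_timestep(timesteps)  # (B, T, d_model)
        tokens = torch.stack(
            [
                self.embed_rtg(rtg) + t_emb,
                self.embed_state(states) + t_emb,
                self.embed_action(actions) + t_emb,
            ],
            dim=2,
        ).reshape(B, 3 * T, -1)  # interleave (R_t, s_t, a_t) per timestep

        # Causal mask so each token attends only to earlier tokens.
        mask = torch.triu(torch.full((3 * T, 3 * T), float("-inf")), diagonal=1)
        h = self.encoder(tokens, mask=mask)

        # Read the hidden state at each *state* token and decode the action taken there.
        h = h.reshape(B, T, 3, -1)
        return self.predict_action(h[:, :, 1])  # (B, T, act_dim)


if __name__ == "__main__":
    # Dummy forward pass with random tensors, just to show the expected shapes.
    model = ActionOnlyTransformer(state_dim=17, act_dim=6)
    B, T = 2, 10
    out = model(
        torch.randn(B, T, 1),
        torch.randn(B, T, 17),
        torch.randn(B, T, 6),
        torch.arange(T).repeat(B, 1),
    )
    print(out.shape)  # torch.Size([2, 10, 6])

In an offline setting, such a model would be trained by supervised regression of the predicted action against the logged action at each state token; the single action head is what distinguishes this sketch from a model that reconstructs the full trajectory.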

Similar articles

       
 
Yi Zhang, Lanxin Qiu, Yangzhou Xu, Xinjia Wang, Shengjie Wang, Agyemang Paul and Zhefu Wu    
Software-Defined Networking (SDN) enhances network control but faces Distributed Denial of Service (DDoS) attacks due to centralized control and flow-table constraints in network devices. To overcome this limitation, we introduce a multi-path routing alg... see more
Journal: Applied Sciences

 
Xuan Liu and Daofang Chang    
In this paper, the essence and optimization objectives of the hull parts path optimization problem of CNC laser cutting are described, and the shortcomings of the existing optimization methods are pointed out. Based on the optimization problem of the hul... see more

 
Mehrdad Hadizadeh-Bazaz, Ignacio J. Navarro and Víctor Yepes    
Recently, the repair and maintenance of structures has been necessary to prevent these structures' sudden collapse and to prevent human and financial damage. A natural factor in marine environments that destroys structures and reduces their life is the p... see more

 
Ting Yao and Wei Li    
Mega land reclamation projects have been carried out on the coral reefs in the South China Sea. Coral sand was used as a backfill material through hydraulic filling, with fill heights ranging from 6 to 10 m. To enhance foundation stability, vibro-flotati... see more

 
Ruijun Hu and Yulin Zhang    
The global path planning of planetary surface rovers is crucial for optimizing exploration benefits and system safety. For the cases of long-range roving or obstacle constraints that are time-varied, there is an urgent need to improve the computational e... see more
Journal: Aerospace