Risk-Sensitive Policy with Distributional Reinforcement Learning

Thibaut Théate and Damien Ernst

Abstract

Classical reinforcement learning (RL) techniques are generally concerned with the design of decision-making policies driven by the maximisation of the expected outcome. Nevertheless, this approach does not take into consideration the potential risk associated with the actions taken, which may be critical in certain applications. To address that issue, the present research work introduces a novel methodology based on distributional RL to derive sequential decision-making policies that are sensitive to the risk, the latter being modelled by the tail of the return probability distribution. The core idea is to replace the Q function generally standing at the core of learning schemes in RL by another function, taking into account both the expected return and the risk. Named the risk-based utility function U, it can be extracted from the random return distribution Z naturally learnt by any distributional RL algorithm. This enables the spanning of the complete potential trade-off between risk minimisation and expected return maximisation, in contrast to fully risk-averse methodologies. Fundamentally, this research yields a truly practical and accessible solution for learning risk-sensitive policies with minimal modification to the distributional RL algorithm, with an emphasis on the interpretability of the resulting decision-making process.
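As a rough illustration of the idea in the abstract, the sketch below ranks actions by a utility that mixes the expected return with a tail-based risk measure, both estimated from samples of the return distribution Z. The abstract does not give the formula for U, so the convex combination weighted by `alpha`, the lower-tail mean used as the risk term, the tail fraction `rho`, and all function names are assumptions made for illustration, not the authors' definition.

```python
import math


def expected_return(samples):
    """Sample mean of the return distribution Z (the usual Q-value estimate)."""
    return sum(samples) / len(samples)


def lower_tail_mean(samples, rho):
    """Mean of the worst rho-fraction of returns (assumed CVaR-style risk term)."""
    ordered = sorted(samples)
    k = max(1, math.ceil(rho * len(ordered)))
    return sum(ordered[:k]) / k


def risk_based_utility(samples, alpha, rho):
    """Hypothetical U: alpha weights the expected return, (1 - alpha) the tail."""
    return (alpha * expected_return(samples)
            + (1.0 - alpha) * lower_tail_mean(samples, rho))


def greedy_action(return_samples_per_action, alpha, rho):
    """Pick the action maximising U instead of the expected return alone."""
    return max(
        return_samples_per_action,
        key=lambda a: risk_based_utility(return_samples_per_action[a], alpha, rho),
    )


# Toy return samples for two actions (illustrative numbers only).
returns = {
    "safe": [1.0] * 8,
    "risky": [-5.0, -5.0] + [4.0] * 6,
}

for alpha in (1.0, 0.5):
    print(alpha, greedy_action(returns, alpha=alpha, rho=0.25))
```

Under these assumptions, `alpha = 1.0` recovers plain expected-return maximisation, while smaller values put more weight on the tail of the distribution, which is one way to span the trade-off between risk minimisation and expected return maximisation that the abstract describes.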

Keywords

distributional reinforcement learning - sequential decision-making - risk-sensitive policy - risk management - deep neural network

ISSUE

Volume: 16 Issue: 7 (2023)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Algorithms
Applied Sciences
Journal of Marine Science and Engineering

DOI

https://doi.org/10.3390/a16070325

Similar articles

Computer Vision-Based Inspection System for Worker Training in Build and Construction Industry

M. Fikret Ercan and Ricky Ben Wang

Recently, computer vision has been applied successfully in various fields of engineering, ranging from manufacturing to autonomous cars. A key driver of this development is the progress of the latest object detection and classification architectures. I...

Journal: Computers

A Multi-UCAV Cooperative Decision-Making Method Based on an MAPPO Algorithm for Beyond-Visual-Range Air Combat

Xiaoxiong Liu, Yi Yin, Yuzhan Su and Ruichen Ming

To solve the problems of autonomous decision making and the cooperative operation of multiple unmanned combat aerial vehicles (UCAVs) in beyond-visual-range air combat, this paper proposes an air combat decision-making method that is based on a multi-age...

Journal: Aerospace

The Gradient-Boosting Method for Tackling High Computing Demand in Underwater Acoustic Propagation Modeling

Dominic Lagrois, Tyler R. Bonnell, Ankita Shukla and Clément Chion

Agent-based models return spatiotemporal information used to process time series of specific parameters for specific individuals called "agents". For complex, advanced and detailed models, this typically comes at the expense of high computing times and r...

Journal: Journal of Marine Science and Engineering

A Survey of Forex and Stock Price Prediction Using Deep Learning

Zexin Hu, Yiqi Zhao and Matloob Khushi

Predictions of stock and foreign exchange (Forex) have always been a hot and profitable area of study. Deep learning applications have been proven to yield better accuracy and return in the field of financial prediction and forecasting. In this survey, w...

Journal: Applied System Innovation

A Hybrid MultiLayer Perceptron Under-Sampling with Bagging Dealing with a Real-Life Imbalanced Rice Dataset

Moussa Diallo, Shengwu Xiong, Eshete Derb Emiru, Awet Fesseha, Aminu Onimisi Abdulsalami and Mohamed Abd Elaziz

Classification algorithms have shown exceptional prediction results in the supervised learning area. These classification algorithms are not always efficient when it comes to real-life datasets due to class distributions. As a result, datasets for real-l...

Journal: Information
