Algorithms, Vol. 16, Issue 10 (2023)
ARTICLE
TITLE

Deep Neural Networks Training by Stochastic Quasi-Newton Trust-Region Methods

Mahsa Yousefi and Ángeles Martínez    

Abstract

While first-order methods are popular for solving optimization problems arising in deep learning, they come with some acute deficiencies. To overcome these shortcomings, there has been recent interest in introducing second-order information through quasi-Newton methods that are able to construct Hessian approximations using only gradient information. In this work, we study the performance of stochastic quasi-Newton algorithms for training deep neural networks. We consider two well-known quasi-Newton updates, the limited-memory Broyden–Fletcher–Goldfarb–Shanno (BFGS) and the symmetric rank one (SR1). This study fills a gap concerning the real performance of both updates in the minibatch setting and analyzes whether more efficient training can be obtained when using the more robust BFGS update or the cheaper SR1 formula, which, by allowing for indefinite Hessian approximations, can potentially help to better navigate the pathological saddle points present in the non-convex loss functions found in deep learning. We present and discuss the results of an extensive experimental study that includes many aspects affecting performance, such as batch normalization, the network architecture, the limited memory parameter, and the batch size. Our results show that stochastic quasi-Newton algorithms are efficient and, in some instances, able to outperform the well-known first-order Adam optimizer, run with the optimal combination of its numerous hyperparameters, and the stochastic second-order trust-region STORM algorithm.
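To make the contrast between the two updates concrete, the following is a minimal sketch of the textbook dense BFGS and SR1 formulas in plain Python. It is not the authors' implementation: the paper works with limited-memory variants inside a stochastic trust-region framework, while this sketch stores the full matrix B. The function names, the skip thresholds `r` and `eps`, and the toy vectors are all assumptions chosen for illustration. Here `s` is the step between two iterates and `y` is the corresponding change in the gradient.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(B, v):
    return [dot(row, v) for row in B]


def sr1_update(B, s, y, r=1e-8):
    """Dense SR1: B + w w^T / (w^T s), with w = y - B s.

    The update is skipped when |w^T s| is tiny relative to ||s|| ||w||
    (a common safeguard; the threshold r is an assumed value).
    """
    w = [yi - bi for yi, bi in zip(y, matvec(B, s))]
    denom = dot(w, s)
    if abs(denom) <= r * dot(s, s) ** 0.5 * dot(w, w) ** 0.5:
        return [row[:] for row in B]
    n = len(s)
    return [[B[i][j] + w[i] * w[j] / denom for j in range(n)] for i in range(n)]


def bfgs_update(B, s, y, eps=1e-10):
    """Dense BFGS: B - (Bs)(Bs)^T / (s^T B s) + y y^T / (y^T s).

    Assumes B is positive definite. The update is skipped unless the
    curvature condition y^T s > eps holds (eps is an assumed value),
    which is what keeps B positive definite.
    """
    sy = dot(s, y)
    if sy <= eps:
        return [row[:] for row in B]
    Bs = matvec(B, s)
    sBs = dot(s, Bs)
    n = len(s)
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / sy
             for j in range(n)] for i in range(n)]


identity = [[1.0, 0.0], [0.0, 1.0]]

# Toy pair with negative curvature along s (y^T s < 0), as near a saddle point.
s_neg, y_neg = [1.0, 0.0], [-2.0, 0.0]
B_sr1 = sr1_update(identity, s_neg, y_neg)    # captures the negative curvature
B_bfgs = bfgs_update(identity, s_neg, y_neg)  # skipped: B stays the identity

# Toy pair with positive curvature (y^T s > 0): BFGS applies its update.
s_pos, y_pos = [1.0, 0.0], [2.0, 1.0]
B_bfgs_pos = bfgs_update(identity, s_pos, y_pos)
```

In the negative-curvature case the SR1 matrix satisfies the secant equation B s = y and becomes indefinite, whereas the safeguarded BFGS update discards the pair and keeps a positive definite matrix. This is the trade-off the abstract describes between the more robust BFGS update and the SR1 formula that can represent saddle-point geometry.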
