Journal: AI, Vol. 4, Part 4 (2023)

Deep Learning Performance Characterization on GPUs for Various Quantization Frameworks

Muhammad Ali Shafique, Arslan Munir and Joonho Kong

Abstract

Deep learning is employed in many applications, such as computer vision, natural language processing, robotics, and recommender systems. Large and complex neural networks achieve high accuracy; however, they adversely affect many aspects of deep learning performance, such as training time, latency, throughput, energy consumption, and memory usage in the training and inference stages. To address these challenges, various optimization techniques and frameworks have been developed to make deep learning models perform efficiently in both stages. Although optimization techniques such as quantization have been studied thoroughly, less work has examined the performance of the frameworks that provide them. In this paper, we use several performance metrics to study various quantization frameworks, including TensorFlow automatic mixed precision (AMP) and TensorRT. These metrics include training time and memory utilization in the training stage, along with latency and throughput on graphics processing units (GPUs) in the inference stage. We apply AMP during the training stage using the TensorFlow framework, while for inference we apply post-training quantization with the TensorRT framework through the TensorFlow TensorRT (TF-TRT) application programming interface (API). We profile different deep learning models, datasets, image sizes, and batch sizes for both the training and inference stages; the results can help developers and researchers devise and deploy efficient deep learning models for GPUs.
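
The inference metrics named in the abstract, latency and throughput across batch sizes, can be illustrated with a minimal timing sketch. This is not the authors' profiling code: the function `profile_inference`, the stand-in `dummy_model`, and the warm-up and run counts are all assumptions made for illustration. A real measurement would call a TF-TRT-converted model on image tensors and would need to wait for the GPU to finish before stopping the timer.

```python
import statistics
import time
from typing import Callable, Dict, List, Sequence


def profile_inference(
    infer: Callable[[List[float]], object],
    batch_sizes: Sequence[int],
    warmup: int = 3,
    runs: int = 20,
) -> Dict[int, Dict[str, float]]:
    """Return mean latency (ms per batch) and throughput (samples/s) per batch size."""
    results: Dict[int, Dict[str, float]] = {}
    for batch_size in batch_sizes:
        # Placeholder input (assumption); a real run would use image tensors.
        batch = [0.0] * batch_size

        # Warm-up calls are excluded from timing so one-off setup costs
        # do not distort the averages.
        for _ in range(warmup):
            infer(batch)

        timings = []
        for _ in range(runs):
            start = time.perf_counter()
            infer(batch)
            timings.append(time.perf_counter() - start)

        mean_s = statistics.mean(timings)
        results[batch_size] = {
            "latency_ms": mean_s * 1e3,
            # Guard against a zero reading on coarse timers.
            "throughput": batch_size / mean_s if mean_s > 0 else float("inf"),
        }
    return results


def dummy_model(batch: List[float]) -> List[float]:
    # Hypothetical stand-in for a deployed model's inference call.
    return [x * 2.0 + 1.0 for x in batch]


if __name__ == "__main__":
    for size, metrics in profile_inference(dummy_model, [1, 8, 32]).items():
        print(f"batch={size:3d}  latency={metrics['latency_ms']:.4f} ms  "
              f"throughput={metrics['throughput']:.0f} samples/s")
```

Latency here is the mean wall-clock time per batch, and throughput is the batch size divided by that time, so larger batches usually raise both numbers.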
