Inicio  /  Algorithms  /  Vol: 17 Par: 1 (2024)  /  Artículo
ARTÍCULO
TITULO

GPU Algorithms for Structured Sparse Matrix Multiplication with Diagonal Storage Schemes

Sardar Anisul Haque    
Mohammad Tanvir Parvez and Shahadat Hossain    

Resumen

Matrix?matrix multiplication is of singular importance in linear algebra operations with a multitude of applications in scientific and engineering computing. Data structures for storing matrix elements are designed to minimize overhead information as well as to optimize the operation count. In this study, we utilize the notion of the compact diagonal storage method (CDM), which builds upon the previously developed diagonal storage?an orientation-independent uniform scheme to store the nonzero elements of a range of matrices. This study exploits both these storage schemes and presents efficient GPU-accelerated parallel implementations of matrix multiplication when the input matrices are banded and/or structured sparse. We exploit the data layouts in the diagonal storage schemes to expose a substantial amount of fine-grained parallelism and effectively utilize the GPU shared memory to improve the locality of data access for numerical calculations. Results from an extensive set of numerical experiments with the aforementioned types of matrices demonstrate orders-of-magnitude speedups compared with the sequential performance.

 Artículos similares

       
 
Xiaoyu Han, Chenyu Li, Zifan Wang and Guohua Liu    
Neural architecture search (NAS) has shown great potential in discovering powerful and flexible network models, becoming an important branch of automatic machine learning (AutoML). Although search methods based on reinforcement learning and evolutionary ... ver más
Revista: Algorithms

 
Jia Hou, Jingyu Zhang, Qi Chen, Siwei Xiang, Yishuo Meng, Jianfei Wang, Cimang Lu and Chen Yang    
Artificial intelligence is changing and influencing our world. As one of the main algorithms in the field of artificial intelligence, convolutional neural networks (CNNs) have developed rapidly in recent years. Especially after the emergence of NASNet, C... ver más
Revista: Information

 
Pedro Torres, Hugo Marques and Paulo Marques    
This paper describes a real case implementation of an automatic pedestrian-detection solution, implemented in the city of Aveiro, Portugal, using affordable LiDAR technology and open, publicly available, pedestrian-detection frameworks based on machine-l... ver más
Revista: Computers

 
Rohit Chowdhury, Atharva Navsalkar and Deepak Subramani    
The importance of autonomous marine vehicles is increasing in a wide range of ocean science and engineering applications. Multi-objective optimization, where trade-offs between multiple conflicting objectives are achieved (such as minimizing expected mis... ver más

 
Chun-Liang Lee, Guan-Yu Lin and Yaw-Chung Chen    
Advanced network services, such as firewalls, policy-based routing, and virtual private networks, must rely on routers to classify packets into different flows based on packet headers and predefined filter tables. When multiple filters are overlapped, co... ver más
Revista: Algorithms