Efficient GEMM Implementation for Vision-Based Object Detection in Autonomous Driving Applications

Fatima Zahra Guerrouj

Sergio Rodríguez Flórez

Mohamed Abouzahir

Abdelhafid El Ouardi and Mustapha Ramzi

Abstract

Convolutional Neural Networks (CNNs) have proven highly effective for object detection tasks. YOLOv4 is a state-of-the-art object detection algorithm designed for embedded systems. It builds on YOLOv3 and improves on its accuracy, speed, and robustness. However, deploying CNNs on embedded systems such as Field Programmable Gate Arrays (FPGAs) is difficult because of their limited resources. To address this issue, FPGA-based CNN architectures have been developed to improve the resource utilization of CNNs, resulting in improved accuracy and speed. This paper examines the use of General Matrix Multiplication (GEMM) operations to accelerate the execution of YOLOv4 on embedded systems. It reviews the most recent GEMM implementations and evaluates their accuracy and robustness. It also discusses the challenges of deploying YOLOv4 on autonomous vehicle datasets. Finally, the paper presents a case study demonstrating the successful implementation of YOLOv4 on an Intel Arria 10 embedded system using GEMM.
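To illustrate why GEMM is relevant to CNN inference, the sketch below shows the common "im2col" lowering, in which a convolution layer is rewritten as a single matrix product. It is a generic, pure-Python illustration and not the paper's FPGA implementation; the function names (im2col, gemm, conv2d_gemm) and the restriction to one input channel, stride 1, and no padding are assumptions made for brevity.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2-D image into one row of a matrix."""
    h, w = len(image), len(image[0])
    out_h, out_w = h - kh + 1, w - kw + 1
    rows = []
    for i in range(out_h):
        for j in range(out_w):
            rows.append([image[i + di][j + dj]
                         for di in range(kh) for dj in range(kw)])
    return rows, out_h, out_w


def gemm(a, b):
    """Plain matrix product C = A x B on lists of lists."""
    k, m = len(b), len(b[0])
    return [[sum(row[t] * b[t][c] for t in range(k)) for c in range(m)]
            for row in a]


def conv2d_gemm(image, kernels):
    """Cross-correlate one image with several kernels using a single GEMM.

    Assumes a single input channel, stride 1, and no padding.
    """
    kh, kw = len(kernels[0]), len(kernels[0][0])
    patches, out_h, out_w = im2col(image, kh, kw)
    # Weight matrix of shape (kh*kw) x num_kernels: one flattened kernel per column.
    flat = [[k[di][dj] for di in range(kh) for dj in range(kw)] for k in kernels]
    weights = [list(col) for col in zip(*flat)]
    product = gemm(patches, weights)  # (out_h*out_w) x num_kernels
    # Fold each column of the product back into an out_h x out_w feature map.
    return [[[product[i * out_w + j][f] for j in range(out_w)]
             for i in range(out_h)]
            for f in range(len(kernels))]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernels = [[[1, 0], [0, 1]],
           [[1, 1], [1, 1]]]
feature_maps = conv2d_gemm(image, kernels)
```

Because every output pixel of every filter comes out of the same matrix product, an accelerator only needs one well-optimized GEMM kernel to cover all convolution layers, which is the usual motivation for this lowering.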

Keywords

YOLOv4 - GEMM - FPGA - autonomous driving

Access

PAGES

pp. 0 - 0

ISSUE

Volume: 13 Issue: 2 (2023)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY

DOI

https://doi.org/10.3390/jlpea13020040
