Algorithms, Vol. 15, No. 7 (2022)
ARTICLE

Inference Acceleration with Adaptive Distributed DNN Partition over Dynamic Video Stream

Jin Cao, Bo Li, Mengni Fan and Huiyu Liu

Abstract

Deep neural network (DNN)-based computer vision applications have proliferated and are widely used in intelligent services on IoT devices. Because DNNs are computationally intensive, deploying and executing intelligent applications in smart scenarios is constrained by limited device resources. Existing job-scheduling strategies are single-focused and offer limited support for large-scale end-device scenarios. In this paper, we present ADDP, an adaptive distributed DNN partition method that supports video analysis on large-scale smart cameras. ADDP applies to the DNN models commonly used in computer vision and contains a feature-map layer partition (FLP) module supporting edge-to-end collaborative model partition and a feature-map size partition (FSP) module supporting multidevice parallel inference. Driven by an inference-delay minimization objective, FLP and FSP trade off the compute and communication resources of different devices. We validate ADDP on heterogeneous devices and show that both the FLP and FSP modules outperform existing approaches, reducing single-frame response latency by 10–25% compared with pure on-device processing.
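The layer-wise partition idea behind FLP can be sketched as a simple search over split points: layers before the split run on the device, the intermediate feature map is uploaded, and the remaining layers run on the edge. The sketch below is illustrative only (not the paper's implementation); the per-layer timing and feature-map-size profiles and the `best_partition` helper are hypothetical inputs one would obtain by profiling a concrete model and link.

```python
# Illustrative sketch of FLP-style split-point selection, NOT the
# paper's actual algorithm. All profile numbers are hypothetical.

def best_partition(device_ms, edge_ms, feat_kb, bw_kbps):
    """Pick split index k: layers [0, k) run on-device, [k, n) on the edge.

    device_ms[i] / edge_ms[i] -- per-layer compute time on each side (ms)
    feat_kb[k] -- size (KB) of the tensor transferred when splitting at k
                  (feat_kb[0] is the raw input, feat_kb[n] the final output)
    bw_kbps   -- uplink bandwidth in KB/s
    """
    n = len(device_ms)
    best_k, best_t = 0, float("inf")
    for k in range(n + 1):
        t = (sum(device_ms[:k])               # on-device layers
             + feat_kb[k] / bw_kbps * 1000.0  # feature-map upload (ms)
             + sum(edge_ms[k:]))              # edge layers
        if t < best_t:
            best_k, best_t = k, t
    return best_k, best_t

# Hypothetical 3-layer profile: splitting after layer 1 wins because
# its feature map is much smaller than the raw input.
k, t = best_partition(device_ms=[30, 40, 50],
                      edge_ms=[5, 6, 7],
                      feat_kb=[500, 200, 100, 4],
                      bw_kbps=10000)
print(k, t)  # -> 1 63.0
```

The FSP module would extend the same latency model spatially, e.g. slicing a feature map's rows across devices in proportion to their compute capacity so that parallel partial inferences finish at roughly the same time.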
