Three-Stage Deep Learning Framework for Video Surveillance

Ji-Woon Lee and Hyun-Soo Kang

Abstract

The growing use of security cameras has produced a surge in footage that requires analysis, and manual monitoring is both inefficient and error-prone. This study addresses anomaly detection in CCTV security footage, focusing on challenges that earlier approaches encountered with complex or dynamic backgrounds and long sequences. We introduce a three-stage deep learning architecture for detecting abnormalities in security camera videos. The first stage uses a pre-trained convolutional neural network to extract features from individual video frames. The second stage converts these features into time series data and applies a combination of bidirectional long short-term memory (BiLSTM) and multi-head attention to analyze short-term frame relationships. The final stage uses relative positional embeddings and a custom Transformer encoder to capture long-range frame relationships and identify anomalies. Tested on several open datasets, particularly those with complex backgrounds and extended sequences, our method demonstrates improved accuracy and efficiency in video analysis. The approach improves on current security camera analysis and shows potential for other application settings, marking an advance in security camera monitoring and analysis technologies.
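As a rough illustration of the data flow between the stages, the sketch below groups per-frame feature vectors into fixed-length clips (the kind of time series a second stage could consume) and builds the clipped relative-offset index table that relative positional embeddings commonly use to look up a learned vector for each pair of positions. The abstract does not specify these details, so the function names `frames_to_clips` and `relative_position_index`, the non-overlapping clip scheme, the `clip_len` and `max_distance` parameters, and the offset-clipping scheme are all assumptions for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def frames_to_clips(
    features: Sequence[Sequence[float]], clip_len: int
) -> List[List[Sequence[float]]]:
    """Group per-frame feature vectors into non-overlapping clips.

    Assumption: trailing frames that do not fill a whole clip are dropped.
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    n_clips = len(features) // clip_len
    return [list(features[i * clip_len:(i + 1) * clip_len]) for i in range(n_clips)]


def relative_position_index(seq_len: int, max_distance: int) -> List[List[int]]:
    """Build a seq_len x seq_len table of relative-position indices.

    Entry [i][j] is the offset (j - i), clipped to [-max_distance, max_distance]
    and shifted to be non-negative, so it can index an embedding table with
    2 * max_distance + 1 rows.
    """
    return [
        [max(-max_distance, min(max_distance, j - i)) + max_distance
         for j in range(seq_len)]
        for i in range(seq_len)
    ]


# Hypothetical input: 10 frames, each with a 2-dimensional feature vector
# standing in for the output of a pre-trained CNN.
frame_features = [[float(t), 0.5 * t] for t in range(10)]

clips = frames_to_clips(frame_features, clip_len=4)   # 2 clips of 4 frames each
rel_index = relative_position_index(seq_len=4, max_distance=2)
```

Under these assumptions, each clip would be passed to a short-range sequence model, while `rel_index` tells an attention layer which relative-position embedding to use for every pair of positions in a longer sequence.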

Keywords

deep learning - Transformers - video surveillance - anomaly detection - RNN

ISSUE

Volume: 14 Part: 1 (2024)

SUBJECTS

ENGINEERING AND CIVIL CONSTRUCTION
TECHNOLOGY
