Algorithms, Vol. 16, Issue 7 (2023)

Audio Anti-Spoofing Based on Audio Feature Fusion

Jiachen Zhang, Guoqing Tu, Shubo Liu and Zhaohui Cai

Abstract

The rapid development of speech synthesis technology has significantly improved the naturalness and human-likeness of synthetic speech. As the technical barriers to speech synthesis fall, illegal activities such as fraud and extortion are increasing, posing a significant threat to authentication systems such as automatic speaker verification. To keep pace with constantly evolving synthesis techniques and to improve the accuracy of synthetic speech detection, this paper proposes an end-to-end synthetic speech detection model based on audio feature fusion. The model uses a pre-trained wav2vec2 model to extract features from raw waveforms and an audio feature fusion module for back-end classification. The fusion module aims to improve accuracy by making full use of the features extracted by the front end and by fusing information across time frames and feature dimensions. Data augmentation techniques are also used to improve the model's generalization. The model is trained on the training and development sets of the logical access (LA) dataset of the ASVspoof 2019 Challenge, an international standard, and is tested on the logical access (LA) and deepfake (DF) evaluation datasets of the ASVspoof 2021 Challenge. The equal error rates (EERs) on ASVspoof 2021 LA and ASVspoof 2021 DF are 1.18% and 2.62%, respectively, achieving the best results on the DF dataset.
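For readers unfamiliar with the metric, the EER is the operating point at which the false acceptance rate (spoofed speech accepted as bona fide) equals the false rejection rate (bona fide speech rejected). The following is a minimal sketch of how it can be computed from detection scores, not the authors' evaluation code or the official ASVspoof scoring script. The function name `compute_eer`, the convention that higher scores mean "more likely bona fide", and the toy scores are all assumptions made for illustration.

```python
def compute_eer(bonafide_scores, spoof_scores):
    """Approximate the equal error rate from two lists of detection scores.

    Assumption: a higher score means the trial is more likely bona fide.
    Returns (eer, threshold) at the candidate threshold where the false
    acceptance and false rejection rates are closest to each other.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection: a bona fide trial scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: a spoofed trial scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            # Report the midpoint of the two rates at the closest crossing.
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores, invented purely for illustration.
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.6, 0.3, 0.2, 0.1]
eer, threshold = compute_eer(bonafide, spoof)
print(f"EER = {eer:.2%} at threshold {threshold:.2f}")
# prints "EER = 25.00% at threshold 0.60"
```

On real evaluation sets the two error curves rarely cross exactly at an observed score, so production scoring tools typically interpolate between thresholds; the sketch simply takes the nearest crossing.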
