Inicio  /  Applied Sciences  /  Vol: 8 Núm: 8 Par: August (2018)  /  Artículo
ARTÍCULO
TITULO

Deep Learning for Audio Event Detection and Tagging on Low-Resource Datasets

Veronica Morfi and Dan Stowell    

Resumen

In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for training. Secondly, deep neural networks need a very large amount of labelled training data to achieve good quality performance, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve the training performance when dealing with this kind of low-resource datasets. We evaluate three data-efficient approaches of training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that different methods of training have different advantages and disadvantages.

 Artículos similares

       
 
Wandile Nhlapho, Marcellin Atemkeng, Yusuf Brima and Jean-Claude Ndogmo    
The advent of deep learning (DL) has revolutionized medical imaging, offering unprecedented avenues for accurate disease classification and diagnosis. DL models have shown remarkable promise for classifying brain tumors from Magnetic Resonance Imaging (M... ver más
Revista: Information

 
Maryan Rizinski, Andrej Jankov, Vignesh Sankaradas, Eugene Pinsky, Igor Mishkovski and Dimitar Trajanov    
The task of company classification is traditionally performed using established standards, such as the Global Industry Classification Standard (GICS). However, these approaches heavily rely on laborious manual efforts by domain experts, resulting in slow... ver más
Revista: Information

 
Mondher Bouazizi, Chuheng Zheng, Siyuan Yang and Tomoaki Ohtsuki    
A growing focus among scientists has been on researching the techniques of automatic detection of dementia that can be applied to the speech samples of individuals with dementia. Leveraging the rapid advancements in Deep Learning (DL) and Natural Languag... ver más
Revista: Information

 
Xie Lian, Xiaolong Hu, Liangsheng Shi, Jinhua Shao, Jiang Bian and Yuanlai Cui    
The parameters of the GR4J-CemaNeige coupling model (GR4neige) are typically treated as constants. However, the maximum capacity of the production store (parX1) exhibits time-varying characteristics due to climate variability and vegetation coverage chan... ver más
Revista: Water

 
Yongen Lin, Dagang Wang, Tao Jiang and Aiqing Kang    
Reliable streamflow forecasting is a determining factor for water resource planning and flood control. To better understand the strengths and weaknesses of newly proposed methods in streamflow forecasting and facilitate comparisons of different research ... ver más
Revista: Water