Information, Vol. 11, Issue 4 (2020)
ARTICLE
TITLE

Investigation of Spoken-Language Detection and Classification in Broadcasted Audio Content

Rigas Kotsakis, Maria Matsiola, George Kalliris and Charalampos Dimoulas

Abstract

This paper investigates spoken-language classification in broadcast audio content. The approach reflects a real-world scenario encountered in modern media and monitoring organizations, where semi-automated indexing and documentation are deployed and could be facilitated by the proposed language-detection preprocessing. Multilingual audio recordings of specific radio streams are assembled into a small dataset, which is used for the adaptive classification experiments without seeking, at this step, a generic language recognition model. Specifically, hierarchical discrimination schemes are followed to separate voice signals before classifying the spoken languages. Supervised and unsupervised machine learning is applied at various windowing configurations to test the validity of our hypothesis. Besides the analysis of the achieved recognition scores (partial and overall), late integration models are proposed for the semi-automatic annotation of new audio recordings. Hence, data augmentation mechanisms are offered, aiming at gradually formulating a Generic Audio Language Classification Repository. This database constitutes a program-adaptive collection that, besides the self-indexing metadata mechanisms, could facilitate generic language classification models in the future through state-of-the-art techniques such as deep learning. This approach matches the investigatory inception of the project, which seeks indicators that could be applied in a second step with a larger dataset and/or an already pre-trained model, with the purpose of delivering overall results.
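As a rough illustration of what late integration over window-level decisions can look like, the sketch below fuses per-window language labels into a single recording-level label by majority vote. The abstract does not state which fusion rule the authors use, so majority voting is an assumption here, and the function name `late_integration` and the labels `lang_a` and `lang_b` are hypothetical placeholders.

```python
from collections import Counter
from typing import Sequence, Tuple


def late_integration(window_labels: Sequence[str]) -> Tuple[str, float]:
    """Fuse per-window language decisions into one recording-level label.

    Assumption: simple majority voting. Returns the winning label and the
    fraction of windows that voted for it.
    """
    if not window_labels:
        raise ValueError("at least one window-level decision is required")
    counts = Counter(window_labels)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(window_labels)


# Hypothetical window-level outputs for one recording (placeholder labels).
decisions = ["lang_a", "lang_a", "lang_b", "lang_a", "lang_b"]
label, share = late_integration(decisions)
print(label, share)
```

The vote share could serve as a crude confidence value: in a semi-automatic annotation workflow, recordings with a low share might be routed to a human annotator instead of being labeled automatically.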
