ARTICLE
TITLE

Application of sinusoidal speech modeling to the sound diarization problem

Bulat Nutfullin    
Eugene Ilyushin    

Abstract

Speech is a distinctive feature of humans and an evolutionary advantage over other species. Sound diarization is the process of segmenting an audio recording according to which speaker is talking. Before the advent of deep learning and the availability of the necessary computing resources, the quality of algorithms that identify a speaker by voice left much to be desired. Diarization has numerous applications, including smart speakers, mobile phones, and automatic speech translation systems. However, existing diarization algorithms have drawbacks, such as difficulty handling simultaneous speech from several speakers and results that are not good enough for automatic use in some areas. This explains the relevance of research in this area. The sinusoidal model is an algorithm for tracking sequences of points in time-amplitude-frequency space. In existing research, it has been applied to modeling echolocation and human speech and to speech synthesis. At the time of the study, no applications of the sinusoidal model to the diarization problem were found in the literature. The paper considers the diarization problem and the main quality metrics used to assess solutions to it. It reviews the main intermediate representations of sound used in existing solutions and proposes a diarization algorithm based on sinusoidal speech modeling. The advantage of the proposed algorithm is that the sinusoidal representation can serve as a voice activity detector (VAD), which made the overall diarization algorithm more efficient.
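To make the idea of tracking points in time-amplitude-frequency space more concrete, the sketch below picks spectral peaks in each short frame and greedily links them across frames by nearest frequency. It is only an illustration under stated assumptions, not the authors' algorithm: the function names (`frame_peaks`, `track_peaks`, `voiced_times`), the frame and hop sizes, and the thresholds are all hypothetical choices, and the final helper shows one crude way in which the presence of tracks could act as a VAD-like cue.

```python
import numpy as np

def frame_peaks(frame, sr, n_peaks=5, min_amp=1e-3):
    """Return (freq_hz, amplitude) pairs for the strongest spectral peaks of one frame."""
    window = np.hanning(len(frame))
    spectrum = np.abs(np.fft.rfft(frame * window))
    # Scale so that a unit-amplitude sinusoid gives a peak of roughly 1.
    spectrum = spectrum * 2.0 / window.sum()
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    # Local maxima: bins larger than both neighbours and above the amplitude floor.
    mid = spectrum[1:-1]
    is_peak = (mid > spectrum[:-2]) & (mid > spectrum[2:]) & (mid > min_amp)
    idx = np.nonzero(is_peak)[0] + 1
    # Keep the n_peaks strongest, strongest first.
    idx = idx[np.argsort(spectrum[idx])[::-1][:n_peaks]]
    return [(float(freqs[i]), float(spectrum[i])) for i in idx]

def track_peaks(signal, sr, frame_len=1024, hop=256, max_jump_hz=50.0):
    """Greedily link peaks of consecutive frames into tracks.

    Each track is a list of (time_s, freq_hz, amplitude) points.
    """
    finished, active = [], []
    for start in range(0, len(signal) - frame_len + 1, hop):
        t = (start + frame_len / 2) / sr  # time of the frame centre
        peaks = frame_peaks(signal[start:start + frame_len], sr)
        next_active = []
        for track in active:
            last_freq = track[-1][1]
            # The nearest unclaimed peak continues the track if it is close enough.
            best = min(peaks, key=lambda p: abs(p[0] - last_freq), default=None)
            if best is not None and abs(best[0] - last_freq) <= max_jump_hz:
                track.append((t, best[0], best[1]))
                peaks.remove(best)
                next_active.append(track)
            else:
                finished.append(track)  # no continuation: the track ends here
        # Peaks that no track claimed start new tracks.
        next_active.extend([[(t, f, a)] for f, a in peaks])
        active = next_active
    return finished + active

def voiced_times(tracks, min_len=3):
    """Frame times covered by at least one track with min_len or more points."""
    return sorted({t for tr in tracks if len(tr) >= min_len for t, _, _ in tr})

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    # Half a second of a 440 Hz tone followed by half a second of silence.
    sig = np.where(t < 0.5, 0.5 * np.sin(2 * np.pi * 440 * t), 0.0)
    tracks = track_peaks(sig, sr)
    longest = max(tracks, key=len)
    print(len(longest), np.median([f for _, f, _ in longest]))
    print(voiced_times(tracks)[-1])
```

On this synthetic input the longest track should sit near 440 Hz (to within the frequency resolution of the frame), and no track should extend into the silent half. Real speech would need more care, for example frequency interpolation around each peak and a less greedy matching rule.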

Similar articles

       
 
Sebastian Schlaweck, Claus Juergen Bauer, Friederike Schmitz, Peter Brossart, Tobias A. W. Holderried and Valentin Sebastian Schäfer    
Sinusoidal obstruction syndrome (SOS) is a rare complication after allogeneic hematopoietic stem cell transplantation (alloHSCT) caused by endothelial dysfunction. Previous definitions and diagnostic criteria for the presence of SOS include bilirubinemia...
Journal: Applied Sciences

 
Vincenzo Stornelli, Gianluca Barile, Leonardo Pantoli, Massimo Scarsella, Giuseppe Ferri, Francesco Centurelli, Pasquale Tommasino and Alessandro Trifiletti    
The aim of this paper is to prove that, through a canonic approach, sinusoidal oscillators based on second-generation voltage conveyor (VCII) can be implemented. The investigation demonstrates the feasibility of the design results in a pair of new canoni...

 
Bulat Nutfullin, Eugene Ilyushin     pp. 14-20
Speech is a distinctive feature of humans and an evolutionary advantage over other species. Sound diarization is the process of segmenting an audio recording according to which speaker is talking. Before the advent of deep learning and the ...

 
Timothy Sands    
The major premise of deterministic artificial intelligence (D.A.I.) is to assert deterministic self-awareness statements based in either the physics of the underlying problem or system identification to establish governing differential equations. The key...

 
Qiuying Chen and Hongwei Mo    
Autonomous navigation in unknown environments is still a challenge for robotics. Many efforts have been exerted to develop truly autonomous goal-oriented robot navigation models based on the neural mechanism of spatial cognition and mapping in animals' b...
Journal: Applied Sciences