JOURNAL
Information

ARTICLE

TITLE

Understanding Self-Supervised Learning of Speech Representation via Invariance and Redundancy Reduction

Yusuf Brima, Ulf Krumnack, Simone Pika and Gunther Heidemann

Abstract

Self-supervised learning (SSL) has emerged as a promising paradigm for learning flexible speech representations from unlabeled data. By designing pretext tasks that exploit statistical regularities, SSL models can capture useful representations that are transferable to downstream tasks. Barlow Twins (BTs) is an SSL technique inspired by theories of redundancy reduction in human perception. In downstream tasks, BTs representations accelerate learning and transfer this learning across applications. This study applies BTs to speech data and evaluates the obtained representations on several downstream tasks, showing the applicability of the approach. However, limitations exist in disentangling key explanatory factors, with redundancy reduction and invariance alone being insufficient for factorization of learned latents into modular, compact, and informative codes. Our ablation study isolated gains from invariance constraints, but the gains were context-dependent. Overall, this work substantiates the potential of Barlow Twins for sample-efficient speech encoding. However, challenges remain in achieving fully hierarchical representations. The analysis methodology and insights presented in this paper pave a path for extensions incorporating further inductive priors and perceptual principles to further enhance the BTs self-supervision framework.
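The abstract names two ingredients of the Barlow Twins objective, invariance and redundancy reduction, without stating the loss itself. The sketch below follows the commonly cited formulation of the original Barlow Twins method: standardize the embeddings of two augmented views, compute their cross-correlation matrix, push its diagonal toward 1 (invariance) and its off-diagonal entries toward 0 (redundancy reduction). This page does not confirm that the article uses exactly this form. The function name `barlow_twins_loss`, the weight `lam`, and its default value are illustrative assumptions.

```python
import numpy as np


def barlow_twins_loss(z_a, z_b, lam=5e-3, eps=1e-12):
    """Barlow Twins style loss for two embedding batches of shape (N, D).

    Returns (total, invariance_term, redundancy_term).
    `lam` is an assumed trade-off weight, not a value taken from the article.
    """
    n, d = z_a.shape
    # Standardize every embedding dimension across the batch.
    z_a = (z_a - z_a.mean(axis=0)) / (z_a.std(axis=0) + eps)
    z_b = (z_b - z_b.mean(axis=0)) / (z_b.std(axis=0) + eps)
    # Cross-correlation matrix between the two views, shape (D, D).
    c = z_a.T @ z_b / n
    # Invariance: matching dimensions of both views should correlate perfectly.
    invariance = np.sum((1.0 - np.diagonal(c)) ** 2)
    # Redundancy reduction: different dimensions should be decorrelated.
    off_diagonal = c[~np.eye(d, dtype=bool)]
    redundancy = np.sum(off_diagonal ** 2)
    return invariance + lam * redundancy, invariance, redundancy


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    view = rng.normal(size=(256, 8))
    # Two identical views: the invariance term is close to zero.
    print(barlow_twins_loss(view, view))
    # Two unrelated views: the invariance term dominates the loss.
    print(barlow_twins_loss(view, rng.normal(size=(256, 8))))
```

In practice the two inputs would be encoder outputs for two augmentations of the same speech segment; the random arrays here only exercise the arithmetic.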

Keywords

acoustic analysis - Barlow Twins - self-supervised learning - invariance - redundancy reduction - speech representation learning

ISSUE

Volume: 15 Issue: 2 (2024)

SUBJECTS

CIVIL ENGINEERING AND CONSTRUCTION
TECHNOLOGY

SIMILAR JOURNALS

Applied Sciences
Algorithms
AI

DOI

https://doi.org/10.3390/info15020114

Similar articles

Chinese Cyberbullying Detection Using XLNet and Deep Bi-LSTM Hybrid Model

Shifeng Chen, Jialin Wang and Ketai He

The popularization of the internet and the widespread use of smartphones have led to a rapid growth in the number of social media users. While information technology has brought convenience to people, it has also given rise to cyberbullying, which has a ...

Journal: Information

There Are Infinite Ways to Formulate Code: How to Mitigate the Resulting Problems for Better Software Vulnerability Detection

Jinghua Groppe, Sven Groppe, Daniel Senf and Ralf Möller

Given a set of software programs, each being labeled either as vulnerable or benign, deep learning technology can be used to automatically build a software vulnerability detector. A challenge in this context is that there are countless equivalent ways to...

Journal: Information

Audio-Based Emotion Recognition Using Self-Supervised Learning on an Engineered Feature Space

Peranut Nimitsurachat and Peter Washington

Emotion recognition models using audio input data can enable the development of interactive systems with applications in mental healthcare, marketing, gaming, and social media analysis. While the field of affective computing using audio data is rich, a m...

Journal: AI

Multi-Task Mean Teacher Medical Image Segmentation Based on Swin Transformer

Jie Zhang, Fan Li, Xin Zhang, Yue Cheng and Xinhong Hei

As a crucial task for disease diagnosis, existing semi-supervised segmentation approaches process labeled and unlabeled data separately, ignoring the relationships between them, thereby limiting further performance improvements. In this work, we introduc...

Journal: Applied Sciences

Detecting Moral Features in TV Series with a Transformer Architecture through Dictionary-Based Word Embedding

Paolo Fantozzi, Valentina Rotondi, Matteo Rizzolli, Paola Dalla Torre and Maurizio Naldi

Moral features are essential components of TV series, helping the audience to engage with the story, exploring themes beyond sheer entertainment, reflecting current social issues, and leaving a long-lasting impact on the viewers. Their presence shows thr...

Journal: Information
