Cancers, Vol. 16, Issue 7 (2024)
ARTICLE
TITLE

Identification of Interpretable Clusters and Associated Signatures in Breast Cancer Single-Cell Data: A Topic Modeling Approach

Gabriele Malagoli
Filippo Valle
Emmanuel Barillot
Michele Caselle
Loredana Martignetti

Abstract

Topic modeling, widely used in natural language processing, categorizes text documents into themes based on word frequency analysis. It has been applied successfully to several kinds of biological data, including the accurate prediction of cancer subtypes and the simultaneous identification of genes, enhancers, and cell types from sparse single-cell data. Our study introduces a novel topic modeling approach for clustering single cells and detecting gene signatures in multi-omics single-cell datasets. Applied to the study of transcriptional heterogeneity in breast cancer cells resistant to chemotherapy and targeted therapy, the approach identifies protein-coding genes and long non-coding RNAs (lncRNAs) that group cells into biologically similar clusters, effectively distinguishing drug-sensitive from drug-resistant cancer types. Previous studies have interrogated lncRNA expression in single-cell data within breast cancer subtypes, yet the combined analysis of lncRNA and mRNA expression in a cell type-specific manner remains to be explored. Compared to standard clustering methods, our approach simultaneously partitions genes into topics and cells into clusters, yielding easily interpretable results. Integrating mRNA and lncRNA data enhances cell classification accuracy.
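
To make the analogy concrete (cells as documents, genes as words, expression counts as word frequencies), the following is a minimal sketch rather than the authors' method. It assumes plain latent Dirichlet allocation fitted by collapsed Gibbs sampling as a generic stand-in topic model; the gene names, the toy counts, and the function `fit_lda_gibbs` are all hypothetical. mRNA and lncRNA features are integrated here simply by placing them in the same count matrix.

```python
import random

def fit_lda_gibbs(counts, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a cells x genes count matrix.

    Hypothetical helper for illustration only.
    Returns (theta, phi): per-cell topic mixtures and per-topic gene weights.
    """
    rng = random.Random(seed)
    n_cells, n_genes = len(counts), len(counts[0])
    # One token per observed transcript: (cell index, gene index).
    tokens = [(c, g) for c in range(n_cells) for g in range(n_genes)
              for _ in range(counts[c][g])]
    z = [rng.randrange(n_topics) for _ in tokens]  # random initial topics
    cell_topic = [[0] * n_topics for _ in range(n_cells)]
    topic_gene = [[0] * n_genes for _ in range(n_topics)]
    topic_total = [0] * n_topics
    for (c, g), k in zip(tokens, z):
        cell_topic[c][k] += 1
        topic_gene[k][g] += 1
        topic_total[k] += 1
    for _ in range(n_iter):
        for i, (c, g) in enumerate(tokens):
            k = z[i]
            # Remove the token's current assignment before resampling it.
            cell_topic[c][k] -= 1
            topic_gene[k][g] -= 1
            topic_total[k] -= 1
            weights = [
                (cell_topic[c][t] + alpha)
                * (topic_gene[t][g] + beta)
                / (topic_total[t] + n_genes * beta)
                for t in range(n_topics)
            ]
            k = rng.choices(range(n_topics), weights=weights)[0]
            z[i] = k
            cell_topic[c][k] += 1
            topic_gene[k][g] += 1
            topic_total[k] += 1
    theta = [[(row[t] + alpha) / (sum(row) + n_topics * alpha)
              for t in range(n_topics)] for row in cell_topic]
    phi = [[(topic_gene[t][g] + beta) / (topic_total[t] + n_genes * beta)
            for g in range(n_genes)] for t in range(n_topics)]
    return theta, phi

# Made-up feature names: mRNA and lncRNA columns share one matrix.
genes = ["mRNA_A", "mRNA_B", "mRNA_C", "lncRNA_X", "lncRNA_Y", "lncRNA_Z"]
# Made-up counts: rows are cells, columns follow `genes`.
counts = [
    [9, 7, 0, 5, 0, 0],
    [8, 6, 1, 4, 0, 0],
    [7, 9, 0, 6, 1, 0],
    [0, 1, 8, 0, 6, 7],
    [0, 0, 9, 0, 5, 8],
    [1, 0, 7, 0, 7, 6],
]

theta, phi = fit_lda_gibbs(counts, n_topics=2)
# Assign each cell to its dominant topic to obtain a hard clustering.
clusters = [max(range(len(row)), key=row.__getitem__) for row in theta]
# The top-weighted genes of each topic act as that topic's signature.
signatures = [sorted(genes, key=lambda name: -phi[t][genes.index(name)])[:3]
              for t in range(len(phi))]
print(clusters)
print(signatures)
```

Reading `theta` row by row gives each cell's mixture over topics, while each row of `phi` ranks genes within a topic, which is what makes the output of a topic model straightforward to interpret as cell clusters plus gene signatures.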
