Algorithms, Vol. 16, No. 9 (2023)
ARTICLE
TITLE

Optimized Tensor Decomposition and Principal Component Analysis Outperforming State-of-the-Art Methods When Analyzing Histone Modification Chromatin Immunoprecipitation Profiles

Turki Turki, Sanjiban Sekhar Roy and Y.-H. Taguchi

Abstract

It is difficult to identify histone modification from datasets that contain high-throughput sequencing data. Although multiple methods have been developed to identify histone modification, most of them are not specific to histone modification but are general methods that aim to identify protein binding to the genome. In this study, tensor decomposition (TD)- and principal component analysis (PCA)-based unsupervised feature extraction with optimized standard deviation, which has been successfully applied to gene expression and DNA methylation, was used to identify histone modification. Histone modification along the genome is binned within regions of length L. Considering the principal components (PCs) or singular value vectors (SVVs) that PCA or TD attributes to samples, we can select the PCs or SVVs attributed to regions. The selected PCs and SVVs are then used to attribute p-values to regions, and adjusted p-values are used to select regions. The proposed method identified various histone modifications successfully and outperformed various state-of-the-art methods. This method is expected to serve as a de facto standard method to identify histone modification. For reproducibility and to ensure the systematic analysis of our study is applicable to datasets from different gene expression experiments, we have made our tools publicly available for download from GitHub.
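The PCA variant of the pipeline described in the abstract (bin the signal, decompose, score regions with p-values, adjust, select) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic data, the Gaussian null for region scores, and the 0.01 threshold are all assumptions made for illustration. In particular, the paper's standard deviation optimization is not reproduced here; a robust median-based estimate is swapped in as a stand-in, and the component index is assumed to have been chosen by inspecting the sample-side vectors.

```python
import math
import numpy as np


def bin_signal(signal, L):
    """Sum a per-base signal over consecutive regions of length L.

    Trailing positions that do not fill a whole region are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    n_bins = signal.shape[-1] // L
    trimmed = signal[..., : n_bins * L]
    return trimmed.reshape(*signal.shape[:-1], n_bins, L).sum(axis=-1)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    ranked = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, starting from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(ranked, 1.0)
    return adjusted


def select_regions(X, pc_index=0, alpha=0.01):
    """Select regions from a (regions x samples) matrix of binned signal.

    pc_index is the component judged to be of interest (hypothetical
    choice here; in practice it would be picked from the sample-side
    vectors). Region scores are assumed Gaussian under the null.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each sample
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    u = U[:, pc_index]  # scores attributed to regions
    # Stand-in for the optimized standard deviation: robust MAD estimate.
    center = np.median(u)
    sigma = 1.4826 * np.median(np.abs(u - center))
    z = np.abs(u - center) / sigma
    # Upper tail of a chi-squared distribution (1 d.o.f.) evaluated at z**2.
    pvals = np.array([math.erfc(v / math.sqrt(2.0)) for v in z])
    adjusted = benjamini_hochberg(pvals)
    return np.flatnonzero(adjusted < alpha), adjusted


# Binning a toy per-base signal into regions of length L = 3.
binned = bin_signal(np.arange(10), 3)

# Synthetic example: 1000 regions, 6 samples, 20 regions carrying a
# pattern that differs between the first and last three samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 6))
pattern = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
X[:10] += 30 * pattern
X[10:20] -= 30 * pattern
selected, adjusted = select_regions(X)
```

In this synthetic setting the planted regions dominate the first component, so `pc_index=0` is the natural choice; with real profiles the relevant component need not be the first one.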
