Algorithms, Vol. 16, Issue 6 (2023)

DrugFinder: Druggable Protein Identification Model Based on Pre-Trained Models and Evolutionary Information

Mu Zhang, Fengqiang Wan and Taigang Liu

Abstract

Identifying druggable proteins is central to drug development. Traditional structure-based identification methods are time-consuming and costly, so researchers have increasingly turned to sequence-based methods. We propose DrugFinder, a sequence-based model for identifying druggable proteins. The model extracts features from the embedding output of the pre-trained protein model Prot_T5_Xl_Uniref50 (T5) and from the evolutionary information in the position-specific scoring matrix (PSSM). To remove redundant features and improve performance, we used the random forest (RF) method to select features, then trained and tested multiple machine learning classifiers on the selected features: support vector machines (SVM), RF, naive Bayes (NB), extreme gradient boosting (XGB), and k-nearest neighbors (KNN). Among these classifiers, XGB achieved the best results. DrugFinder reached an accuracy of 94.98%, a sensitivity of 96.33%, and a specificity of 96.83% on the independent test set, substantially outperforming existing identification methods. The model also performed well on an additional tumor-related test set, achieving an accuracy of 88.71% and a precision of 93.72%, which further demonstrates its strong generalization capability.
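
The workflow described in the abstract (feature extraction, RF-based feature selection, then a comparison of several classifiers) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: synthetic data stands in for the T5 embedding and PSSM features, scikit-learn's GradientBoostingClassifier stands in for XGB, and the median importance threshold and all hyperparameters are arbitrary choices made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# ASSUMPTION: synthetic features replace the real T5 embedding + PSSM vectors.
X, y = make_classification(
    n_samples=400, n_features=60, n_informative=10, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RF-based feature selection: keep features whose importance is at or above
# the median importance (ASSUMPTION: the threshold is illustrative only).
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="median",
)
selector.fit(X_train, y_train)
X_train_sel = selector.transform(X_train)
X_test_sel = selector.transform(X_test)

# Classifier families named in the abstract. "GB" is scikit-learn gradient
# boosting used as a stand-in for XGB, not the XGBoost library itself.
classifiers = {
    "SVM": SVC(),
    "RF": RandomForestClassifier(n_estimators=200, random_state=0),
    "NB": GaussianNB(),
    "GB": GradientBoostingClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, clf in classifiers.items():
    clf.fit(X_train_sel, y_train)
    y_pred = clf.predict(X_test_sel)
    tn, fp, fn, tp = confusion_matrix(y_test, y_pred, labels=[0, 1]).ravel()
    results[name] = {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }

for name, metrics in results.items():
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```

Fitting the selector on the training split only keeps information from the held-out samples out of the feature selection step, which matters when the reported metrics come from an independent test set.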
