Article

Clinical Decision Support Systems to Predict Drug–Drug Interaction Using Multilabel Long Short-Term Memory with an Autoencoder

1 Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
2 Department of Information Systems, College of Computing and Information System, Umm Al-Qura University, Makkah 24211, Saudi Arabia
3 Department of Computer and Self Development, Preparatory Year Deanship, Prince Sattam bin Abdulaziz University, Al Kharj 16436, Saudi Arabia
4 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia
5 Department of Computer Science, Faculty of Computers and Information Technology, Future University in Egypt, New Cairo 11835, Egypt
6 Department of Information System, College of Computer Engineering and Sciences, Prince Sattam bin Abdulaziz University, Al Kharj 16436, Saudi Arabia
* Author to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2023, 20(3), 2696; https://doi.org/10.3390/ijerph20032696
Submission received: 21 October 2022 / Revised: 24 January 2023 / Accepted: 26 January 2023 / Published: 2 February 2023
(This article belongs to the Special Issue Applications of Artificial Intelligence to Health)

Abstract

Big Data analytics is a technique for researching huge and varied datasets; it is designed to uncover hidden patterns, trends, and correlations, and can therefore support superior decision making in healthcare. Drug–drug interactions (DDIs) are a main concern in drug discovery. Precise forecasting of DDIs increases safety, particularly in drug research when multiple drugs are co-prescribed. Prevailing conventional machine learning (ML) approaches mainly depend on handcrafted features and lack generalization. Today, deep learning (DL) techniques that automatically learn drug features from drug-related networks or molecular graphs have enhanced the capability of computational approaches for forecasting unknown DDIs. Therefore, in this study, we develop a sparrow search optimization with deep learning-based DDI prediction (SSODL-DDIP) technique for healthcare decision making in big data environments. The presented SSODL-DDIP technique identifies the relationships and properties of drugs from various sources to make predictions. In addition, a multilabel long short-term memory with an autoencoder (MLSTM-AE) model is employed for the DDI prediction process. Moreover, a lexicon-based approach is used to determine the severity of interactions among the DDIs. To improve the prediction outcomes of the MLSTM-AE model, the SSO algorithm is adopted in this work. To assess the performance of the SSODL-DDIP technique, a wide range of simulations were performed. The experimental results show the promising performance of the SSODL-DDIP technique over recent state-of-the-art algorithms.

1. Introduction

In the digital era, the velocity and volume of public, environmental, health, and population data from a wider variety of sources are rapidly growing. Big Data analytics technologies such as deep learning (DL), statistical analysis, data mining (DM), and machine learning (ML) are used to create state-of-the-art decision models [1]. Decision making based on concrete evidence is crucial and has a dramatic effect on program implementation and public health. This highlights the significant role of decision models under uncertainty, involving health intervention, disease control, health services and systems, preventive medicine, quality of life, health disparities and inequalities, etc. A drug–drug interaction (DDI) can occur when more than one drug is co-prescribed [2]. Even though DDIs might have positive impacts, they sometimes have serious negative impacts and can result in withdrawing a drug from the market. DDI prediction can assist in reducing the possibility of adverse reactions and improve post-marketing surveillance and drug development processes [3]. Clinical trials are time consuming and impractical for dealing with large-scale datasets, given the limitations of experimental conditions. Hence, researchers have presented computational methods to speed up the prediction process [4]. Present computational DDI prediction methods are divided into five classes of models: DL-based, network-based, similarity-based, literature extraction-based, and matrix factorization-based models.
ML techniques are an emerging area employed on large datasets for extracting hidden concepts and relationships among attributes [5]. An ML model can be used to forecast outcomes. Since it is extremely complex for humans to process and handle large amounts of data [6], an ML model can play a major role in forecasting healthcare outcomes with high quality and minimized cost [7]. ML algorithms are primarily rule-based, probability-based, or tree-based. Large quantities of data gathered from a variety of sources are applied in the data preprocessing stage, during which the data dimension is minimized by eliminating redundant data. As the amount of data increases, a model may not be capable of making a decision directly; hence, various methods must be developed so that hidden knowledge or useful patterns can be extracted from previous information [8]. A model built with an ML algorithm is then evaluated on test data to measure its performance, which can be further improved by adjusting rules or parameters. Generally, ML is utilized in the areas of prediction, data classification, and pattern recognition [9]. Numerous applications, such as disease prediction, face detection, fraud detection, traffic management, and email filtering, use ML. DL is a subset of ML that makes use of supervised and unsupervised models for feature learning and classification [10]. DL elements such as restricted Boltzmann machines (RBMs), convolution neural networks (CNNs), and autoencoders (AEs) are utilized in fields such as recommender systems, disease prediction, and image segmentation.
In this study, we develop a sparrow search optimization with deep learning-based DDI prediction (SSODL-DDIP) technique for healthcare decision making in big data environments. The presented SSODL-DDIP technique applies a multilabel long short-term memory with an autoencoder (MLSTM-AE) model for the DDI prediction process. Moreover, a lexicon-based approach is involved in determining the severity of interactions among the DDIs. To improve the prediction outcomes of the MLSTM-AE model, the SSO algorithm is adopted in this work. For ensuring better performance of the SSODL-DDIP technique, a wide range of simulations are performed.

2. Related Works

In [10], the authors proposed a positive unlabeled (PU) learning model which utilized a one-class support vector machine (SVM) as the learning algorithm. The algorithm could learn the positive distribution from the unified feature vector space of drugs and targets, and regarded unknown pairs as unlabeled rather than labeling them as negative pairs. Wang et al. [11] introduced a novel technique, multi-view graph contrastive representation learning for DDI forecasting (MIRACLE for brevity), for concurrently capturing intra-view interactions and inter-view molecular structure among molecules. MIRACLE treated a DDI network as a multi-view graph in which every node of the interaction graph is itself a drug molecular graph. The authors employed a bond-aware attentive message-propagation algorithm for capturing drug molecular structure data and a graph convolution network (GCN) for encoding DDI relations in the MIRACLE learning phase. Along with that, the authors modeled an innovative unsupervised contrastive learning element to integrate and balance multi-view data. In [12], the authors devised a deep neural network (DNN) method that precisely identified protein–ligand interactions with particular drugs. The DNN could sense the response of protein–ligand interactions for particular drugs and could find which drug could effectively combat the virus.
Lin et al. [13] modeled an end-to-end structure, named a knowledge graph neural network (KGNN), for resolving DDI estimation. This structure could capture a drug and its neighborhood by deriving their linked relations in a knowledge graph (KG). For extracting semantic relations and high-order structures of the KG, the authors treated the neighborhood of each entity in the KG as its local receptive field, and then combined neighborhood data with the representations of the current entities. Pang et al. [14] presented a new attention-related multidimensional feature encoder (AMDE) for DDI estimation. Specifically, in AMDE, the authors encoded drug features from multidimensional inputs, which included data from the atomic graph of the drug and a simplified molecular-input line-entry system sequence. Salman et al. [15] modeled a DNN-oriented technique (SEV-DDI: Severity-DDI) that included certain integrated units and layers for attaining higher accuracy and precision. After outperforming other methods in the DDI classification task, the authors went a step further and used the technique to examine the severity of the interaction. The capability to determine DDI severity helps clinical decision support mechanisms make precise and informed decisions, assuring patient safety.
Liu et al. [16] presented a deep attention neural network-based DDI prediction structure (DANN-DDI) for forecasting unobserved DDIs. First, utilizing a graph embedding technique, the authors constructed multiple drug feature networks and learned drug representations from these networks; they then concatenated the learned drug embeddings and implemented an attention neural network to learn representations of drug–drug pairs; finally, they devised a DNN to precisely estimate DDIs. Zhang et al. [17] introduced a sparse feature learning ensemble method with linear neighborhood regularization (SFLLN) for forecasting DDIs. Initially, the authors compiled four drug features, i.e., pathways, chemical structures, enzymes, and targets, mapping drugs from distinct feature spaces into a common interaction space by sparse feature learning. Then, the authors applied linear neighborhood regularization to describe the DDIs in the interaction space by utilizing known DDIs.

3. The Proposed Model

In this study, we introduce a novel SSODL-DDIP technique for DDI predictions in big data environments. The presented SSODL-DDIP technique accurately determines the relationship and drug properties from various sources to make predictions. It encompasses data preprocessing, MLSTM-AE-based DDI prediction, SSO hyperparameter tuning, and severity extraction.

3.1. Data Preprocessing

Standard text cleaning and preprocessing operations, including but not limited to lemmatization, were carried out on the sentences. Every drug mentioned in a sentence was considered and labeled as potentially interacting with the others [18]. The number of drug pairs (DP) in a sentence is evaluated as follows:
$$\mathrm{Drug\ Pairs}\ (DP) = \max\Big(0, \sum_{i=1}^{n}(i-1)\Big)$$
where n indicates the number of drugs in a sentence.
In addition, drug blinding was used, whereby all drug names in a sentence were replaced with generic labels; for example, the sentence "Aspirin might reduce the effect of probenecid" was labeled as "DrugA might reduce the effect of DrugB". Drug blinding helps the technique identify the "subject" and "object" roles of the labels, which ultimately assists the approach during classification. The processed sentence is then given to the approach for the detection and classification of DDIs.
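As an illustration of this preprocessing step, the short Python sketch below counts the candidate drug pairs of a sentence and blinds the drug names. The sentence, drug list, and helper names are illustrative assumptions, not the authors' implementation.

```python
import re
from itertools import combinations

def drug_pair_count(n_drugs: int) -> int:
    # DP = max(0, sum_{i=1}^{n}(i - 1)), i.e., n*(n-1)/2 candidate pairs
    return max(0, sum(i - 1 for i in range(1, n_drugs + 1)))

def blind_drugs(sentence: str, drugs: list) -> str:
    """Replace each drug mention with a generic label (DrugA, DrugB, ...)."""
    for idx, drug in enumerate(drugs):
        label = "Drug" + chr(ord("A") + idx)
        sentence = re.sub(re.escape(drug), label, sentence, flags=re.IGNORECASE)
    return sentence

sentence = "Aspirin might reduce the effect of probenecid"
drugs = ["Aspirin", "probenecid"]
print(drug_pair_count(len(drugs)))      # 1 candidate pair
print(blind_drugs(sentence, drugs))     # DrugA might reduce the effect of DrugB
print(list(combinations(drugs, 2)))     # enumeration of the candidate drug pairs
```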
During word embedding, every word is converted into a real-valued vector. This mapping of words into a matrix can be performed using Word2Vec, with embeddings trained on PubMed abstracts that mention the drugs.
$$s_i = \mathrm{WEMB}\, v_i^{s}$$
Every preprocessed sentence consists of tokens "$s_i$" and "$d_j$", where $d_j$ represents the drug labels and $s_i$ the other words in the sentence. Every word "$s_i$" is transformed into a word vector using the word embedding matrix. Word embedding (WEMB) is an embedding matrix, $\mathrm{WEMB} \in \mathbb{R}^{d_s \times |V|}$, where $V$ denotes the vocabulary of the training dataset, $d_s$ signifies the number of dimensions, and $v_i^{s}$ denotes the index of the word embedding.
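A minimal sketch of this embedding step is given below, assuming a gensim Word2Vec model trained on tokenized PubMed abstracts; the toy corpus, vector size, and window are placeholders rather than the authors' settings.

```python
import numpy as np
from gensim.models import Word2Vec

# Placeholder corpus standing in for tokenized PubMed abstracts that mention the drugs.
corpus = [["druga", "might", "reduce", "the", "effect", "of", "drugb"]]
w2v = Word2Vec(sentences=corpus, vector_size=200, window=5, min_count=1)

def embed_sentence(tokens, model):
    """Map every word s_i to its vector via the embedding matrix WEMB; unknown words map to zeros."""
    dim = model.vector_size
    return np.stack([model.wv[t] if t in model.wv else np.zeros(dim) for t in tokens])

matrix = embed_sentence(["druga", "might", "reduce", "the", "effect", "of", "drugb"], w2v)
print(matrix.shape)  # (sentence length, d_s) = (7, 200)
```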

3.2. DDI Prediction Process

To predict DDIs accurately, the MLSTM-AE model is applied in this study. The MLSTM-AE model learns to recreate a time-flipped version of its input [19]. Every input signal is denoted as $\chi_i = \{\chi_i^1, \chi_i^2, \ldots, \chi_i^T\}$ and is of length $T$. The hidden state vector of the long short-term memory (LSTM) encoder at the $t$-th instant is represented as $h_i^t$. The encoder captures the data relevant to recreating the input signal. Once it encodes the final point of the input, the hidden state $h_i^T$ of the encoder is the vector representation of the input $\chi_i$. The decoder has the same network architecture as the encoder; however, it learns to recreate a flipped version of the input, viz., $\{\chi_i^T, \chi_i^{T-1}, \ldots, \chi_i^1\}$. The last hidden state $h_i^T$ of the encoder is used as the first hidden state of the decoder. The target output is the flipped version of the input, and the actual reconstruction is $\{\hat{\chi}_i^T, \hat{\chi}_i^{T-1}, \ldots, \hat{\chi}_i^1\}$. Both the encoder and the decoder are LSTMs, suited to modeling dynamic signals. The representation from the deep layer of the encoder is connected to the output label through fully connected networks (FCNs). The reconstruction loss used for training the MLSTM-AE model is formulated as:
$$L_{rec} = \frac{1}{N} \sum_{i=1}^{N} \sum_{t=1}^{T} \left( x_i^{t} - \hat{x}_i^{t} \right)^2$$
where $N$ denotes the overall sample count. Because the final objective is classification, the embedding from the hidden layer $h_i^T$ is passed through a fully connected (FC) layer, the output of which is the class label. The class label is one-hot encoded. The size of the label vector is equal to the number of labels; when a label is active, the corresponding position of the label vector is 1, and 0 otherwise. This can be denoted by $y_i = \{y_i^1, y_i^2, \ldots, y_i^C\}$, considering $C$ labels. Figure 1 represents the structure of the MLSTM.
When the $c$-th label is active, the corresponding $y_i^c$ is 1; otherwise, it is 0. The ground-truth probability vector of the $i$-th sample is defined as $\hat{p}_i = y_i / \|y_i\|_1$. The predicted probability vector is represented as $p_i$.
$$L_{cls} = \frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \left( p_i^{c} - \hat{p}_i^{c} \right)^2$$
This algorithm is trained jointly with the reconstruction loss and the multilabel classification loss; hence, the overall loss function is formulated in Equation (5):
$$L = L_{cls} + \gamma L_{rec}$$
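The following PyTorch sketch illustrates the architecture and combined loss described above: an LSTM encoder, a teacher-forced LSTM decoder that reconstructs the time-flipped input, and an FC head producing multilabel outputs. The layer sizes, sigmoid output, and weight gamma are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class MLSTMAutoencoder(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_labels):
        super().__init__()
        self.encoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.decoder = nn.LSTM(input_dim, hidden_dim, batch_first=True)
        self.reconstruct = nn.Linear(hidden_dim, input_dim)  # decoder states -> reconstructed inputs
        self.classify = nn.Linear(hidden_dim, num_labels)    # FC head on the final encoder state

    def forward(self, x):                        # x: (batch, T, input_dim)
        _, (h_T, c_T) = self.encoder(x)          # h_T is the vector representation of the input
        flipped = torch.flip(x, dims=[1])        # the target is the time-reversed sequence
        dec_out, _ = self.decoder(flipped, (h_T, c_T))        # teacher-forced decoding
        x_hat = self.reconstruct(dec_out)        # reconstruction of the flipped input
        probs = torch.sigmoid(self.classify(h_T.squeeze(0)))  # multilabel probabilities p_i
        return x_hat, flipped, probs

def total_loss(x_hat, flipped, probs, targets, gamma=0.5):
    """Overall loss L = L_cls + gamma * L_rec, mirroring Equations (3)-(5)."""
    l_rec = nn.functional.mse_loss(x_hat, flipped)
    l_cls = nn.functional.mse_loss(probs, targets)
    return l_cls + gamma * l_rec

model = MLSTMAutoencoder(input_dim=16, hidden_dim=64, num_labels=4)
x = torch.randn(8, 20, 16)                      # batch of 8 sequences of length T = 20
y = torch.randint(0, 2, (8, 4)).float()         # one-hot style multilabel targets y_i
loss = total_loss(*model(x), y)
loss.backward()
```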

3.3. Hyperparameter Tuning Process

For the hyperparameter tuning process, the SSODL-DDIP technique uses the SSO algorithm. SSO is a recent metaheuristic approach which simulates the predation and anti-predation behaviors of a sparrow population [20]. In foraging, individual sparrows act in two roles: joiners and discoverers. The discoverers are responsible for searching for food and guiding the others, and the joiners forage by following the discoverers. A certain percentage of sparrows is selected as guarders that transmit alarm signals and carry out anti-predation behavior when they sense danger. The discoverer position is updated as follows:
$$X_{i,j}^{t+1} = \begin{cases} X_{i,j}^{t} \cdot \exp\!\left(\dfrac{-i}{\alpha \cdot T}\right), & R_2 < ST \\[6pt] X_{i,j}^{t} + O \cdot G, & R_2 \geq ST \end{cases}$$
In Equation (6), $t$ is the current update iteration and $T$ is the maximum number of updates. $X_{i,j}^{t}$ defines the present position of the $i$-th sparrow in the $j$-th dimension, and $X_{i,j}^{t+1}$ denotes its updated position. $\alpha \in (0, 1]$ refers to a random number, $ST \in (0.5, 1]$ signifies the safety value, and $R_2 \in (0, 1]$ defines the warning value. $G$ denotes a $1 \times d$ matrix in which each element is 1, and $O$ represents a random variable.
The joiner position is updated as follows:
$$X_{i,j}^{t+1} = \begin{cases} O \cdot \exp\!\left(\dfrac{X_{w} - X_{i,j}^{t}}{i^{2}}\right), & i > n/2 \\[6pt] X_{b} + \left|X_{i,j}^{t} - X_{b}\right| \cdot B \cdot G, & \text{otherwise} \end{cases}$$
In Equation (7), $X_{b}$ signifies the current optimum position of the discoverer, $X_{w}$ describes the worst position of the sparrows, $B$ denotes a $1 \times d$ matrix in which every element is equal to 1 or $-1$, and $A^{+} = A^{T}(AA^{T})^{-1}$. Figure 2 demonstrates the steps involved in the SSO algorithm.
The position update for the guarder is defined as follows:
$$X_{i,j}^{t+1} = \begin{cases} X_{best}^{t} + \beta \cdot \left|X_{i,j}^{t} - X_{best}^{t}\right|, & f_i > f_g \\[6pt] X_{i,j}^{t} + K \cdot \left(\dfrac{X_{i,j}^{t} - X_{worst}^{t}}{(f_i - f_w) + \varepsilon}\right), & f_i = f_g \end{cases}$$
In Equation (8), $X_{best}$ stands for the global best position; $\beta$ and $K \in [-1, 1]$ represent two random numbers; $f_i$ defines the fitness value of the $i$-th sparrow; $f_w$ and $f_g$ are the current worst and best fitness values in the population, respectively; and $\varepsilon$ indicates a minimal number close to zero, as shown in Algorithm 1.
Algorithm 1: Pseudocode of the SSO algorithm
Define $Iter_{\max}$, $NP$, $n$, $P_{dp}$, $s_f$, $G_c$, $FS_U$, and $FS_L$
Randomly initialize the flying squirrel positions:
  $FS_{i,j} = FS_L + rand() \cdot (FS_U - FS_L)$, $i = 1, 2, \ldots, NP$, $j = 1, 2, \ldots, n$
Compute the fitness values:
  $f_i = f_i(FS_{i,1}, FS_{i,2}, \ldots, FS_{i,n})$, $i = 1, 2, \ldots, NP$
while $Iter < Iter_{\max}$
  $[sorted\_f,\ sorted\_index] = sort(f)$
  $FS_{ht} = FS(sorted\_index(1))$
  $FS_{at}(1{:}3) = FS(sorted\_index(2{:}4))$
  $FS_{nt}(1{:}NP-4) = FS(sorted\_index(5{:}NP))$
  Create new positions
  for $t = 1{:}n_1$  ($n_1$ = total count of squirrels on acorn trees)
    if $R_1 \geq P_{dp}$
      $FS_{at}^{new} = FS_{at}^{old} + d_g \cdot G_c \cdot (FS_{ht}^{old} - FS_{at}^{old})$
    else
      $FS_{at}^{new} =$ random location
    end
  end
  for $t = 1{:}n_2$  ($n_2$ = total count of squirrels on normal trees moving to acorn trees)
    if $R_2 \geq P_{dp}$
      $FS_{nt}^{new} = FS_{nt}^{old} + d_g \cdot G_c \cdot (FS_{at}^{old} - FS_{nt}^{old})$
    else
      $FS_{nt}^{new} =$ random location
    end
  end
  for $t = 1{:}n_3$  ($n_3$ = total count of squirrels on normal trees moving to hickory trees)
    if $R_3 \geq P_{dp}$
      $FS_{nt}^{new} = FS_{nt}^{old} + d_g \cdot G_c \cdot (FS_{ht}^{old} - FS_{nt}^{old})$
    else
      $FS_{nt}^{new} =$ random location
    end
  end
  $S_c^t = \sqrt{\sum_{k=1}^{n} (FS_{at,k}^{t} - FS_{ht,k})^2}$,  $S_{c,\min} = \dfrac{10^{-6}}{365^{\,Iter/(Iter_{\max}/2.5)}}$
  if $S_c^t < S_{c,\min}$
    $FS_{nt}^{new} = FS_L + L\acute{e}vy(n) \times (FS_U - FS_L)$
  end
  Compute the fitness values of the new positions:
    $f_i = f_i(FS_{i,1}^{new}, FS_{i,2}^{new}, \ldots, FS_{i,n}^{new})$, $i = 1, 2, \ldots, NP$
  $Iter = Iter + 1$
end
The SSO algorithm derives a fitness function (FF) for reaching maximum classifier performance. It assigns positive values to signify the superior outcome of candidate solutions. In this article, minimization of the classifier error rate is the FF, as presented in Equation (9):
$$fitness(x_i) = ClassifierErrorRate(x_i) = \frac{\text{number of misclassified samples}}{\text{total number of samples}} \times 100$$
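To make the tuning loop concrete, the sketch below implements a compact sparrow search optimizer over a continuous search space, following the discoverer/joiner/guarder updates of Equations (6)–(8). The toy fitness function stands in for the classifier error rate of Equation (9); wiring it to actual MLSTM-AE training is assumed and not shown, and all population sizes and proportions are illustrative.

```python
import numpy as np

def sso_minimize(fitness, dim, pop=20, iters=50, lb=-1.0, ub=1.0,
                 pd=0.2, sd=0.1, st=0.8, seed=0):
    """Minimal sparrow search optimization; returns the best position and its fitness."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(lb, ub, size=(pop, dim))
    fit = np.array([fitness(x) for x in X])
    n_disc = max(1, int(pd * pop))        # discoverers
    n_guard = max(1, int(sd * pop))       # guarders
    for _ in range(iters):
        order = np.argsort(fit)
        X, fit = X[order], fit[order]
        best, worst = X[0].copy(), X[-1].copy()
        for i in range(n_disc):                               # Eq. (6): discoverers
            if rng.random() < st:
                X[i] = X[i] * np.exp(-i / (rng.random() * iters + 1e-12))
            else:
                X[i] = X[i] + rng.normal(size=dim)
        for i in range(n_disc, pop):                          # Eq. (7): joiners
            if i > pop / 2:
                X[i] = rng.normal(size=dim) * np.exp((worst - X[i]) / (i ** 2))
            else:
                X[i] = best + np.abs(X[i] - best) * rng.choice([-1.0, 1.0], size=dim)
        for i in rng.choice(pop, n_guard, replace=False):     # Eq. (8): guarders
            if fit[i] > fit[0]:
                X[i] = best + rng.normal() * np.abs(X[i] - best)
            else:
                X[i] = X[i] + rng.uniform(-1, 1) * np.abs(X[i] - worst) / (fit[i] - fit[-1] + 1e-12)
        X = np.clip(X, lb, ub)
        fit = np.array([fitness(x) for x in X])
    best_idx = int(np.argmin(fit))
    return X[best_idx], float(fit[best_idx])

# Toy usage: a quadratic error surface standing in for ClassifierErrorRate(x_i).
best_x, best_err = sso_minimize(lambda x: float(np.sum(x ** 2)), dim=3)
print(best_x, best_err)
```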

3.4. Severity Extraction Process

Lexicons such as SentiWordNet and WordNet-Affect are commonly used to extract the general sentiment of texts, for instance, movie and social media reviews. Subjectivity lexicons are used to extract subjective expressions in arguments or text statements. Several general-purpose and subjectivity lexicons have been adapted in medical studies to distinct healthcare tasks. Broad pharmaceutical lexicons have also been developed specifically for the biomedical and healthcare domains and have been used to extract the sentiment of clinical and pharmaceutical text. In this work, the polarity of candidate sentences is extracted using SentiWordNet, and the interaction is classified as low, moderate, or high severity, since dangerous and advantageous DDIs depend on the polarity of the candidate sentences.
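A hedged sketch of this severity step is shown below, using NLTK's SentiWordNet interface to score the polarity of a candidate DDI sentence and bin it into low/moderate/high severity. The thresholds and the first-sense heuristic are illustrative assumptions, not values taken from the paper.

```python
from nltk import word_tokenize
from nltk.corpus import sentiwordnet as swn
# Requires: nltk.download("punkt"), nltk.download("wordnet"), nltk.download("sentiwordnet")

def sentence_polarity(sentence: str) -> float:
    """Sum positive minus negative scores of the first SentiWordNet sense of each token."""
    score = 0.0
    for token in word_tokenize(sentence.lower()):
        senses = list(swn.senti_synsets(token))
        if senses:
            score += senses[0].pos_score() - senses[0].neg_score()
    return score

def severity_level(polarity: float) -> str:
    if polarity <= -0.5:
        return "high"        # strongly negative wording, potentially dangerous interaction
    if polarity < 0.0:
        return "moderate"
    return "low"             # neutral or positive wording, mild or advantageous interaction

print(severity_level(sentence_polarity("DrugA may severely increase the toxicity of DrugB")))
```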

4. Results and Discussion

The experimental validation of the SSODL-DDIP technique was tested using drug target datasets [10,21]. We used four different datasets to examine the performance of the SSODL-DDIP technique. Table 1 presents the details of the datasets. The distribution of samples under drug, target, and interactions is given in Figure 3.
Table 2 and Figure 4 present the performance of the SSODL-DDIP technique under unlabeled and labeled samples on the top k% values. The results indicate that the SSODL-DDIP technique effectively labeled the samples. For instance, on the top 10% of the enzyme dataset, the SSODL-DDIP technique labeled 317 samples under 29,036 unlabeled samples. Likewise, on the top 10% of the G protein-coupled receptors (GPCR) dataset, the SSODL-DDIP technique labeled 311 samples under 1916 unlabeled samples. Similarly, on the top 10% of the ion channel dataset, the SSODL-DDIP technique labeled 395 samples under 4026 unlabeled samples. Lastly, on the top 10% of the nuclear receptor dataset, the SSODL-DDIP technique labeled 34 samples under 110 unlabeled samples.
Table 3 presents the overall results of the area under the ROC curve (AUC) and the area under the precision-recall curve (AUPR) analysis of the SSODL-DDIP technique on four datasets.
Figure 5 shows the comprehensive AUC values of the SSODL-DDIP technique under different coefficient of variation (CV)_seed values. The figure shows that the SSODL-DDIP technique reached maximum AUC values under all datasets. For instance, on the enzyme dataset, the SSODL-DDIP technique attained higher AUC values of 93.46%, 97.32%, 96.72%, 88.33%, and 97.78% under CV_SEED values of 3201, 2033, 5179, 2931, and 9117, respectively. On the GPCR dataset, the SSODL-DDIP technique attained higher AUC values of 87.09%, 84.82%, 87.17%, 88.50%, and 92.95% under CV_SEED values of 3201, 2033, 5179, 2931, and 9117, respectively.
Figure 6 presents the comprehensive AUPR values of the SSODL-DDIP technique under different CV_seed values. The figure shows that the SSODL-DDIP technique attained maximum AUPR values under all datasets. For example, on the enzyme dataset, the SSODL-DDIP technique attained higher AUPR values of 60.29%, 68.78%, 66.59%, 54.85%, and 71.31% under CV_SEED values of 3201, 2033, 5179, 2931, and 9117, respectively. On the GPCR dataset, the SSODL-DDIP technique attained higher AUPR values of 63.21%, 62.04%, 63.58%, 66.67%, and 68.97% under CV_SEED values of 3201, 2033, 5179, 2931, and 9117, respectively.
Table 4 and Figure 7 show the results of a comparison study of the SSODL-DDIP technique on four datasets in terms of AUC [22,23,24,25]. The experimental values indicate that the SSODL-DDIP technique attained maximum AUC values under all datasets. For instance, on the enzyme dataset, the SSODL-DDIP technique attained a higher AUC value of 97.78%. In contrast, the UDTPP, bigram position-specific scoring matrix (PSSM), neural network (NN), IFB, kernelized Bayesian matrix factorization with twin kernels (KBMF2K), drug-based similarity inference (DBSI), and drug–target interaction prediction model using optimal recurrent neural network (DTIP-ORNN) techniques attained lower AUC values of 86%, 94.80%, 89.80%, 84.50%, 83.20%, 80.60%, and 96.10%, respectively. On the GPCR dataset, the SSODL-DDIP technique attained a higher AUC value of 92.95%. Conversely, the UDTPP, bigram PSSM, NN, IFB, KBMF2K, DBSI, and DTIP-ORNN techniques attained lower AUC values of 87.60%, 88.90%, 88.90%, 81.20%, 85.70%, 80.30%, and 91.53%, respectively.
Table 5 and Figure 8 present a comparative inspection of the SSODL-DDIP technique on four datasets in terms of AUPR. The simulation values indicate that the SSODL-DDIP technique attained maximum AUPR values under all datasets. For instance, on the enzyme dataset, the SSODL-DDIP technique attained a higher AUPR value of 71.31%. In contrast, the bipartite local model (BLM), self-training support vector machine with BLM (SELF-BLM), positive-unlabeled learning with BLM (PULBLM)-3, PULBLM-5, PULBLM-7, and DTIP-ORNN techniques attained lower AUPR values of 57.00%, 63.00%, 67.00%, 67.00%, 66.00%, and 69.01%, respectively. In addition, on the GPCR dataset, the SSODL-DDIP technique attained a higher AUPR value of 68.97%. In contrast, the BLM, SELF-BLM, PULBLM-3, PULBLM-5, PULBLM-7, and DTIP-ORNN techniques attained lower AUPR values of 55.00%, 60.00%, 64.00%, 64.00%, 65.00%, and 67.20%, respectively. These results confirm the effective DDI prediction results of the SSODL-DDIP technique.

5. Conclusions

In this study, we introduced a novel SSODL-DDIP technique for DDI predictions in big data environments. The presented SSODL-DDIP technique accurately determined the relationship and drug properties from various sources to make a prediction. In addition, the MLSTM-AE model was employed for the DDI prediction process. Furthermore, a lexicon-based approach was involved in determining the severity of interactions among the DDIs. To improve the prediction outcomes of the MLSTM-AE model, the SSO algorithm was adopted in this work. To assure better performance of the SSODL-DDIP technique, a wide range of simulations were performed. The experimental outcomes show the promising performance of the SSODL-DDIP technique over recent state-of-the-art methodologies. Thus, the SSODL-DDIP technique can be employed for improved DDI predictions. In the future, hybrid metaheuristics could be designed to improve the prediction performance. In addition, outlier detection and clustering techniques could be integrated to enhance the predictive results of the proposed model.

Author Contributions

Conceptualization, A.M.H. and M.I.E.; methodology, F.A.; software, M.I.E.; validation, S.S.A., R.M., A.A.A. and M.I.E.; formal analysis, H.M.; investigation, A.A.A.; resources, H.M.; data curation, R.M. and M.I.E.; writing—original draft preparation, A.M.H., F.A., S.S.A., R.M. and H.M.; writing—review and editing, A.A.A., A.E.O. and M.I.E.; visualization, M.I.E. and A.E.O.; supervision, F.A.; project administration, A.M.H.; funding acquisition, F.A. and S.S.A. All authors have read and agreed to the published version of the manuscript.

Funding

Princess Nourah bint Abdulrahman University Researchers Supporting Project number (PNURSP2023R77), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4210118DSR53). This study is supported via funding from Prince Sattam bin Abdulaziz University project number (PSAU/2023/R/1444).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data sharing is not applicable to this article as no datasets were generated during the current study.

Conflicts of Interest

The authors declare that they have no conflict of interest. The manuscript was written through contributions of all authors. All authors have given approval for the final version of the manuscript.

Abbreviations

Abbreviation | Meaning
DDI | Drug–drug interaction
ML | Machine learning
DL | Deep learning
SSODL-DDIP | Sparrow search optimization with deep learning-based DDI prediction
MLSTM-AE | Multilabel long short-term memory with an autoencoder
DM | Data mining
RBM | Restricted Boltzmann machine
CNN | Convolution neural network
AE | Autoencoder
PU | Positive unlabeled
SVM | Support vector machine
GCN | Graph convolution network
DNN | Deep neural network
KGNN | Knowledge graph neural network
KG | Knowledge graph
AMDE | Attention-related multidimensional feature encoder
SEV-DDI | Severity DDI
DANN-DDI | Deep attention neural network-related DDI
SFLLN | Sparse feature learning ensemble method with linear neighborhood regularization
DP | Drug pairs
WEMB | Word embedding
LSTM | Long short-term memory
FCN | Fully connected network
FC | Fully connected
FS | Flying squirrels
GPCR | G protein-coupled receptor
AUPR | Area under the precision-recall curve
AUC | Area under the ROC curve
CV | Coefficient of variation
PSSM | Position-specific scoring matrix
NN | Neural network
KBMF2K | Kernelized Bayesian matrix factorization with twin kernels
DBSI | Drug-based similarity inference
DTIP-ORNN | Drug–target interaction prediction model using optimal recurrent neural network
BLM | Bipartite local model
SELF-BLM | Self-training support vector machine with BLM
PULBLM | Positive-unlabeled learning with BLM

References

  1. Ryu, J.Y.; Kim, H.U.; Lee, S.Y. Deep learning improves prediction of drug–drug and drug–food interactions. Proc. Natl. Acad. Sci. USA 2018, 115, E4304–E4311. [Google Scholar] [CrossRef] [PubMed]
  2. Hung, T.N.K.; Le, N.Q.K.; Le, N.H.; Van Tuan, L.; Nguyen, T.P.; Thi, C.; Kang, J.H. An AI-based Prediction Model for Drug-drug Interactions in Osteoporosis and Paget’s Diseases from SMILES. Mol. Inform. 2022, 41, 2100264. [Google Scholar] [CrossRef] [PubMed]
  3. Wang, N.N.; Wang, X.G.; Xiong, G.L.; Yang, Z.Y.; Lu, A.P.; Chen, X.; Liu, S.; Hou, T.J.; Cao, D.S. Machine learning to predict metabolic drug interactions related to cytochrome P450 isozymes. J. Cheminform. 2022, 14, 1–16. [Google Scholar] [CrossRef] [PubMed]
  4. Kastrin, A.; Ferk, P.; Leskošek, B. Predicting potential drug-drug interactions on topological and semantic similarity features using statistical learning. PLoS ONE 2018, 13, e0196865. [Google Scholar] [CrossRef] [PubMed]
  5. Deng, Y.; Xu, X.; Qiu, Y.; Xia, J.; Zhang, W.; Liu, S. A multimodal deep learning framework for predicting drug–drug interaction events. Bioinformatics 2020, 36, 4316–4322. [Google Scholar] [CrossRef] [PubMed]
  6. Kumar, R.; Saha, P. A review on artificial intelligence and machine learning to improve cancer management and drug discovery. Int. J. Res. Appl. Sci. Biotechnol. 2022, 9, 149–156. [Google Scholar]
  7. Lim, S.; Lee, K.; Kang, J. Drug drug interaction extraction from the literature using a recursive neural network. PLoS ONE 2018, 13, e0190926. [Google Scholar] [CrossRef] [PubMed]
  8. Chen, Y.; Ma, T.; Yang, X.; Wang, J.; Song, B.; Zeng, X. MUFFIN: Multi-scale feature fusion for drug–drug interaction prediction. Bioinformatics 2021, 37, 2651–2658. [Google Scholar] [CrossRef] [PubMed]
  9. Vilar, S.; Friedman, C.; Hripcsak, G. Detection of drug–drug interactions through data mining studies using clinical sources, scientific literature and social media. Brief. Bioinform. 2018, 19, 863–877. [Google Scholar] [CrossRef] [PubMed]
  10. Yamanishi, Y.; Araki, M.; Gutteridge, A.; Honda, W.; Kanehisa, M. Prediction of drug–target interaction networks from the integration of chemical and genomic spaces. Bioinformatics 2008, 24, i232–i240. [Google Scholar] [CrossRef] [PubMed]
  11. Wang, Y.; Min, Y.; Chen, X.; Wu, J. Multi-view graph contrastive representation learning for drug-drug interaction prediction. In Proceedings of the Web Conference, Online, 19–23 April 2021; pp. 2921–2933. [Google Scholar]
  12. Yuvaraj, N.; Srihari, K.; Chandragandhi, S.; Raja, R.A.; Dhiman, G.; Kaur, A. Analysis of protein-ligand interactions of SARS-Cov-2 against selective drug using deep neural networks. Big Data Min. Anal. 2021, 4, 76–83. [Google Scholar] [CrossRef]
  13. Lin, X.; Quan, Z.; Wang, Z.J.; Ma, T.; Zeng, X. KGNN: Knowledge Graph Neural Network for Drug-Drug Interaction Prediction. IJCAI 2020, 380, 2739–2745. [Google Scholar]
  14. Pang, S.; Zhang, Y.; Song, T.; Zhang, X.; Wang, X.; Rodriguez-Patón, A. AMDE: A novel attention-mechanism-based multidimensional feature encoder for drug–drug interaction prediction. Brief. Bioinform. 2022, 23, bbab545. [Google Scholar] [CrossRef] [PubMed]
  15. Salman, M.; Munawar, H.S.; Latif, K.; Akram, M.W.; Khan, S.I.; Ullah, F. Big Data Management in Drug–Drug Interaction: A Modern Deep Learning Approach for Smart Healthcare. Big Data Cogn. Comput. 2022, 6, 30. [Google Scholar] [CrossRef]
  16. Liu, S.; Zhang, Y.; Cui, Y.; Qiu, Y.; Deng, Y.; Zhang, Z.M.; Zhang, W. Enhancing drug-drug interaction prediction using deep attention neural networks. IEEE/ACM Trans. Comput. Biol. Bioinform. 2022. [Google Scholar] [CrossRef] [PubMed]
  17. Zhang, W.; Jing, K.; Huang, F.; Chen, Y.; Li, B.; Li, J.; Gong, J. SFLLN: A sparse feature learning ensemble method with linear neighborhood regularization for predicting drug–drug interactions. Inf. Sci. 2019, 497, 189–201. [Google Scholar] [CrossRef]
  18. Rastegar-Mojarad, M.; Boyce, R.D.; Prasad, R. UWM-TRIADS: Classifying Drug-Drug Interactions with Two-Stage SVM and Post-Processing. In Proceedings of the SEM 2013-2nd Joint Conference on Lexical and Computational Semantics, Atlanta, GA, USA, 12 June 2013; Volume 2, pp. 667–674. [Google Scholar]
  19. Verma, S.; Singh, S.; Majumdar, A. Multi-label LSTM autoencoder for non-intrusive appliance load monitoring. Electr. Power Syst. Res. 2021, 199, 107414. [Google Scholar] [CrossRef]
  20. Luan, F.; Li, R.; Liu, S.Q.; Tang, B.; Li, S.; Masoud, M. An Improved Sparrow Search Algorithm for Solving the Energy-Saving Flexible Job Shop Scheduling Problem. Machines 2022, 10, 847. [Google Scholar] [CrossRef]
  21. Available online: http://web.kuicr.kyoto-u.ac.jp/supp/yoshi/drugtarget/ (accessed on 12 September 2022).
  22. Rajpura, H.R.; Ngom, A. Drug target interaction predictions using PU-Learning under different experimental setting for four formulations namely known drug target pair prediction, drug prediction, target prediction and unknown drug target pair prediction. In Proceedings of the 2018 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), Saint Louis, MO, USA, 30 May–2 June 2018; pp. 1–7. [Google Scholar]
  23. Lan, W.; Wang, J.; Li, M.; Liu, J.; Li, Y.; Wu, F.X.; Pan, Y. Predicting drug–target interaction using positive-unlabeled learning. Neurocomputing 2016, 206, 50–57. [Google Scholar] [CrossRef]
  24. Haddadi, F.; Keyvanpour, M.R. PULBLM: A Computational Positive-Unlabeled Learning Method for Drug-Target Interactions Prediction. In Proceedings of the 10th International Conference on Information and Knowledge Technology (IKT 2019), Tehran, Iran, 31 December 2019. [Google Scholar]
  25. Kavipriya, G.; Manjula, D. Drug–Target Interaction Prediction Model Using Optimal Recurrent Neural Network. Intell. Autom. Soft Comput. 2023, 35, 1677–1689. [Google Scholar] [CrossRef]
Figure 1. Structure of MLSTM.
Figure 2. Steps involved in the SSO algorithm.
Figure 3. Sample distribution.
Figure 4. Result analysis of the SSODL-DDIP system: (a) Enzyme; (b) GPCR; (c) ion channel; (d) nuclear receptor.
Figure 5. AUC analysis of the SSODL-DDIP technique under different CV_seed values.
Figure 6. AUPR analysis of the SSODL-DDIP system under different CV_seed values.
Figure 7. AUC analysis of the SSODL-DDIP technique: (a) Enzyme; (b) GPCR; (c) ion channel; (d) nuclear receptor.
Figure 8. AUPR analysis of the SSODL-DDIP technique: (a) Enzyme; (b) GPCR; (c) ion channel; (d) nuclear receptor.
Table 1. Details on the datasets.

Dataset | Drugs | Targets | Interactions
Enzyme dataset | 445 | 664 | 2926
Ion channel dataset | 210 | 204 | 1467
GPCR dataset | 223 | 95 | 635
Nuclear receptor dataset | 54 | 26 | 90
Table 2. Analysis results of the SSODL-DDIP technique applied to distinct datasets.

Enzyme Dataset:
Top k (%) | Unlabeled | Labeled
10 | 29,036 | 317
20 | 58,173 | 478
30 | 87,431 | 510
40 | 116,727 | 516
50 | 145,973 | 547
60 | 175,216 | 578
70 | 204,442 | 639
80 | 233,718 | 645
90 | 262,967 | 682
100 | 292,205 | 727

GPCR Dataset:
Top k (%) | Unlabeled | Labeled
10 | 1916 | 311
20 | 3901 | 461
30 | 5966 | 497
40 | 8020 | 541
50 | 10,043 | 607
60 | 12,097 | 640
70 | 14,107 | 690
80 | 16,163 | 699
90 | 18,214 | 703
100 | 20,292 | 719

Ion Channel Dataset:
Top k (%) | Unlabeled | Labeled
10 | 4026 | 395
20 | 8052 | 736
30 | 12,219 | 802
40 | 16,418 | 803
50 | 20,358 | 1090
60 | 24,453 | 1189
70 | 28,596 | 1232
80 | 32,713 | 1277
90 | 36,823 | 1346
100 | 40,911 | 1422

Nuclear Receptor Dataset:
Top k (%) | Unlabeled | Labeled
10 | 110 | 34
20 | 241 | 36
30 | 374 | 36
40 | 503 | 38
50 | 630 | 41
60 | 760 | 41
70 | 888 | 43
80 | 1017 | 44
90 | 1143 | 48
100 | 1271 | 50
Table 3. AUC and AUPR analysis of the SSODL-DDIP system under distinct datasets.

Enzyme Dataset:
CV_SEED | AUC | AUPR
3201 | 93.46 | 60.29
2033 | 97.32 | 68.78
5179 | 96.72 | 66.59
2931 | 88.33 | 54.85
9117 | 97.78 | 71.31

GPCR Dataset:
CV_SEED | AUC | AUPR
3201 | 87.09 | 63.21
2033 | 84.82 | 62.04
5179 | 87.17 | 63.58
2931 | 88.50 | 66.67
9117 | 92.95 | 68.97

Ion Channel Dataset:
CV_SEED | AUC | AUPR
3201 | 83.71 | 61.46
2033 | 88.11 | 66.96
5179 | 91.94 | 67.55
2931 | 83.98 | 63.18
9117 | 92.00 | 70.34

Nuclear Receptor Dataset:
CV_SEED | AUC | AUPR
3201 | 91.67 | 75.35
2033 | 94.79 | 76.65
5179 | 98.08 | 82.93
2931 | 98.13 | 86.05
9117 | 98.85 | 87.94
Table 4. Comparative analysis of the SSODL-DDIP technique on different datasets in terms of AUC.

Methods | Enzyme | GPCR | Ion Channel | Nuclear Receptor
UDTPP | 86.00 | 87.60 | 77.50 | 80.00
Bi-gram PSSM | 94.80 | 88.90 | 87.20 | 86.90
Nearest neighbor | 89.80 | 88.90 | 85.20 | 82.00
IFB model | 84.50 | 81.20 | 73.10 | 83.00
KBMF2K | 83.20 | 85.70 | 79.90 | 82.40
DBSI | 80.60 | 80.30 | 80.30 | 75.90
DTIP-ORNN | 96.10 | 91.53 | 90.14 | 98.72
SSODL-DDIP | 97.78 | 92.95 | 92.00 | 98.85
Table 5. Comparative analysis of the SSODL-DDIP technique on different datasets in terms of AUPR.

Methods | Enzyme | GPCR | Ion Channel | Nuclear Receptor
BLM | 57.00 | 55.00 | 47.00 | 42.00
SELF-BLM | 63.00 | 60.00 | 51.00 | 45.00
PULBLM-3 | 67.00 | 64.00 | 60.00 | 58.00
PULBLM-5 | 67.00 | 64.00 | 61.00 | 59.00
PULBLM-7 | 66.00 | 65.00 | 63.00 | 59.00
DTIP-ORNN | 69.01 | 67.20 | 68.12 | 86.38
SSODL-DDIP | 71.31 | 68.97 | 70.34 | 87.94
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Alrowais, F.; Alotaibi, S.S.; Hilal, A.M.; Marzouk, R.; Mohsen, H.; Osman, A.E.; Alneil, A.A.; Eldesouki, M.I. Clinical Decision Support Systems to Predict Drug–Drug Interaction Using Multilabel Long Short-Term Memory with an Autoencoder. Int. J. Environ. Res. Public Health 2023, 20, 2696. https://doi.org/10.3390/ijerph20032696

