An explainable phosphorylation peptide associated with SARS-CoV-2 infection employing a 2D Convolutional Neural Network (2DCNN)
DOI: https://doi.org/10.63163/jpehss.v3i1.157
Keywords: Phosphorylation sites, SARS-CoV-2, 2DCNN, DDE, 2Deep-IPs
Abstract
Phosphorylation is a post-translational modification that plays a critical role in regulating many cellular processes, including viral infections such as that caused by SARS-CoV-2, the virus responsible for the COVID-19 pandemic. Identifying and characterizing phosphorylation sites on SARS-CoV-2 proteins could provide valuable insight into the mechanisms underlying the virus's pathogenesis and may lead to new therapeutic strategies for COVID-19. Computational predictors of phosphorylation sites have received considerable attention recently; however, these methods are limited in their ability to find phosphorylation sites in SARS-CoV-2-infected host cells. Viral-host protein-protein interactions alter phosphorylation and may influence the subcellular localization of host proteins. In this work, we propose a predictor called 2Deep-IPs, which uses a two-dimensional convolutional neural network (2D-CNN) to identify these phosphorylation sites. We extracted amino acid composition-based features from protein sequences using the dipeptide deviation from expected mean (DDE) descriptor. We then used the SHapley Additive exPlanations (SHAP) algorithm to rank the features that carry the most relevant biological information. The proposed model achieved its best performance with the 15 top-ranked features. Under 10-fold cross-validation, 2Deep-IPs achieved an accuracy of 96.71%, a sensitivity (Sen) of 94.46%, a specificity (Spec) of 99.69%, and a Matthews correlation coefficient (MCC) of 0.939. On the independent dataset, it achieved an accuracy of 95.70%, a sensitivity of 97.83%, a specificity of 91.89%, and an MCC of 0.782. These results indicate that 2Deep-IPs outperforms other phosphorylation site predictors on both cross-validation and independent testing.
We hope that the proposed 2Deep-IPs will inform the development of other methods for predicting general phosphorylation sites.
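To make the DDE descriptor named in the abstract concrete, the following is a minimal sketch of how a 400-dimensional DDE vector might be computed for a single peptide. It assumes the commonly cited definition of DDE (observed dipeptide composition, compared against a theoretical mean and variance derived from codon counts in the standard genetic code); the abstract does not give the formula, so this is not necessarily the authors' exact implementation. The function name `dde` and the example peptide are hypothetical.

```python
import math
from itertools import product

# Number of codons encoding each amino acid in the standard genetic code
# (61 sense codons in total, stop codons excluded).
CODONS = {
    "A": 4, "C": 2, "D": 2, "E": 2, "F": 2, "G": 4, "H": 2, "I": 3,
    "K": 2, "L": 6, "M": 1, "N": 2, "P": 4, "Q": 2, "R": 6, "S": 6,
    "T": 4, "V": 4, "W": 1, "Y": 2,
}
TOTAL_CODONS = sum(CODONS.values())  # 61
AMINO_ACIDS = sorted(CODONS)
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 pairs


def dde(sequence):
    """Return the 400 DDE values for one peptide (hypothetical helper).

    Assumed definition, for each dipeptide "ab" in a peptide of length L:
      Dc  = count("ab") / (L - 1)            observed composition
      Tm  = (C_a / 61) * (C_b / 61)          theoretical mean
      Tv  = Tm * (1 - Tm) / (L - 1)          theoretical variance
      DDE = (Dc - Tm) / sqrt(Tv)
    """
    seq = sequence.upper()
    n_pairs = len(seq) - 1
    if n_pairs < 1:
        raise ValueError("sequence must contain at least two residues")

    # Count overlapping dipeptides; pairs with non-standard residues are skipped.
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1

    features = []
    for dp in DIPEPTIDES:
        dc = counts[dp] / n_pairs
        tm = (CODONS[dp[0]] / TOTAL_CODONS) * (CODONS[dp[1]] / TOTAL_CODONS)
        tv = tm * (1 - tm) / n_pairs
        features.append((dc - tm) / math.sqrt(tv))
    return features


# Hypothetical peptide window, used only to illustrate the call.
vector = dde("MKTAYIAKQRSLSPEDLVNAG")
```

Each value measures how far a dipeptide's observed frequency deviates from what codon usage alone would predict, in units of the theoretical standard deviation. The abstract does not state how the resulting vector is arranged as input to the 2D-CNN, so that step is left out of the sketch.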