COVID-19-SARS-CoV-2

About Project

In the early days of the COVID-19 pandemic in India, the Science and Engineering Research Board (SERB), a statutory body under the Department of Science and Technology (DST), Govt. of India, launched the CRG Short-term Special Call on COVID-19. In response, the project "In Silico Analysis of 10000 Genomic Sequences of COVID-19 around the World including India to Identify Genetic Variability and potential Molecular Targets in Virus and Human" (Project No.: CVD/2020/000991) was approved for one year (July 2020 to 2021). The primary objectives of the project were to (a) identify the genetic variability in SARS-CoV-2 genomes around the globe, including India; (b) identify the number of virus strains using Single Nucleotide Polymorphism (SNP) data; (c) identify putative epitopes, drawn from conserved genomic regions and highly immunogenic and antigenic, as candidates for a synthetic vaccine; and (d) identify potential target proteins of the virus and the human host based on Protein-Protein Interactions and on knowledge of genetic variability. Two further objectives were also considered: (e) distinguishing coronavirus from other pathogenic viruses using machine learning, and (f) identifying virus miRNAs that are also involved in regulating human mRNA, or vice versa. Together, these objectives explore the challenges of COVID-19 from multiple directions in order to find the best possible answer for combating the spread of SARS-CoV-2.

Highlights

Whole genome analysis of more than 10000 SARS-CoV-2 virus unveils global genetic diversity and target region of NSP6

Whole genome analysis of SARS-CoV-2 is important for identifying its genetic diversity, and accurate detection of the virus is required for correct diagnosis. To address both, we first analysed 10,664 publicly available complete or near-complete SARS-CoV-2 genomes from 73 countries to find mutation points in the coding regions, classified as substitutions, deletions, insertions and single nucleotide polymorphisms (SNPs), both globally and country-wise. Multiple sequence alignment is performed together with the reference sequence from NCBI. Once the alignment is done, a consensus sequence is built and used to analyse each genomic sequence, identifying the unique mutation points globally. This results in 7209 substitutions, 11700 deletions, 119 insertions and 53 SNPs.
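The consensus and mutation-calling steps described above can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length with `-` as the gap symbol. The function names `consensus` and `call_mutations`, the toy sequences, and the choice of baseline (reference or consensus) are hypothetical and are not taken from the project's code. The sketch does not model the separate SNP category, since the criterion for it is not given here.

```python
from collections import Counter

GAP = "-"  # assumed gap symbol in the aligned sequences


def consensus(aligned):
    """Return the most frequent symbol in each alignment column."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def call_mutations(baseline, sample):
    """Compare one aligned sample with an aligned baseline, column by column.

    Returns (column, kind, baseline_symbol, sample_symbol) tuples, where
    column is a 0-based alignment column, not a genome coordinate.
    """
    if len(baseline) != len(sample):
        raise ValueError("baseline and sample must have the same length")
    mutations = []
    for col, (b, s) in enumerate(zip(baseline, sample)):
        if b == s:
            continue
        if b == GAP:
            kind = "insertion"      # sample has a base where the baseline has a gap
        elif s == GAP:
            kind = "deletion"       # sample has a gap where the baseline has a base
        else:
            kind = "substitution"   # both are bases, but they differ
        mutations.append((col, kind, b, s))
    return mutations


# Toy alignment (hypothetical data); the first row plays the role of the reference.
aligned = ["ATG-CA", "ATGTCA", "AAG-C-"]
print(consensus(aligned))
for sample in aligned[1:]:
    print(call_mutations(aligned[0], sample))
```

Collecting the tuples over all samples into a set would give the unique mutation points of each kind.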

Genome-wide analysis of 10664 SARS-CoV-2 genomes to identify virus strains in 73 countries based on single nucleotide polymorphism

Since the onslaught of SARS-CoV-2, the research community has been searching for a vaccine against the virus. During this period, however, the virus has mutated to adapt to different environmental conditions around the world, making vaccine design more challenging. In this situation, identifying virus strains is a timely and important task. We have performed a genome-wide analysis of 10664 SARS-CoV-2 genomes from 73 countries to identify SNPs and prepare a Single Nucleotide Polymorphism (SNP) dataset of SARS-CoV-2. Hierarchical clustering is then applied to this SNP data: Average Linkage and Complete Linkage, each with Jaccard and Hamming distance functions, are applied separately to identify the virus strains as clusters present in the data.
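As a rough illustration of the clustering step, the sketch below runs a naive agglomerative clustering over binary SNP vectors (1 = SNP present at a site, 0 = absent) with either distance function and either linkage. The binary encoding, the function names, the fixed cluster count `k` and the toy data are assumptions made for illustration only. The project's actual implementation is not shown here, and a library routine would be preferable on thousands of genomes.

```python
def hamming(a, b):
    """Fraction of positions at which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jaccard(a, b):
    """Jaccard dissimilarity between two binary vectors."""
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        return 0.0  # two all-zero vectors are treated as identical
    shared = sum(1 for x, y in zip(a, b) if x and y)
    return 1 - shared / union


def agglomerative(vectors, k, dist, linkage="average"):
    """Merge the two closest clusters repeatedly until k clusters remain."""
    clusters = [[i] for i in range(len(vectors))]

    def cluster_distance(c1, c2):
        pairwise = [dist(vectors[i], vectors[j]) for i in c1 for j in c2]
        if linkage == "complete":
            return max(pairwise)              # farthest pair of members
        return sum(pairwise) / len(pairwise)  # mean over all member pairs

    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = cluster_distance(clusters[a], clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy SNP matrix (hypothetical): rows are genomes, columns are SNP sites.
snps = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]
for dist in (hamming, jaccard):
    for linkage in ("average", "complete"):
        print(dist.__name__, linkage, agglomerative(snps, 2, dist, linkage))
```

Running the four distance and linkage combinations separately, as the text describes, makes it possible to check whether the recovered groups are stable across those choices.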

Immunogenicity and antigenicity based T-cell and B-cell epitopes identification from conserved regions of 10664 SARS-CoV-2 genomes

The surge of SARS-CoV-2 has caused a pandemic around the globe owing to its high transmission rate. To contain the virus, researchers are working around the clock on a solution in the form of a vaccine. The pandemic has severely affected economies and healthcare worldwide, so an efficient vaccine design is the need of the hour. Moreover, a generalised vaccine for a heterogeneous human population should take into account virus genomes from different countries. In this work, we have therefore performed a genome-wide analysis of 10,664 SARS-CoV-2 genomes from 73 countries to identify potential conserved regions for the development of a peptide-based synthetic vaccine, viz. epitopes with high immunogenic and antigenic scores. The multiple sequence alignment tool Clustal Omega is used to align the 10,664 SARS-CoV-2 genomes.
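One simple way to read "conserved regions" off an alignment is to look for runs of columns in which every genome carries the same base. The sketch below does this under stated assumptions: the strict all-identical criterion, the `min_length` threshold, the function name and the toy alignment are all hypothetical, and the project may define conservation differently. Epitope prediction and immunogenicity or antigenicity scoring are not covered.

```python
def conserved_regions(aligned, min_length=3, gap="-"):
    """Find runs of alignment columns where all sequences share one base.

    Returns half-open (start, end) column intervals of at least min_length.
    """
    regions = []
    start = None
    for i, col in enumerate(zip(*aligned)):
        is_conserved = len(set(col)) == 1 and col[0] != gap
        if is_conserved and start is None:
            start = i                      # a new conserved run begins
        elif not is_conserved and start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None                   # the run ended at column i
    # Close a run that extends to the end of the alignment.
    if start is not None and len(aligned[0]) - start >= min_length:
        regions.append((start, len(aligned[0])))
    return regions


# Toy alignment (hypothetical data).
aligned = ["ATGCCGTA", "ATGACGTA", "ATGCCGTT"]
for start, end in conserved_regions(aligned):
    print(start, end, aligned[0][start:end])
```

The conserved stretches found this way would then be translated and passed to epitope prediction tools, which is outside the scope of this sketch.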

COVID-DeepPredictor: Recurrent Neural Network to Predict SARS-CoV-2 and Other Pathogenic Viruses

COVID-19, the disease caused by the novel coronavirus (SARS-CoV-2), has become a global pandemic. The high transmission rate of this pathogenic virus demands early prediction and proper identification for subsequent treatment. However, the polymorphic nature of the virus allows it to adapt and survive in different environments, which makes it difficult to predict. There are also other pathogens, such as SARS-CoV-1, MERS-CoV, Ebola, Dengue and Influenza, so a predictor is needed to distinguish among them using their genomic information. To address this, COVID-DeepPredictor is proposed in this work, a deep learning framework for identifying an unknown sequence as one of these pathogens. COVID-DeepPredictor uses Long Short-Term Memory (LSTM), a Recurrent Neural Network, for the underlying prediction with an alignment-free technique.
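A common alignment-free way to prepare genomic sequences for a recurrent network is to split each sequence into overlapping k-mers and map them to integer tokens. The sketch below shows that preprocessing step only. The k-mer encoding, the value of `k`, the padding scheme and the function names are assumptions for illustration and may differ from what COVID-DeepPredictor actually does; the LSTM itself is not shown.

```python
from itertools import product


def build_kmer_vocab(k, alphabet="ACGT"):
    """Map every possible k-mer to a positive integer id (0 is reserved)."""
    return {"".join(p): i + 1 for i, p in enumerate(product(alphabet, repeat=k))}


def encode(sequence, k, vocab, max_len):
    """Turn a sequence into a fixed-length list of k-mer ids.

    Unknown k-mers (for example those containing N) map to 0, which is
    also used as the padding value in this sketch.
    """
    seq = sequence.upper()
    tokens = [vocab.get(seq[i:i + k], 0) for i in range(len(seq) - k + 1)]
    tokens = tokens[:max_len]                       # truncate long sequences
    return tokens + [0] * (max_len - len(tokens))   # pad short ones


vocab = build_kmer_vocab(2)
print(len(vocab))
print(encode("ATGN", 2, vocab, max_len=5))
```

The resulting integer sequences could be fed to an embedding layer followed by an LSTM in any deep learning library; no alignment to a reference is needed at any point.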

Publications

N. Ghosh, I. Saha, N. Sharma, A. Gambin, "Interactome-Based Machine Learning Predicts Potential Therapeutics for COVID-19", ACS Omega, Vol. 08, pp. 13840-13854, 2023. [Impact Factor: 4.13] [Source Link]
N. Ghosh, I. Saha, and D. Plewczynski, "Unveiling the Biomarkers of Cancer and COVID-19 and Their Regulations in Different Organs by Integrating RNA-Seq Expression and Protein-Protein Interactions", ACS Omega, Vol. 07, pp. 43589-43602, 2022. [Impact Factor: 4.13] [Source Link]
N. Ghosh, S. Nandi, I. Saha, "A Review on Evolution of Emerging SARS-CoV-2 Variants based on Spike Glycoprotein", International Immunopharmacology, Vol. 105, pp. 108565, 2022. [Impact Factor: 4.932] [Source Link]
N. Ghosh, I. Saha, N. Sharma, "Palindromic Target Site Identification in SARS-CoV-2, MERS-CoV and SARS-CoV-1 by Adopting CRISPR-Cas Technique", Gene, Vol. 818, pp. 146136, 2022. [Impact Factor: 3.688] [Source Link]
N. Ghosh, I. Saha, S. Nandi, N. Sharma, "Characterisation of SARS-CoV-2 Clades based on Signature SNPs unveils Continuous Evolution", Methods, Vol. 203, pp. 282-296, 2022. [Impact Factor: 4.647] [Source Link]
D. Santoni, N. Ghosh, I. Saha, "An entropy-based study on mutational trajectory of SARS-CoV-2 in India", Infection, Genetics and Evolution, Vol. 97, pp. 105154, 2022. [Impact Factor: 3.342] [Source Link]
I. Saha, N. Ghosh, N. Sharma, S. Nandi, "Hotspot Mutations in SARS-CoV-2", Frontiers in Genetics, Vol. 12, pp. 753440, 2021. [Impact Factor: 4.599] [Source Link]
N. Ghosh, I. Saha, N. Sharma, "Interactome of Human and SARS-CoV-2 Proteins to Identify Human Hub Proteins Associated with Comorbidities", Computers in Biology and Medicine, Vol. 138, pp. 104889, 2021. [Impact Factor: 4.589] [Source Link]
N. Ghosh, I. Saha, J. P. Sarkar, U. Maulik, "Strategies for COVID-19 Epidemiological Surveillance in India: Overall Policies till June 2021", Frontiers in Public Health, Vol. 9, pp. 708224, 2021. [Impact Factor: 3.709] [Source Link]
N. Ghosh, N. Sharma, I. Saha, "Immunogenicity and Antigenicity based T-cell and B-cell Epitopes Identification from Conserved Regions of 10664 SARS-CoV-2 Genomes", Infection, Genetics and Evolution, Vol. 92, pp. 104823, 2021. [Impact Factor: 3.342] [Source Link]
N. Ghosh, I. Saha, N. Sharma, S. Nandi, D. Plewczynski, "Genome-wide Analysis of 10664 SARS-CoV-2 Genomes to Identify Virus Strains in 73 Countries based on Single Nucleotide Polymorphism", Virus Research, Vol. 298, pp. 198401, 2021. [Impact Factor: 3.303] [Source Link]
I. Saha, N. Ghosh, A. Pradhan, N. Sharma, D. Maity and K. Mitra, "Whole Genome Analysis of more than 10000 SARS-CoV-2 Virus Unveils Global Genetic Diversity and Target region of NSP6", Briefings in Bioinformatics, Vol. 22, pp. 1106-1121, 2021. [Impact Factor: 11.622] [Source Link]
I. Saha, N. Ghosh, D. Maity, A. Seal and D. Plewczynski, "COVID-DeepPredictor: Recurrent Neural Network to Predict SARS-CoV-2 and Other Pathogenic Viruses", Frontiers in Genetics, Vol. 12, pp. 569120, 2021. [Impact Factor: 4.599] [Source Link]
J. P. Sarkar, I. Saha, D. Maity, A. Seal and U. Maulik, "Topological Analysis for Sequence Variability: Case Study on more than 2K SARS-CoV-2 sequences of 54 countries in comparison with SARS-CoV-1 and MERS-CoV", Infection, Genetics and Evolution, Vol. 88, pp. 104708, 2020. [Impact Factor: 3.342] [Source Link]
N. Ghosh, N. Sharma, I. Saha and S. Saha, "Genome-wide Analysis of Indian SARS-CoV-2 Genomes to Identify T-cell and B-cell Epitopes from Conserved Regions based on Immunogenicity and Antigenicity", International Immunopharmacology, Vol. 91, pp. 107276, 2020. [Impact Factor: 4.932] [Source Link]
I. Saha, N. Ghosh, D. Maity, N. Sharma and K. Mitra, "Inferring the genetic variability in Indian SARS-CoV-2 genomes using consensus of multiple sequence alignment techniques", Infection, Genetics and Evolution, Vol. 75, pp. 104522, 2020. [Impact Factor: 3.342] [Source Link]
I. Saha, N. Ghosh, D. Maity, N. Sharma, J. P. Sarkar and K. Mitra, "Genome-wide analysis of Indian SARS-CoV-2 genomes for the identification of genetic mutation and SNP", Infection, Genetics and Evolution, Vol. 85, pp. 104457, 2020. [Impact Factor: 3.342] [Source Link]

Members

Dr. Indrajit Saha, Principal Investigator
Dr. Nimisha Ghosh, Project Collaborator
Nikhil Sharma, Project Intern
Jnanendra Prasad Sarkar, Ph.D. Student
Suman Nandi, Junior Research Fellow
Debasree Maity, Project Intern
