A Review on Evolution of Emerging SARS-CoV-2
Variants based on Spike Glycoprotein


Nimisha Ghosh1,+, Suman Nandi2,+, Indrajit Saha2,+,*


1Faculty of Mathematics, Informatics, and Mechanics, University of Warsaw, Warsaw, Poland
2Department of Computer Science and Engineering, National Institute of Technical Teachers' Training and Research, Kolkata, India
*Correspondence should be addressed to the team leader: indrajit@nitttrkol.ac.in
+These team members contributed equally to this work



ABSTRACT

Since the emergence of SARS-CoV-2 in Wuhan in December 2019, many variants of the virus have arisen over time. Some of these variants alter the transmissibility of the virus and may also affect diagnosis, therapeutics and even vaccines, raising particular concern in the scientific community. Lineages or variants with mutations in the spike glycoprotein are the primary focus, as the spike is the main target of neutralising antibodies. SARS-CoV-2 infects humans by using the receptor-binding domain of the spike glycoprotein to bind the human ACE2 receptor. It is therefore of utmost importance to study these variants and their corresponding spike glycoprotein mutations in order to understand their characteristics. The 11 important variants identified so far are B.1.1.7 (Alpha), B.1.351 (Beta), B.1.525 (Eta), B.1.427/B.1.429 (Epsilon), B.1.526 (Iota), B.1.617.1 (Kappa), B.1.617.2 (Delta), C.37 (Lambda), P.1 (Gamma), P.2 (Zeta) and P.3 (Theta). These variants have 61 unique mutations in the spike glycoprotein. Notable among these 61 are K417N, L452R, S477N, E484K/Q, N501Y, D614G, P681H/R, Y144-, H69- and V70-. To analyse these mutations, multiple sequence alignment of 77681 SARS-CoV-2 genomes from 98 countries, covering January 2020 to July 2021, is performed using MAFFT, followed by phylogenetic analysis. The characteristics of newly emerging variants are also discussed in detail. Thereafter, the evolution of these mutation points, both individually and within their respective variants, is visualised and their characteristics are reported. Moreover, to characterise the non-synonymous mutation points (substitutions), their effect on protein structural stability is evaluated using I-Mutant 2.0.
Thus, this work provides a comprehensive review of the emerging variants and the characteristics of the corresponding mutation points, along with the effects of vaccines and therapeutics on the variants.
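
The mutation labels used in the abstract (for example N501Y for a substitution and Y144- for a deletion) can be read directly off an aligned pair of amino-acid sequences. The following is a minimal Python sketch of that idea, not the authors' MATLAB implementation. The function name `call_mutations` is hypothetical, the input is assumed to be a reference and a query sequence that are already aligned (for instance by MAFFT) with `-` marking gaps, and the sequences in the usage line are toy strings rather than real spike data.

```python
from typing import List


def call_mutations(ref_aln: str, qry_aln: str) -> List[str]:
    """List mutations of an aligned query relative to an aligned reference.

    Positions are counted along the reference, so alignment columns where
    the reference has a gap (insertions in the query) do not advance the
    position and are skipped in this sketch.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")

    mutations = []
    ref_pos = 0
    for ref_res, qry_res in zip(ref_aln, qry_aln):
        if ref_res == "-":
            continue  # insertion relative to the reference: not reported here
        ref_pos += 1
        if qry_res == ref_res or qry_res == "X":
            continue  # unchanged residue, or ambiguous residue (assumed 'X')
        # Substitutions come out as e.g. 'K2N'; deletions as e.g. 'H4-'.
        mutations.append(f"{ref_res}{ref_pos}{qry_res}")
    return mutations


# Toy example (not real spike sequence): one substitution and one deletion.
print(call_mutations("MKYHV", "MNY-V"))  # ['K2N', 'H4-']
```

Counting positions along the reference keeps the labels comparable across genomes, which is what allows the same mutation to be tallied over many aligned sequences.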

Supplementary


dataset


code


The algorithm is implemented in MATLAB. The code is available in zipped form here. Use of the code/technique/algorithm is free for academic and non-commercial purposes. If you use this code/technique/algorithm, please cite this work.

For any queries regarding the algorithm, please email indrajit@nitttrkol.ac.in

Disclaimer:
The dataset used to conduct this research was obtained from public databases such as GISAID. NITTTR, Kolkata therefore does not bear any responsibility.