

Method for Analysing Protein Structure of SARS-COV-2 M Protein for Possible Clues Regarding Virion Stability, Longevity and Transmission   

Saam Hasan*, Muhammad Maqsud Hossain* (*contributed equally to this work)


Abstract

The Severe Acute Respiratory Syndrome Coronavirus 2, or SARS-CoV-2, is the causative agent behind the 2020 global pandemic. With the number of infections rising well above 2.7 million and confirmed deaths above 122,000 as of 25th April 2020, it has become the paramount health concern for the global community. The SARS-CoV-2 genome has since been sequenced and its proteome identified. In this study, we examined the SARS-CoV-2 M protein and compared it to that of its close relative, the Severe Acute Respiratory Syndrome-Related Coronavirus (SARS-CoV). The M protein is known to play a significant role in the virion structure of coronaviruses. The rationale behind this study was that, since the major risk factor associated with SARS-CoV-2 has been its rapid transmission, the viral structure and architecture may hold clues to a structural stability that prolongs the time for which the virion can survive free of a host. We carried out protein sequence alignment between the M proteins of the two viruses using NCBI BLAST, followed by 3D structure prediction using SWISS-MODEL. We then analysed the specific amino acid changes and how they may account for functional differences between the two proteins. We found several notable differences between the SARS-CoV-2 and SARS-CoV M proteins, including amino acid changes from non-polar to polar residues in regions important for anchoring the protein in the envelope membrane.

Keywords: Proteomics, Systems Biology, M protein, SARS-CoV-2, SARS-CoV


The SARS-CoV-2 novel coronavirus has been the cause of a worldwide pandemic in 2020. The novel strain, originating from Wuhan, China, has, as of the 25th April World Health Organization report, infected over 2.7 million people and led to the deaths of over 187,000 individuals worldwide (Coronavirus disease (COVID-2019) situation report-96, 2020). It has led to considerable disruptions in the economic and psychological welfare of the world population, on top of the continuously increasing loss of life (Chen et al., 2020). Efforts at identifying possible solutions to combat the virus have thus far ranged from basic precautionary awareness campaigns to suggestions that nucleotide-based drugs currently being tested for other diseases could serve as therapeutic agents (Martinez, 2020). The genome of the SARS-CoV-2 novel coronavirus has been decoded and its expected proteome already established (RefSeq ID: NC_045512.2). However, up until this point and to the best of our knowledge, not much progress has been made in identifying unique molecular features that are exclusive to this strain of coronavirus. Hence, in this study we attempted to study the protein sequence of the major SARS-CoV-2 architectural protein in order to identify unique advantageous features. We chose to focus our efforts on the M protein as it is the most abundant protein in the viral nucleocapsid and is believed to be responsible for maintaining the virion in its characteristic shape (Nal et al., 2005). This forms the basis of our investigative rationale: is there any particular feature that aids the spread of this strain? The percentage of infected patients who have died from this virus has been estimated to be approximately 6.9% as of this writing (WHO, 2020b). Furthermore, the primary susceptible groups are the immunocompromised.
This leads us to believe that the primary focus needs to be on the mechanism of spreading, as the rapid rates at which this strain has infected communities across the world have arguably been its most dangerous aspect. The coronaviruses are known to contain four main structural proteins, the S, M, E, and N proteins (Fehr and Perlman, 2015). The S or spike protein is responsible for gaining entry into the endoplasmic reticulum following infection (Delmas and Laude, 1990; Beniac et al., 2006). The N protein resides inside the nucleocapsid and is an RNA binding protein (Chang et al., 2006; Hurst et al., 2009). The M and E proteins are both transmembrane proteins. The M protein, a three-pass transmembrane protein, is particularly significant as it is believed to be responsible for giving the virion its shape and maintaining it (Armstrong et al., 1984). The reasoning behind developing this protocol lies in the similarity in structure and function between SARS-CoV and SARS-CoV-2. The former was shown to be ineffective at transmission, thus limiting the epidemic caused by it (Peiris et al., 2003). As virion structure is a vital factor in determining how long a virus can survive free of a host and thus spread, we believe a comparison of the key architectural protein of the two viruses can provide important clues towards understanding SARS-CoV-2’s more efficient mechanism of transmission.


  1. NCBI Blast (



  1. Selecting Target Protein

    Target protein was selected based on its importance to virion structure. M Protein is the most abundant protein in viral nucleocapsid. It plays a key role in maintenance of virion shape and architecture. Hence it was chosen as the target of analysis based on the fact that its unique features may play a crucial role in facilitating the survival and transmission of the virion.

  2. Sequence Alignment

    1. RefSeq M protein amino acid sequences were obtained from NCBI Protein.

    2. SARS-CoV-2 and SARS-CoV M protein sequences were aligned using BLASTp. All parameters were kept at their defaults.

    3. Amino acid substitutions between the two proteins were checked and correlated with their functional significance to the protein.

  3. Protein Modelling

    3D structures for both proteins were predicted using SWISS-MODEL. It should be noted that reference structures exist for the SARS-CoV M protein but we chose to put it through SWISS-MODEL so we could compare the two proteins after they had been modelled by the same tool, so as to eliminate any discrepancy resulting from use of different software. Homology modelling was used. At first a template search was carried out to identify known structures similar to our proteins of interest. The Bidirectional sugar transporter SWEET2b (PDB accession 5ctg.1.A for SARS-CoV-2 M protein and 5ctg.1.B for the SARS-CoV M protein) was eventually used as the template for building our models. The percentage identity was 14.29 for both our proteins, along with a GMQE of 0.08 and QSQE of 0.00.
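
The substitution check in the Sequence Alignment step can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure the authors ran: the function name `summarise_alignment`, the `AlignmentSummary` container, and the two toy sequences are all hypothetical, and the toy strings are not the real M protein sequences. The sketch assumes the two aligned rows have been copied out of the BLASTp result with `-` marking gaps.

```python
from dataclasses import dataclass, field


@dataclass
class AlignmentSummary:
    """Counts and per-position differences for one pairwise alignment."""
    identities: int = 0
    mismatches: int = 0
    gaps: int = 0
    # Each entry: (query_position, query_residue, subject_position, subject_residue)
    substitutions: list = field(default_factory=list)
    indels: list = field(default_factory=list)


def summarise_alignment(query_aln, subject_aln, gap="-"):
    """Walk two aligned rows column by column and record every difference.

    Positions are 1-based and counted separately for each sequence, so an
    insertion in one sequence shifts its numbering relative to the other.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    summary = AlignmentSummary()
    q_pos = s_pos = 0
    for q_res, s_res in zip(query_aln, subject_aln):
        if q_res != gap:
            q_pos += 1
        if s_res != gap:
            s_pos += 1
        if q_res == gap or s_res == gap:
            summary.gaps += 1
            summary.indels.append((q_pos, q_res, s_pos, s_res))
        elif q_res == s_res:
            summary.identities += 1
        else:
            summary.mismatches += 1
            summary.substitutions.append((q_pos, q_res, s_pos, s_res))
    return summary


# Toy example only (hypothetical sequences, not the real M proteins).
toy = summarise_alignment("MASNGT", "MA-NGA")
print(toy.identities, toy.mismatches, toy.gaps)
print(toy.substitutions)
print(toy.indels)
```

Each reported substitution can then be looked up against the known domain boundaries of the protein, which is the manual correlation described in step 3 of the Sequence Alignment section.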

Data analysis

  1. We obtained the M protein sequences for the SARS Related Coronavirus and SARS-CoV-2 and carried out a pairwise sequence alignment using NCBI BLAST (Altschul et al., 1990). Figure 1 shows the results of the alignment. Afterwards we took the sequences of both proteins and predicted their structures using SWISS-MODEL (Biasini et al., 2014).

        The E value for the alignment was 3e-155, indicating very high similarity. There were 201 identical amino acids, 214 positive hits, and a single gap. As Figure 1 shows, there were a total of 20 mismatches and 1 gap between the two sequences. Most notably, there were multiple amino acid substitutions in the transmembrane domains where bulkier non-polar amino acids in the SARS-CoV protein were replaced by more polar or less bulky ones in the SARS-CoV-2 protein. In addition, we found a serine inserted at position 4 of the SARS-CoV-2 M protein, which its SARS-CoV counterpart lacks.

    Figure 1. BLAST pairwise protein sequence alignment of SARS-CoV-2 M Protein with SARS-CoV M Protein.
    SARS-CoV-2 M protein has a serine inserted at position 4 which the SARS-CoV protein does not contain. In addition, a number of substitutions can be observed in the transmembrane regions. At position 33 the SARS-CoV-2 protein has a cysteine, whereas the equivalent position on the SARS-CoV protein is occupied by a methionine. At position 30, SARS-CoV-2 contains a threonine, whereas the corresponding position on SARS-CoV has an alanine. At position 76 on the second transmembrane domain, the SARS-CoV protein contains a valine, compared to an isoleucine on the SARS-CoV-2 protein.

    Figure 2. Location of amino acid substitutions and insertions within the SARS-CoV M protein.

    Serine provides an extra OH group near the N-terminal end of the protein, which makes up its glycosylated ectodomain (Nal et al., 2005). The ectodomain is the portion of the protein that lies beyond the envelope membrane, on the outside of the virion. Turning to the transmembrane regions, at position 33 SARS-CoV-2 has a cysteine, whereas the equivalent position on SARS-CoV is occupied by a methionine. At position 30, SARS-CoV-2 contains a threonine, whereas the corresponding position on SARS-CoV has an alanine. At position 76 on the second transmembrane domain, the SARS-CoV protein contains a valine, compared to an isoleucine on the SARS-CoV-2 protein. Starting from the latter, isoleucine has a bulkier sidechain than valine, which in the hydrophobic interior of an envelope membrane could produce more hydrophobic interactions with the fatty acid chains of the lipid bilayer, possibly stabilizing the membrane to a greater degree. The methionine to cysteine and alanine to threonine substitutions could also impart functional changes. Both would replace non-polar sidechains with polar groups. This may seem counterintuitive for stabilizing a membrane structure; however, it could theoretically allow for inter-chain bonds between protein molecules.

        The OH group introduced by threonine could contribute to hydrogen bonding, while the thiol group introduced by cysteine could give rise to a disulphide bond. We believe this may hold special importance, especially since our modelling run predicted the SARS-CoV-2 M Protein to form homo-trimers, whereas the SARS-CoV M protein is known to form homo-dimers (Godet et al., 1992). Figures 3A and 3B show the predicted structures for the SARS-CoV-2 and SARS-CoV M Proteins respectively.

    Figure 3. A. SARS-CoV-2 M Protein predicted 3D structure. Based on homology modelling, it is predicted to form a homo-trimer as its most likely and stable quaternary structure. B. SARS-CoV M Protein 3D structure. Using the same modelling technique, this is predicted to form only a homo-dimer comprising 2 subunits.

  2. Lastly, we looked up known protein domains and epitopes within the SARS Related Coronavirus M Protein and checked for any changes in those regions in the SARS-CoV-2 M Protein. The most significant epitope we found and considered was the Mn2 epitope that binds to human Cytotoxic T Lymphocytes (Liu et al., 2010). However, there were no amino acid substitutions in that epitopic region in the SARS-CoV-2 protein.
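
The alignment figures and the side-chain reasoning above can be reproduced with a short Python sketch. It is illustrative only: the helper names (`percent_of_alignment`, `side_chain_class`, `describe_substitution`) are hypothetical, and the polarity grouping is an assumption, since textbooks differ on where cysteine belongs. Here cysteine is placed in the polar group, which is the reading used in the text. The counts (201 identities, 20 mismatches, 1 gap, 214 positives) and the three substitutions are taken from the alignment described above.

```python
# Side-chain classes: an assumed textbook-style grouping, not a universal standard.
# Cysteine is grouped as polar here to match the reading used in the text;
# some references classify it as non-polar.
NONPOLAR = set("GAVLIMPFW")
POLAR_UNCHARGED = set("STCNQY")
CHARGED = set("DEKRH")


def side_chain_class(residue):
    """Return the assumed polarity class for a one-letter amino acid code."""
    residue = residue.upper()
    if residue in NONPOLAR:
        return "nonpolar"
    if residue in POLAR_UNCHARGED:
        return "polar uncharged"
    if residue in CHARGED:
        return "charged"
    raise ValueError(f"unknown residue: {residue!r}")


def describe_substitution(sars_cov_residue, sars_cov_2_residue):
    """Describe the class change going from SARS-CoV to SARS-CoV-2."""
    before = side_chain_class(sars_cov_residue)
    after = side_chain_class(sars_cov_2_residue)
    return f"{before} -> {after}"


def percent_of_alignment(count, aligned_length):
    """Express a count of alignment columns as a percentage of the alignment."""
    return 100.0 * count / aligned_length


# Counts reported in the text for the BLASTp alignment.
identities, mismatches, gaps, positives = 201, 20, 1, 214
aligned_length = identities + mismatches + gaps  # 222 columns

print(f"identity:  {percent_of_alignment(identities, aligned_length):.1f}%")
print(f"positives: {percent_of_alignment(positives, aligned_length):.1f}%")

# (position, SARS-CoV residue, SARS-CoV-2 residue) as described in the text.
substitutions = [(30, "A", "T"), (33, "M", "C"), (76, "V", "I")]
for position, old, new in substitutions:
    print(position, f"{old}->{new}", describe_substitution(old, new))
```

Under this grouping the changes at positions 30 and 33 are reported as non-polar to polar, while the valine to isoleucine change at position 76 stays within the non-polar class and differs only in side-chain bulk.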


The goal of this study was to design a simple method for analysing a viral structural protein and understanding the role it may play in the survival and stability of the virion. The two main requirements for using this protocol are A) a structural protein that is known to play a significant role in viral architecture and B) a close relative of the virus that expresses the same protein but is known to be less easily transmitted. If both of these are available, this method can provide a simple framework for making an initial deduction on the importance of said structural protein. As for the results of this particular study, they lead us to a possible answer to the question we posed at the very start. We believe there might be additional bonding interactions in the SARS-CoV-2 M protein that allow its structure to remain more stable and survive for longer. In general, this is what one would expect to happen with the introduction of additional protein-protein bonding between polypeptide chains that traverse a lipid bilayer (Alberts, 2008). The serine at the N terminus is another interesting observation. The N terminus of this protein is expected to be on the outside of the envelope (Nal et al., 2005) and possibly exposed to the air and outside interactions.

    Recent news reports have described the ability of SARS-CoV-2 viral particles to survive for longer on metal surfaces, although the actual study has not yet been published (Kumar and Salzman, 2020). Metals have lattice structures where the positive ions are in the middle and the electrons are delocalised and free to move about (Atkins and Shriver, 2009). A polar group such as an –OH, where the hydrogen atom would likely carry a partial positive charge, may open the possibility of some kind of bio-electrostatic association that could help the virion adhere to these surfaces. We are not sure, however, whether or how this could preserve the architectural integrity of the virion.

    The major difference between the SARS-CoV and the SARS-CoV-2 outbreaks is the spread of the latter. SARS-CoV only affected 26 countries, with around 8000 confirmed cases (WHO, 2020a). Granted, this needs to factor in the increase in air travel in recent times, which, as per Statista, is almost double what it was in 2003, when the SARS epidemic happened. Nonetheless, based on the evidence and our molecular level analysis, we believe the M protein should be a candidate for future investigations that could shed more light on how the SARS-CoV-2 virus survives and functions.

Acknowledgments


We would like to acknowledge the North South University Department of Biochemistry and Microbiology for supporting us.

Competing interests

The authors declare no competing interests.

References


  1. Alberts, B. (2008). Molecular biology of the cell. New York: Garland Science Taylor & Francis.
  2. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. and Lipman, D. J. (1990). Basic local alignment search tool. J Mol Biol 215(3): 403-410.
  3. Armstrong, J., Niemann, H., Smeekens, S., Rottier, P. and Warren, G. (1984). Sequence and topology of a model intracellular membrane protein, E1 glycoprotein, from a coronavirus. Nature 308(5961): 751-752.
  4. Atkins, P. and Shriver, D. (2009). Shriver & Atkins' Inorganic Chemistry. p.74.
  5. Beniac, D. R., Andonov, A., Grudeski, E. and Booth, T. F. (2006). Architecture of the SARS coronavirus prefusion spike. Nat Struct Mol Biol 13(8): 751-752.
  6. Biasini, M., Bienert, S., Waterhouse, A., Arnold, K., Studer, G., Schmidt, T., Kiefer, F., Cassarino, T., Bertoni, M., Bordoli, L. and Schwede, T. (2014). SWISS-MODEL: Modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res 42(W1): W252-W258.
  7. Chang, C. K., Sue, S. C., Yu, T. H., Hsieh, C. M., Tsai, C. K., Chiang, Y. C., Lee, S. J., Hsiao, H. H., Wu, W. J., Chang, W. L., Lin, C. H. and Huang, T. H. (2006). Modular organization of SARS coronavirus nucleocapsid protein. J Biomed Sci 13(1): 59-72.
  8. Chen, Q., Quan, B., Li, X., Gao, G., Zheng, W., Zhang, J., Zhang, Z., Liu, C., Li, L., Wang, C., Zhang, G., Li, J., Dai, Y., Yang, J. and Han, W. (2020). A report of clinical diagnosis and treatment of nine cases of coronavirus disease. J Med Virol 92(6): 683-687.
  9. Delmas, B. and Laude, H. (1990). Assembly of coronavirus spike protein into trimers and its role in epitope expression. J Virol 64(11): 5367-5375.
  10. Fehr, A. R. and Perlman, S. (2015). Coronaviruses: An Overview of Their Replication and Pathogenesis. In: Maier, E., Bickerton, E. and Britton, P. (Eds.). Coronaviruses: Methods and Protocols vol 1282. Humana Press, New York, NY.
  11. Godet, M., L'Haridon, R., Vautherot, J.-F. and Laude, H. (1992). TGEV corona virus ORF4 encodes a membrane protein that is incorporated into virions. Virology 188(2): 666-675.
  12. Hurst, K. R., Koetzner, C. A. and Masters, P. S. (2009). Identification of in vivo-interacting domains of the murine coronavirus nucleocapsid protein. J Virology 83(14): 7221-7234.
  13. Kumar, V. and Salzman, S. (2020). SARS-COV-2 can last a few days on surfaces, according to new experiment findings. ABC News.
  14. Liu, W., Sun, Y., Qi, J., Chu, F., Wu, H., Gao, F., Li, T., Yan, J. and Gao, G. (2010). The membrane protein of severe acute respiratory syndrome coronavirus acts as a dominant immunogen revealed by a clustering region of novel functionally and structurally defined cytotoxic t-lymphocyte epitopes. J Infect Dis 202: 1171-1180.
  15. Martinez, M. A. (2020). Compounds with therapeutic potential against novel respiratory 2019 coronavirus. Antimicrob Agents Chemother AAC.00399-00320.
  16. Nal, B., Chan, C. M., Kien, F., Siu, L., Tse, J., Chu, K., Kam, Y. W., Staropoli, I., Crescenzo-Chaigne, B., Escriou, N., Werf, S., Yuen, K. Y. and Altmeyer, R. (2005). Differential maturation and subcellular localization of severe acute respiratory syndrome coronavirus surface proteins S, M and E. J Gen Virol 86: 1423-1434.
  17. Peiris, J., Yuen, K., Osterhaus, A. and Stöhr, K. (2003). The Severe Acute Respiratory Syndrome. New England J Medicine 349(25): 2431-2441.
  18. WHO. (2020a). WHO | SARS (Severe Acute Respiratory Syndrome). [online] Available at: <> [Accessed 26 April 2020].
  19. WHO. (2020b). Coronavirus Disease (COVID-2019) Situation Report-96. [online] Available at: <> [Accessed 26 April 2020].
Copyright: © 2022 The Authors; exclusive licensee Bio-protocol LLC.
How to cite: Hasan, S. and Hossain, M. M. (2022). Method for Analysing Protein Structure of SARS-COV-2 M Protein for Possible Clues Regarding Virion Stability, Longevity and Transmission. Bio-101: e3830. DOI: 10.21769/BioProtoc.3830.
