Assembling the Viral Macromolecular Model

Piecing the puzzle together.

The Results from the Pieces

Having established the component parts of the virion model, the next stage was arranging them. The SARS-CoV-2 virion is flexible in shape but roughly spherical, measuring around 100 nanometres (nm) in diameter (1). Estimates of the size vary between 60 and 120 nm (2), but average estimates converge at 80-100 nm.

Using our 7 by 7 nm (70 by 70 angstrom) tile, we arranged 44 copies end to end in a circle, with the horizontal plane of each tile rotated by around 8.14 degrees from the next. This created a “circle” measuring 98 nm in diameter. Two copies of the “circle” were created and suitably rotated to give the Y and Z axes of the “sphere”. The approximated sphere was then treated as a stack of approximated “circles”, sliced horizontally into 21 pieces, as one might stack an extravagant many-tiered wedding cake. The number of lipid bilayer tiles in each “circle” was, in sequence, 2, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 40, 36, 32, 28, 24, 20, 16, 12, 8, and 2. This created an approximated sphere defined by 480 tiles, each of which was given a sequential address. Each tile had a surface area of 7 x 7 nm (49 nm2), for a total defined surface area of 23520 nm2. This meant nearly 78% of the theoretical sphere (defined by the 98 nm diameter, about 30172 nm2) had been addressed; the remaining 22% was filled in with lipid molecules as appropriate.
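The arithmetic behind the tiling can be checked with a short script. This is only a sketch of the bookkeeping, not the procedure used to build the model; the variable names are illustrative, and the equator is assumed to be the ring with the most tiles.

```python
import math

TILE_NM = 7.0  # tile edge length in nm, from the text

# Tiles per horizontal ring, top to bottom, as listed in the text.
RING_TILES = [2, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44,
              40, 36, 32, 28, 24, 20, 16, 12, 8, 2]

equator_tiles = max(RING_TILES)                    # 44 tiles on the widest ring
diameter_nm = equator_tiles * TILE_NM / math.pi    # circumference / pi
step_deg = 360.0 / equator_tiles                   # rotation between neighbouring tiles

total_tiles = sum(RING_TILES)                      # number of surface addresses
tiled_area_nm2 = total_tiles * TILE_NM ** 2        # area covered by tiles
sphere_area_nm2 = math.pi * 98.0 ** 2              # ideal sphere, 98 nm diameter
coverage = tiled_area_nm2 / sphere_area_nm2

print(f"diameter   : {diameter_nm:.1f} nm")
print(f"step angle : {step_deg:.2f} degrees")
print(f"tiles      : {total_tiles}")
print(f"tiled area : {tiled_area_nm2:.0f} nm2")
print(f"coverage   : {coverage:.1%}")
```

An even split of 360 degrees over 44 tiles gives a step of about 8.18 degrees, close to the figure of around 8.14 degrees quoted above.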

How the virion surface was defined.

The next task was to populate the model with the appropriate proportions of Spike (S), Membrane (M), and Envelope (E) proteins. The literature currently offers only vague and highly variable estimates of the numbers and proportions of virion proteins. Generally, the M protein is considered the most abundant unit, followed by the Nucleocapsid (N) protein, the Spike protein, and finally a small number of E proteins (3). Interestingly, a proteomic analysis of infected cells, aimed at novel Covid-19 diagnostic tests, suggested the N protein to be the most abundant, making up 90% of the SARS-CoV-2 proteome mass and corresponding to around 1200 N proteins per virion (3).

In 2010 Neuman and colleagues, studying SARS-CoV-1, mouse hepatitis virus (MHV), and feline coronavirus, suggested that a maximum of eight dimeric M proteins can interact with a maximum of four N proteins and one trimeric Spike protein, that is 8M2:4N:1S3 (4). However, they acknowledged that the ratios estimated in the literature at the time varied from 3M2:1N to 1M2:1N. Their model suggested approximately 90 Spike proteins per average virion. Because their model used multiple different conformations of the M protein, it included 1100 M2 units (4). Neuman estimated the average diameter of SARS-CoV-1 to be around 90 nm, giving a spherical surface area of 25446.9 nm2. They reported that the unit cell defining the occurrence of M proteins is a rhombus measuring 4 nm by 4.5 nm; this could alternatively be defined as a circle with a radius of 2.125 nm, giving a spacing circle about each M protein with an area of 14.19 nm2. Dividing the total virion surface area by the spacing circle area gives an estimated 1800 M2 proteins per virus. Alternatively, using the maximum spacing distance of 5 nm from the same paper gives 1200 M2 proteins per virion; clearly, small differences in measurements cause large changes in the estimated M protein occurrence.

A literature review in early 2020 by Bar-On and colleagues reported that each virion may have as many as 2000 M proteins (unit nature, monomer or otherwise, not specified), 1000 copies of the N protein, 100 Spike protein units (trimers), and 20 copies of the E protein (unit type not specified) (5).

An electron tomography study of MHV found that M proteins had a preferred spacing of around 6.5 nm, with an average of around 10 Spike protein units per virus (6). The MHV virion is similar in size and shape to SARS-CoV-2, with an average diameter of 85 nm giving a surface area of 22698.01 nm2 (6). The spacing of 6.5 nm, together with the 2.8 nm width of the M protein, gives a spacing circle about each M protein with an area of 67.93 nm2; this is defined by a radius of 6.5 nm/2 + 2.8 nm/2 = 3.25 nm + 1.4 nm = 4.65 nm. Dividing the total surface area by the spacing circle area gives an indication of how many M proteins occupy the virion surface, in this case around 330 M proteins.
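The spacing-circle estimates quoted in this section (radius 2.125 nm on a 90 nm virion, and radius 4.65 nm on an 85 nm virion) follow from dividing the surface area of a sphere by the area of one spacing circle. The helper below is a hypothetical illustration of that division, not code from either study.

```python
import math

def sphere_area_nm2(diameter_nm: float) -> float:
    """Surface area of a sphere, 4*pi*r^2, written in terms of the diameter."""
    return math.pi * diameter_nm ** 2

def spacing_circle_estimate(diameter_nm: float, circle_radius_nm: float) -> float:
    """Number of spacing circles of the given radius that fit the sphere's area."""
    circle_area_nm2 = math.pi * circle_radius_nm ** 2
    return sphere_area_nm2(diameter_nm) / circle_area_nm2

# SARS-CoV-1 figures: 90 nm virion, spacing circle of radius 2.125 nm
sars1_m2 = spacing_circle_estimate(90.0, 2.125)

# MHV figures: 85 nm virion, radius = half the spacing plus half the M protein width
mhv_radius_nm = 6.5 / 2 + 2.8 / 2
mhv_m = spacing_circle_estimate(85.0, mhv_radius_nm)

print(f"SARS-CoV-1: about {sars1_m2:.0f} M2 units")
print(f"MHV       : about {mhv_m:.0f} M units (radius {mhv_radius_nm:.2f} nm)")
```

Because pi cancels, the estimate reduces to (diameter / radius) squared, which is why a modest change in the assumed spacing shifts the count so strongly.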

More recently, a detailed analysis of cryo-electron tomography (cryo-ET) images of SARS-CoV-2 virions has shown the number of Spike proteins to be notably lower than initially thought. Yao et al. analysed more than 2000 virions, estimating 26 Spike protein units per virion, with a likely figure of 30-35 units per virion (7).

A study in Nature by Zunlong Ke et al., which informed our model early in its inception, estimated that each virion contains 24 ± 9 Spike protein trimers (1). They inspected 179 virions using cryo-electron microscopy, identifying 4104 S protein trimers and 116 needle-like structures thought to be post-fusion S proteins. Applying the ratios reported by Neuman in a 2006 paper on SARS-CoV-1, 1S3:8M2:4N to 1S3:12.5M2:4N (4), to this estimate of 24 Spike proteins gives around 192 to 300 M protein dimers per virion.
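The range of 192 to 300 is a direct scaling of the Spike count by the two M2-per-Spike ratios. A minimal sketch, with an illustrative function name and the ratios taken from the paragraph above:

```python
def m_dimer_range(spike_trimers: float,
                  ratio_low: float = 8.0,
                  ratio_high: float = 12.5) -> tuple[float, float]:
    """Scale a Spike trimer count by the lower and upper M2-per-Spike ratios."""
    return spike_trimers * ratio_low, spike_trimers * ratio_high

low, high = m_dimer_range(24)
print(f"{low:.0f} to {high:.0f} M protein dimers per virion")
```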

The study by Ke and colleagues also noted details about the character of the Spike proteins. Around 97% were in the pre-fusion state, with 3% presenting the post-fusion needle-like structure (1,7). Among a smaller subset of 3854 pre-fusion Spike proteins where individual monomers could be assigned, 31% were in an all-closed conformation, 55% presented a single open conformation, and around 14% presented a double open conformation (1).

Additionally, these more recent studies have identified flexibility in the stem of the S protein (1,8), presenting a roll and yaw of up to 60 degrees. This was edited into our models at suspected pivot points between the alpha helix segments of the stem model. The distribution of the S proteins appears random, although they are separated enough to avoid clashing. As the surface of our virion model had an assigned address system, we could distribute the S proteins randomly using a random number generator. When Spikes were placed in tiles directly adjoining each other, the second S protein was moved by minus 1 tile to reflect the fact that the structures disfavour close proximity.
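One way to picture the random placement is the sketch below. It is not the script used for the model: it assumes that only sequentially numbered addresses count as adjoining (tiles in the rings above and below are ignored), and it repeats the minus-one shift until the tile is clear. The function name and seed are illustrative.

```python
import random

N_ADDRESSES = 480  # sequential surface addresses, numbered 1..480
N_SPIKES = 25      # S proteins to place

def place_spikes(n_addresses=N_ADDRESSES, n_spikes=N_SPIKES, seed=None):
    """Return a sorted list of addresses that hold an S protein."""
    rng = random.Random(seed)
    occupied = set()

    def clashes(addr):
        # Assumption: only sequentially numbered neighbours count as adjoining.
        return (addr in occupied
                or (addr - 1) in occupied
                or (addr + 1) in occupied)

    while len(occupied) < n_spikes:
        addr = rng.randint(1, n_addresses)
        while clashes(addr):
            # Shift by minus one tile, wrapping from address 1 to the last address.
            addr = addr - 1 if addr > 1 else n_addresses
        occupied.add(addr)
    return sorted(occupied)

spike_addresses = place_spikes(seed=2021)
print(spike_addresses)
```

Fixing the seed makes a given layout reproducible; a different seed gives a different but equally valid arrangement.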

The complete virus model populated randomly with 25 S proteins.

Taking all of this together, we decided to include 25 S proteins in our virion model: two in the post-fusion state, two in the double-open conformation, 14 in the single-open conformation, and 7 in the closed conformation. The widely differing estimates of between 200 and 2000 M protein dimers per average virion make this aspect of the virion difficult to model with any precision. Given that our model has only 480 addresses in which to place proteins, we opted to populate all addresses with an M protein pair by default. The higher-end estimates for the number of M proteins on a single virion seem unlikely, given the proteomic expression studies suggesting the N protein is the most abundantly expressed at around 1200 units (3). Conversely, evidence of lattice-like formation among M proteins suggests protein-protein interactions among these members. Our M protein model is around 6 nm in diameter, and each M protein is placed around 4-6 nm from the next. We suggest our final total of 450 M protein dimers per virion (the 480 addresses less those occupied by the 25 S and 5 E proteins) is a reasonable, if low, estimate for the real-world virus. Finally, a number or ratio for E proteins per virion was reported or estimated on only a couple of occasions in the literature (5). As the E protein is known to occur in only a small proportion on the mature virion, we placed 5 E protein structural units in our model virion.
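The final composition can be tallied as a quick consistency check. The sketch assumes each S or E protein takes the place of the default M protein dimer at its address; the names are illustrative.

```python
TOTAL_ADDRESSES = 480

# S protein states chosen for the model, from the paragraph above.
spike_states = {
    "post-fusion": 2,
    "double-open": 2,
    "single-open": 14,
    "closed": 7,
}
n_spikes = sum(spike_states.values())
n_envelope = 5

# Every address not holding an S or E protein keeps its M protein dimer.
n_m_dimers = TOTAL_ADDRESSES - n_spikes - n_envelope

print(f"S proteins: {n_spikes}, E proteins: {n_envelope}, M dimers: {n_m_dimers}")
```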

Map of surface location addresses with placed S and E proteins.

We composed a map of the virion model's surface. This shows the 480 addresses that define the model's surface, annotated with the locations of Spike and E proteins; all unannotated addresses possess an M protein dimer. Our virion model was rendered in 21 or, alternatively, 78 files. As the model contains over 7 million atoms, loading the entire model simultaneously will require a very powerful personal computer. The files are available in both PDB and Discovery Studio file formats. BIOVIA's Discovery Studio (9) is one of the most powerful free molecular editors available, and we recommend using this program.

As the final goal of this project was to create a real-world 3D-printed model of the virion, we used multiple programs over the course of several months to convert and assemble our model as a printable STL file.

References:

  1. Ke, Z, Oton, J, Qu, K et al. 2020, ‘Structures and distributions of SARS-CoV-2 spike proteins on intact virions.’, Nature, vol. 588, pp. 498–502, https://doi.org/10.1038/s41586-020-2665-2.

  2. Varga, Z, Flammer, AJ, Steiger, P, Haberecker, M, et al. 2020, ‘Electron microscopy of SARS-CoV-2: a challenging task – Authors' reply.’, The Lancet [Correspondence], vol. 395, no. 10238, p. e100, https://doi.org/10.1016/S0140-6736(20)31185-5.

  3. Bezstarosti, K, Lamers, MM, Haagmans, BL, Demmers, JAA 2020, ‘Targeted Proteomics for the Detection of SARS-CoV-2 Proteins.’, [preprint], bioRxiv 2020.04.23.057810; doi: https://doi.org/10.1101/2020.04.23.057810.

  4. Neuman, BW, Adair, BD, Yoshioka, C, Quispe, JD, et al. 2006, ‘Supramolecular Architecture of Severe Acute Respiratory Syndrome Coronavirus Revealed by Electron Cryomicroscopy.’, J. Virol., vol. 80, no. 16, pp. 7918-28, doi: 10.1128/JVI.00645-06.

  5. Bar-On, YM, Flamholz, A, Phillips, R, Milo, R 2020, ‘SARS-CoV-2 (COVID-19) by the numbers.’, eLife, vol. 9, no. e57309, doi:10.7554/eLife.57309.

  6. Bárcena, M, Oostergetel, GT, Bartelink, W, Faas, FGA, et al. 2009, ‘Cryo-electron tomography of mouse hepatitis virus: Insights into the structure of the coronavirion.’, Proceedings of the National Academy of Sciences, vol. 106, no. 2, pp. 582-587, DOI: 10.1073/pnas.0805270106.

  7. Yao, H, Song, Y, Chen, Y, Wu, N, et al. 2020, ‘Molecular Architecture of the SARS-CoV-2 Virus.’, Cell, vol. 183, no. 3, pp. 730-738.e13, https://doi.org/10.1016/j.cell.2020.09.018.

  8. Turoňová, B, Sikora, M, Schürmann, C, Hagen, WJH, et al. 2020, ‘In situ structural analysis of SARS-CoV-2 spike reveals flexibility mediated by three hinges.’, Science, pp. 203-208.

  9. Dassault Systèmes 2021, Free Download: BIOVIA Discovery Studio Visualizer, Dassault Systèmes, viewed 22 April 2021, <https://discover.3ds.com/discovery-studio-visualizer-download>.