New Evidence for SARS-CoV-2 Lab Origins

“Endonuclease fingerprint indicates a synthetic origin of SARS-CoV-2,” a preprint by Valentin Bruttel, Alex Washburne and Antonius VanDongen, is probably the most important research to date on the artificial origins of SARS-2. Washburne has posted a Twitter summary here, and a reader-friendly Substack writeup here.

As Ralph Baric and his colleagues explained way back in 2017, coronaviruses have very long genomes, which “complicate efficient engineering.” If you want to tinker with their genes, you need to create a DNA copy of the RNA genome, because DNA is more stable and easier to fiddle with. To do this, Baric recommended using specific restriction enzymes, which cut the DNA copy into sections at specific genetic sequences called recognition sites. These shorter DNA building-block sequences can then be individually manipulated or switched out, and then ultimately joined to each other at their “sticky ends” to build the full DNA genome, which can then be transcribed back into RNA. Unless you specifically decide to remove them, though, the restriction sites remain in the RNA sequence, like seams in cloth. Naturally occurring coronaviruses also have restriction sites, but these tend to occur more sporadically. Scientists like Baric therefore preferred to remove some of these natural restriction sites and to add others of their own, so they’d only have to work with a few DNA fragments of roughly equal length. They might also prefer to insert seams around genetic sequences of particular interest. These would make it possible to swap out the genetic code at these crucial sites – to see, for example, how pathogenic a wild-type SARS-2 with an Omicron spike might be.
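To make the idea of a recognition site concrete, here is a minimal Python sketch that locates the sites for the two enzymes discussed in this post, BsaI (GGTCTC) and BsmBI (CGTCTC), on either strand of a sequence. The function names and the toy sequence are made up for illustration; this is not the preprint authors' code, and it only reports where the recognition motif starts, not where the enzyme actually cuts.

```python
# Recognition sequences for the two type IIS enzymes named in the post.
RECOGNITION_SITES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(genome, motif):
    """Return sorted 0-based start positions of `motif` on either strand.

    A site on the opposite strand appears in `genome` as the reverse
    complement of the motif, so both patterns are searched.
    """
    genome = genome.upper()
    positions = set()
    for pattern in {motif, reverse_complement(motif)}:
        start = genome.find(pattern)
        while start != -1:
            positions.add(start)
            start = genome.find(pattern, start + 1)
    return sorted(positions)


# Toy sequence invented for illustration; not real viral data.
toy = "AAAGGTCTCAAAAGAGACGAAACGTCTCAAA"

if __name__ == "__main__":
    for enzyme, motif in RECOGNITION_SITES.items():
        print(enzyme, find_sites(toy, motif))
```

Running the same search over a real genome (as a DNA string) would give the list of seam positions that the rest of this post is about.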

Comparison against a wide variety of other virus genomes reveals that the distribution of these seams, or recognition sites, in SARS-2 is highly anomalous for a natural virus, and totally normal for a synthetic one. That is, SARS-2 has all of its restriction sites exactly where a scientist would find it most convenient to put them, and not where you’d expect them to occur naturally.
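One simple way to put a number on "evenly spaced" is to cut the genome at each site and look at the longest resulting fragment as a fraction of the whole genome. The sketch below does that for two invented sets of site positions on a hypothetical 30,000-base genome. It is an illustration under simplifying assumptions (a linear genome, cuts placed at the listed positions), not a reproduction of the preprint's statistical analysis.

```python
def fragment_lengths(site_positions, genome_length):
    """Lengths of the pieces left after cutting a linear genome at each site."""
    cuts = sorted(set(site_positions))
    boundaries = [0] + cuts + [genome_length]
    # Skip zero-length pieces that would arise from a cut at either end.
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


def longest_fragment_fraction(site_positions, genome_length):
    """Longest fragment as a fraction of the genome; smaller means more even."""
    return max(fragment_lengths(site_positions, genome_length)) / genome_length


# Invented positions for illustration only; not taken from any real genome.
GENOME_LENGTH = 30_000
evenly_spaced = [5_000, 10_000, 15_000, 20_000, 25_000]
clustered = [1_000, 1_500, 2_000, 2_500, 3_000]

if __name__ == "__main__":
    print("even:     ", longest_fragment_fraction(evenly_spaced, GENOME_LENGTH))
    print("clustered:", longest_fragment_fraction(clustered, GENOME_LENGTH))
```

With the same number of sites, the evenly spaced layout leaves no fragment longer than a sixth of the genome, while the clustered layout leaves one fragment covering most of it.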

Note the evenly spaced BsaI (pink) and BsmBI (green) restriction sites in the SARS-2 genome, as opposed to the more random occurrence in natural coronaviruses.

There’s little point in me summarising the authors’ conclusions any further – Washburne in particular has done a great job of making his technical findings accessible to a wide audience, and I recommend that you read his Twitter summary and his Substack post.

Here, I just want to emphasise two points. The first is the fact that the crucial receptor binding domain and furin cleavage site of the SARS-2 genome are bracketed by two BsaI restriction sites. In the closest SARS-2 relatives, this portion of the genome is further divided by other BsaI and BsmBI restriction sites. It looks like someone has removed them here to create this clean and undivided sequence. This could be read as circumstantial evidence that SARS-2 is the product of a research programme focused at least partly on tinkering with this precise part of the genome.

The other point is the likely deliberate decision to leave the restriction sites in place, even though removing them from the final viral product would have been trivial. This suggests that SARS-2 was the subject of ongoing gain-of-function experiments at the moment it escaped, with all its Frankenstein stitches still showing, to facilitate further cutting and pasting operations.