Chinese Covid-19 Gene Data Purged From NIH Database Triggers Concerns Amid Debate Over Virus Origins

The Chinese city of Wuhan has found itself at the center of spiraling conspiracy theories that the COVID-19 virus leaked from its virology lab and that Beijing authorities had allegedly been hiding the outbreak of the novel respiratory disease for as long as possible – something China has consistently denied.

The US National Institutes of Health erased gene sequences of early COVID-19 cases from its digital archive last year at the request of a Chinese researcher who had submitted the information in March 2020, according to the Wall Street Journal.

Chinese researchers sought to remove the data in June last year because updated information was to be posted to another unspecified database, writes the outlet.

In a Wednesday statement, the NIH confirmed that it deleted the sequences from the database, known as the SRA, as submitting investigators “hold the rights to their data and can request withdrawal of the data.”

“These SARS-CoV-2 sequences were submitted for posting in SRA in March 2020 and subsequently requested to be withdrawn by the submitting investigator in June 2020. The requestor indicated the sequence information had been updated, was being submitted to another database, and wanted the data removed from SRA to avoid version control issues,” NIH said.

The US medical research agency added that amid the flurry of theories circulating regarding the origins of the coronavirus, it “can’t speculate on motive beyond a submitter’s stated intentions.” The statement did not identify the scientist who made the request.

‘Obscured Sequences’

The scrubbing of the sequencing data is laid out in a new paper, yet to be peer reviewed, posted on preprint server bioRxiv on 22 June by Jesse Bloom, a virologist at the Fred Hutchinson Cancer Research Center in Seattle.

The paper says the missing data, retrieved by Bloom through Google Cloud, included sequences from virus samples collected in the Chinese city of Wuhan in January and February of 2020 from patients hospitalized with or suspected of having Covid-19.

The preprint says “the current study suggests that at least in one case, the trusting structures of science have been abused to obscure sequences relevant to the early spread of SARS-CoV-2 in Wuhan.”

Furthermore, processed forms of the same data had been included in a pre-print paper from Chinese scientists posted in March 2020 and later, upon peer review, published in the journal Small in June.

Bloom, who acknowledged it was a “hot button topic”, was reported by The Post as emphasising he was not accusing NIH of any wrongdoing. The researcher also went on Twitter to highlight the fact that the data was also taken down from a Chinese database.

“Certainly, the consequence of removing the sequences was to obscure their existence,” Bloom was cited by The Post as saying.

The removal of the sequences yielded “a somewhat skewed picture of viruses circulating in Wuhan early on… It suggests possibly one reason why we haven’t seen more of these sequences is perhaps there hasn’t been a wholehearted effort to get them out there,” Dr. Bloom was cited as saying.

Raging COVID-19 Origins Debate

The World Health Organisation (WHO) in late March issued a report on its mission to Wuhan, which also visited the now-famous lab, and deemed the possibility of the virus escaping from the lab “extremely unlikely.”

However, Bloom co-authored a letter published in May in the journal Science that criticised the WHO report.
The scientist called for a more thorough probe into two leading hypotheses of the origin of COVID-19. The first claims the SARS-CoV-2 virus entered the human population after escaping from a lab. The second, which has been gaining ever more traction of late, is that it jumped to humans naturally from infected animals, such as bats.

Bloom was cited as saying he realized that sequences had been scrubbed from NIH’s Sequence Read Archive database after reading an analysis by other investigators. In an attempt to locate the sequences himself, he said he spent hours scouring the internet for sources of the deleted sequences.

After he obtained and downloaded them, he said he contacted the NIH to query why they had been erased.
Dr. Bloom vowed he would “go through every early pre-print I can find about SARS-CoV-2 and see if it describes any data that isn’t in the databases.”

Amid the reports, concerns have been raised that scientists studying the origin of the coronavirus pandemic may have been denied crucial pieces of information.

“It makes us wonder if there are other sequences like these that have been purged,” Vaughn S. Cooper, a University of Pittsburgh evolutionary biologist, was quoted as saying.

A WHO official who collaborated with the international team that produced the March report on the origins of the virus said Bloom’s paper argued the case for more analysis of the earliest likely COVID-19 infections.

There has not been any comment on the report from either the Chinese researchers who initially submitted the sequences to the NIH database in March 2020 or China’s National Health Commission.

COVID-19 was reportedly first detected at the wet market in Wuhan, China, in December 2019. On 11 March 2020, the WHO declared COVID-19 a pandemic.

Last month, US President Joe Biden ordered intelligence agencies to prepare a report in 90 days that would put an end to the raging debate over whether the coronavirus spread to humans from animals naturally or whether it had resulted from a lab spill.
The latter claims, particularly espoused by ex-President Donald Trump, have been persistently denied by China.

Biden’s order followed a report in a Wall Street Journal article that several staff members of the Wuhan Institute of Virology sought hospital treatment in late 2019 with symptoms consistent with those observed in COVID-19 patients. The reports, which potentially upended the timeline of the pandemic, have been denied by the Chinese government.

by Svetlana Ekimenko