Scientists have pieced together a new draft of the human genome that better captures humanity’s genetic diversity.
The new “pangenome” incorporates the DNA of 47 individuals from every continent except Antarctica and Oceania. The scientists involved say it will improve our ability to diagnose disease, discover new drugs and understand the genetic variants that lead to ill health or a particular physical trait.
Until now, geneticists have used a single human genome, largely based on one individual, as a standard reference map for the detection of genetic changes that cause disease. That approach has likely missed some of the genetic diversity among individuals and between populations around the world.
“This pangenome reference represents an incredible scientific achievement, providing an expanding view of humanity’s DNA blueprint, with significantly greater human diversity than previous reference sequences,” Eric Green, the director of the US National Human Genome Research Institute, which funded the project, told a news briefing.
“Having a high quality human pangenome reference that increasingly reflects the diversity of the human population will enable scientists and healthcare professionals to better understand how genomic variants influence health and disease, and move us towards a future in which genomic medicine benefits everyone,” Green said.
The pangenome, a digital amalgamation of sequences that can be used to compare, construct and study other human genome sequences, is still a draft. Researchers hope to include 350 people by the middle of 2024. The scientific milestone was detailed in several papers published Wednesday in Nature and its partner journals.
“You can imagine this as a new road map to drop off your kids at school. While you only take the same road every day, your neighbour might take a slightly different road on a side street,” said Benjamin Schwessinger, an associate professor at The Australian National University who wasn’t involved in the project.
“This new approach of a ‘pan-genome’ maps out these alternative routes that make us humans so distinct from each other,” Schwessinger said in a statement.
While each person’s genome varies only slightly (by about 0.4% on average compared with the next person), the human genome is massive, consisting of 3.2 billion base pairs end to end. This means that there are still many important genetic differences between individuals and populations around the world.
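To put those two figures together, here is a rough back-of-the-envelope calculation in Python. It simply multiplies the 0.4% average difference by the 3.2 billion base pairs quoted above; the variable and function names are illustrative only, and the result is an order-of-magnitude estimate rather than a figure reported by the researchers.

```python
# Figures quoted in the article; the names are illustrative assumptions.
GENOME_SIZE_BP = 3_200_000_000   # about 3.2 billion base pairs
AVERAGE_DIFFERENCE = 0.004       # about 0.4% between two people


def differing_base_pairs(genome_size_bp: int, fraction: float) -> int:
    """Estimate how many base pairs differ between two genomes."""
    return round(genome_size_bp * fraction)


estimate = differing_base_pairs(GENOME_SIZE_BP, AVERAGE_DIFFERENCE)
print(f"Roughly {estimate:,} base pairs differ on average")
```

On these numbers, a difference that sounds tiny in percentage terms still works out to nearly 13 million base pairs.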
The four building blocks of DNA are adenine (A), cytosine (C), guanine (G) and thymine (T). They form specific pairs, and the binding of these base pairs forms the structure of DNA.
Genomic variation can be small, consisting of differences of just one or a few DNA bases, or it can take the form of large structural variants that are 50 base pairs or longer. These larger structural variants can have important health implications, such as in the functioning of the immune system.
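The size cutoff described above can be sketched as a simple rule. The short Python example below is a minimal illustration, assuming only the 50-base-pair threshold mentioned in the article; the function name and category labels are hypothetical and are not taken from the pangenome project's own software.

```python
# Threshold taken from the article; everything else is an illustrative assumption.
STRUCTURAL_VARIANT_MIN_BP = 50


def classify_variant(length_bp: int) -> str:
    """Label a variant as small or structural based only on its length."""
    if length_bp < 1:
        raise ValueError("variant length must be at least 1 base pair")
    if length_bp >= STRUCTURAL_VARIANT_MIN_BP:
        return "structural variant"
    return "small variant"


# Hypothetical variant lengths, in base pairs.
for length in (1, 3, 49, 50, 12_000):
    print(length, "->", classify_variant(length))
```

Real variant-calling tools weigh far more than length, so this should be read only as a way of making the 50-base-pair boundary concrete.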
The new reference incorporates more diverse genetic sequences and adds 119 million base pairs to the library of 3.2 billion previously known base pairs that make up the human genome, deepening our understanding of human genetic diversity and making it more complete.
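For a sense of scale, the added sequence can be expressed as a share of the existing reference. The Python snippet below is a minimal sketch that uses only the two figures quoted above; the variable names are illustrative assumptions.

```python
# Figures quoted in the article; the names are illustrative assumptions.
KNOWN_BASE_PAIRS = 3_200_000_000   # previously known reference sequence
ADDED_BASE_PAIRS = 119_000_000     # newly added by the draft pangenome

added_share = ADDED_BASE_PAIRS / KNOWN_BASE_PAIRS
combined_total = KNOWN_BASE_PAIRS + ADDED_BASE_PAIRS

print(f"Added sequence as a share of the reference: {added_share:.1%}")
print(f"Combined total: {combined_total:,} base pairs")
```

On these figures, the new material amounts to a little under 4% of the previously known sequence.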
“The pangenome reveals the architecture of variation and how it affects genes and will have a major impact on genomics research,” said Benedict Paten, an associate professor of biomolecular engineering at the University of California, Santa Cruz, and associate director of the UC Santa Cruz Genomics Institute.
“It also reveals new biology … we’re getting a better picture for how some of the most complex regions of the genome vary. Until now, the composition of these fast evolving regions has largely been invisible to us,” Paten said.
The first draft of the human genome was released in 2001 and was only fully completed in 2022. It has been an invaluable tool for researchers, launching a new era for scientific discovery, technological innovation and genomic medicine, said Karen Miga, assistant professor in the biomolecular engineering department at the University of California, Santa Cruz and associate director of the UC Santa Cruz Genomics Institute.
“Understanding and cataloging these differences between genomes will allow us to understand how cells operate, their biology and how they function, as well as understanding genetic differences and how they contribute to understanding human disease,” Miga said.
The original human reference genome was predominantly based on anonymous volunteers who responded to an ad placed in the Buffalo Evening News on March 23, 1997, with one donor accounting for 70% of the sequence, according to the NHGRI.
The 47 anonymous individuals included in the first draft of the pangenome project were among those who participated in the 1000 Genomes Project, a catalog of common human genetic variation that was completed in 2015. The team is in the process of recruiting new individuals to represent some populations not included in the 1000 Genomes Project, particularly people of Middle Eastern and African ancestry.
Other projects aimed at broadening genomic databases have “often missed the mark in demonstrating respect” for communities in lower-income countries and indigenous people, who say their samples and data are being used to further the goals of scientists and institutions in rich countries, a Nature paper published last year on the pangenome project noted.
The team was keen to avoid any similar mistakes. Barbara Koenig, professor emeritus of medical anthropology and bioethics at the University of California, San Francisco, said ethical considerations and “the principle of justice” were a key part of the endeavor.