The human genome was first sequenced two decades ago by the Human Genome Project and biotech firm Celera Genomics. Researchers have spent the 20 years since filling in the incomplete parts of that genome. The most recent human reference genome, which was released in 2013 and which the scientists used as their reference, is still incomplete. Technological limitations left researchers struggling to work out how specific stretches of DNA fit together. About 8% of the genome was missing until now.
Scientists in the Telomere-to-Telomere (T2T) consortium, an international collaboration of around 30 institutions, published a preprint titled “The complete sequence of a human genome” on May 27, 2021, reporting a complete human genome.
This new version of the human genome sequence is dubbed “T2T-CHM13.” Compared with the 2013 version, it adds nearly 200 million bp of novel sequence containing 2,226 paralogous gene copies, 115 of which are protein-coding genes. The newly completed regions include all centromeric satellite arrays and the short arms of all five acrocentric chromosomes. This opens these complex regions of the genome to variational and functional studies for the first time.
This tremendous effort incorporated several cutting-edge technologies, including HiFi sequencing from PacBio (Pacific Biosciences in Menlo Park, California), to produce a gap-free, complete haploid human genome assembly based on a complete hydatidiform mole (CHM13). The goal was to create a novel resource with comprehensive, reliable genome data that avoids the gaps and errors that still mark the latest GRCh38 reference assembly. “The resulting T2T-CHM13 reference assembly removes a 20-year-old barrier that has hidden 8% of the genome from sequence-based analysis, including all centromeric regions and the entire short arms of five human chromosomes,” Nurk et al. report.
Instead of taking DNA from a living person, the researchers used a cell line derived from what’s known as a complete hydatidiform mole, a type of tissue that forms in humans when a sperm fertilizes an egg with no nucleus. The resulting cell has only one set of chromosomes, which is the father’s. This eliminates the need to distinguish between two sets of chromosomes from different people.
The team estimates that about 0.3% of the genome might contain errors, as they had trouble resolving a few regions on the chromosomes. Though there are no gaps, quality-control checks have proved challenging in those areas, according to genomics researcher Karen Miga at the University of California, Santa Cruz. And the sperm cell that formed the hydatidiform mole carried an X chromosome, so the researchers have not yet sequenced a Y chromosome, which typically triggers male biological development.
Source: Nature 594, 158-159 (2021). doi: https://doi.org/10.1038/d41586-021-01506-w
Publication reference: Nurk, S. et al. Preprint at bioRxiv https://doi.org/10.1101/2021.05.26.445798 (2021).