Complete human genome T2T-CHM13v2.0 available in Galaxy

Although the Telomere-to-Telomere (T2T) Consortium reported a first version of its improved human genome in 2020, it took two more years of work to deliver what the scientific community had been eagerly awaiting ever since: the first complete, gap-free sequence of all 24 human chromosomes (including Y).


Graphical Abstract

Sequences resolved by the T2T-CHM13v2.0 reference genome. Source: T2T Consortium


The T2T-CHM13v2.0 reference genome, generated primarily by long-read sequencing, provides a significant improvement in the characterization of centromeric satellite repeats, transposable elements, and segmental duplications.

Compared with GRCh38, the previous human reference genome, it adds nearly 200 million base pairs of novel DNA sequence, revealing 2,880 genes with no assigned GRCh38 orthologs and nearly 2,000 candidate new genes. The filled gaps include the entire short arms of five human chromosomes and cover some of the most complex regions of the genome. In addition, the new version corrects thousands of structural errors and includes, for the first time, the complete sequence of the Y chromosome.

This genome is now available as a built-in indexed genome across the wide collection of mapping tools that Galaxy puts at your disposal (e.g. RNASTAR, HISAT2, BWA-MEM). Enjoy your research!
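If you prefer to check from a script whether a given Galaxy server offers the new build, a minimal sketch along the following lines may help. It assumes the BioBlend client library is installed and that the server returns its genomes as label/dbkey pairs. The search string "chm13", the helper names, and the server URL and API key are illustrative assumptions; the exact label and dbkey used for T2T-CHM13v2.0 may differ from server to server.

```python
def find_builds(genomes, needle):
    """Return (label, dbkey) pairs whose label or dbkey contains `needle`.

    `genomes` is expected to be an iterable of [label, dbkey] pairs.
    The match is case-insensitive.
    """
    needle = needle.lower()
    return [
        (label, dbkey)
        for label, dbkey in genomes
        if needle in label.lower() or needle in dbkey.lower()
    ]


def list_server_genomes(url, api_key):
    """Fetch the genome builds known to a Galaxy server via BioBlend."""
    # Imported lazily so find_builds() can be used without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url=url, key=api_key)
    # Assumption: this returns a list of [label, dbkey] pairs.
    return gi.genomes.get_genomes()


def main(url, api_key, needle="chm13"):
    """Print every build on the server that matches `needle` (hypothetical default)."""
    for label, dbkey in find_builds(list_server_genomes(url, api_key), needle):
        print(dbkey, label, sep="\t")
```

Calling `main()` with your own server URL and API key prints the matching builds, if any. In the Galaxy web interface the same information appears in the reference genome drop-down of each mapping tool.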


References:

[1] Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A. V., Mikheenko, A., … & Phillippy, A. M. (2022). The complete sequence of a human genome. Science, 376(6588), 44-53.

[2] Zahn, L. (2022). Filling the gaps. Science, 376(6588), 42-43.

[3] Gershman, A., Sauria, M. E., Hook, P. W., Hoyt, S. J., Razaghi, R., Koren, S., … & Timp, W. (2022). Epigenetic patterns in a complete human genome. Science, 376(6588).