Semi-automated assembly of high-quality diploid human reference genomes DOI Creative Commons
Erich D. Jarvis, Giulio Formenti, Arang Rhie

et al.

Nature, Year: 2022, Volume: 611(7936), pp. 519-531

Published: Oct. 19, 2022

The current human reference genome, GRCh38, represents over 20 years of effort to generate a high-quality assembly, which has benefitted society 1,2. However, it still has many gaps and errors, and does not represent a biological genome as it is a blend of multiple individuals 3,4. Recently, a high-quality telomere-to-telomere reference, CHM13, was generated with the latest long-read technologies, but it was derived from a hydatidiform mole cell line with a nearly homozygous genome 5. To address these limitations, the Human Pangenome Reference Consortium formed with the goal of creating high-quality, cost-effective, diploid genome assemblies for a pangenome reference that represents human genetic diversity 6. Here, in our first scientific report, we determined which combination of current genome sequencing and assembly approaches yield the most complete and accurate diploid genome assembly with minimal manual curation. Approaches that used highly accurate long reads and parent–child data with graph-based haplotype phasing during assembly outperformed those that did not. Developing a combination of the top-performing methods, we generated our first high-quality diploid reference assembly, containing only approximately four gaps per chromosome on average, with most chromosomes within ±1% of the length of CHM13. Nearly 48% of protein-coding genes have non-synonymous amino acid changes between haplotypes, and centromeric regions showed the highest diversity. Our findings serve as a foundation for assembling near-complete diploid human genomes at scale for a pangenome reference to capture global genetic variation from single nucleotides to structural rearrangements.

Language: English

Segmental duplications and their variation in a complete human genome DOI
Mitchell R. Vollger, Xavi Guitart, Philip C. Dishuck

et al.

Science, Year: 2022, Volume: 376(6588)

Published: March 31, 2022

Despite their importance in disease and evolution, highly identical segmental duplications (SDs) are among the last regions of the human reference genome (GRCh38) to be fully sequenced. Using a complete telomere-to-telomere human genome (T2T-CHM13), we present a comprehensive view of human SD organization. SDs account for nearly one-third of the additional sequence, increasing the genome-wide estimate from 5.4 to 7.0% [218 million base pairs (Mbp)]. An analysis of 268 human genomes shows that 91% of the previously unresolved T2T-CHM13 SD sequence (68.3 Mbp) better represents human copy number variation. Comparing long-read assemblies (

Language: English

Cited by

251

Retrospective detection of asymptomatic monkeypox virus infections among male sexual health clinic attendees in Belgium DOI Creative Commons
Irith De Baetselier, Christophe Van Dijck, Chris Kenyon

et al.

Nature Medicine, Year: 2022, Volume: 28(11), pp. 2288-2292

Published: Aug. 12, 2022

The magnitude of the 2022 multi-country monkeypox virus (MPXV) outbreak has surpassed any preceding outbreak. It is unclear whether asymptomatic or otherwise undiagnosed infections are fuelling this epidemic. In this study, we aimed to assess whether undiagnosed infections occurred among men attending a Belgian sexual health clinic in May 2022. We retrospectively screened 224 samples collected for gonorrhea and chlamydia testing using an MPXV PCR assay and identified MPXV-DNA-positive samples from four men. At the time of sampling, one man had a painful rash, and three men had reported no symptoms. Upon clinical examination 21-37 days later, these three men were free of clinical signs, and they reported not having experienced any symptoms. Serology confirmed MPXV exposure in all three men, and MPXV was cultured from two cases. These findings show that certain cases of monkeypox remain undiagnosed and suggest that testing and quarantining of individuals reporting symptoms may not suffice to contain the outbreak.

Language: English

Cited by

240

The complete sequence of a human Y chromosome DOI
Arang Rhie, Sergey Nurk, Monika Čechová

et al.

Nature, Year: 2023, Volume: 621(7978), pp. 344-354

Published: Aug. 23, 2023

Language: English

Cited by

238

Telomere-to-telomere assembly of diploid chromosomes with Verkko DOI
Mikko Rautiainen, Sergey Nurk, Brian P. Walenz

et al.

Nature Biotechnology, Year: 2023, Volume: 41(10), pp. 1474-1482

Published: Feb. 16, 2023

Language: English

Cited by

215

Method of the year: long-read sequencing DOI Open Access

Vivien Marx

Nature Methods, Year: 2023, Volume: 20(1), pp. 6-11

Published: Jan. 1, 2023

Language: English

Cited by

205

GENESPACE tracks regions of interest and gene copy number variation across multiple genomes DOI Creative Commons
John T. Lovell, Avinash Sreedasyam, M. Eric Schranz

et al.

eLife, Year: 2022, Volume: 11

Published: Sept. 9, 2022

The development of multiple chromosome-scale reference genome sequences in many taxonomic groups has yielded a high-resolution view of the patterns and processes of molecular evolution. Nonetheless, leveraging information across multiple genomes remains a significant challenge in nearly all eukaryotic systems. These challenges range from studying the evolution of chromosome structure, to finding candidate genes for quantitative trait loci, to testing hypotheses about speciation and adaptation. Here, we present GENESPACE, which addresses these challenges by integrating conserved gene order and orthology to define the expected physical position of all genes across multiple genomes. We demonstrate this utility by dissecting presence-absence, copy-number, and structural variation at three levels of biological organization: spanning 300 million years of vertebrate sex chromosome evolution, across the diversity of the Poaceae (grass) plant family, and among 26 maize cultivars. The methods to build and visualize syntenic orthology in the GENESPACE R package offer a significant addition to existing gene family and synteny programs, especially in polyploid, outbred, and other complex genomes.

The genome is the complete DNA sequence of an individual. It is a crucial foundation for many studies in medicine, agriculture, and conservation biology. Advances in genetics have made it possible to rapidly sequence, or read out, the genome of many organisms. For closely related species, scientists can then do detailed comparisons, revealing similar genes with a shared past or a common role, but comparing more distantly related organisms remains difficult. One major challenge is that genes are often lost or duplicated over evolutionary time. One way to be more confident is to look at ‘synteny’, or how genes are organized or ordered within the genome. In some groups of species, synteny persists across millions of years of evolution. Combining sequence similarity with gene order could make comparisons between distantly related species more robust. To do this, Lovell et al. developed GENESPACE, a software that links similarities between DNA sequences to the order of genes in a genome. This allows researchers to visualize and explore related DNA sequences and determine whether genes have been lost or duplicated. To demonstrate the value of GENESPACE, Lovell et al. explored evolution in vertebrates and flowering plants. The software was able to highlight the shared sequences between the unique sex chromosomes in birds and mammals, and to track the positions of genes important in the evolution of grass crops including maize, wheat, and rice. Exploring the genetic code in this way could lead to a better understanding of the evolution of important sections of the genome. It might also allow scientists to find target genes for applications like crop improvement. The software has been designed to be easy to use, allowing them to make graphics and perform analyses with few programming skills.

Language: English

Cited by

196

Highly contiguous assemblies of 101 drosophilid genomes DOI Creative Commons
Bernard Kim, Jeremy Wang, Danny E. Miller

et al.

eLife, Year: 2021, Volume: 10

Published: July 19, 2021

Over 100 years of studies in Drosophila melanogaster and related species in the genus Drosophila have facilitated key discoveries in genetics, genomics, and evolution. While high-quality genome assemblies exist for several species in this group, they only encompass a small fraction of the genus. Recent advances in long-read sequencing allow high-quality genome assemblies for tens or even hundreds of species to be efficiently generated. Here, we utilize Oxford Nanopore sequencing to build an open community resource of genome assemblies for 101 lines of 93 drosophilid species encompassing 14 species groups and 35 sub-groups. The genomes are highly contiguous and complete, with an average contig N50 of 10.5 Mb and greater than 97% BUSCO completeness in 97/101 assemblies. We show that Nanopore-based assemblies are highly accurate in coding regions, particularly with respect to coding insertions and deletions. These assemblies, along with a detailed laboratory protocol and assembly pipelines, are released as a public resource and will serve as a starting point for addressing broad questions of genetics, ecology, and evolution at the scale of hundreds of species.

Language: English

Cited by

170

plotsr: visualizing structural similarities and rearrangements between multiple genomes DOI Creative Commons
Manish Goel, Korbinian Schneeberger

Bioinformatics, Year: 2022, Volume: 38(10), pp. 2922-2926

Published: April 14, 2022

Third-generation genome sequencing technologies have led to a sharp increase in the number of high-quality genome assemblies. This allows the comparison of multiple assembled genomes of individual species and demands new tools for visualising their structural properties. Here we present plotsr, an efficient tool to visualize structural similarities and rearrangements between multiple genomes. It can be used to compare genomes on chromosome level or to zoom in on any selected region. In addition, plotsr can augment the visualisation with regional identifiers (e.g. genes or genomic markers) or histogram tracks for continuous features (e.g. GC content or polymorphism density). plotsr is implemented as a python package and uses the standard matplotlib library for plotting. It is freely available under the MIT license at GitHub (https://github.com/schneebergerlab/plotsr) and bioconda (https://anaconda.org/bioconda/plotsr). Supplementary data are available at Bioinformatics online.

Language: English

Cited by

169

Long-read mapping to repetitive reference sequences using Winnowmap2 DOI
Chirag Jain, Arang Rhie, Nancy F. Hansen

et al.

Nature Methods, Year: 2022, Volume: 19(6), pp. 705-710

Published: April 1, 2022

Language: English

Cited by

165

NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads DOI Creative Commons
Jiang Hu, Zhuo Wang, Zongyi Sun

et al.

Genome Biology, Year: 2024, Volume: 25(1)

Published: April 26, 2024

Long-read sequencing data, particularly those derived from the Oxford Nanopore sequencing platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using Nanopore long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale long-read assembly using Nanopore long-read data.

Language: English

Cited by

159