Telomere-to-telomere genome assembly of bitter melon (Momordica charantia L. var. abbreviata Ser.) reveals fruit development, composition and ripening genetic characteristics
Anzhen Fu, Yanyan Zheng, Jing Guo

et al.

Horticulture Research, Journal Year: 2022, Volume and Issue: 10(1)

Published: Oct. 11, 2022

Momordica charantia L. var. abbreviata Ser. (Mca), known as bitter gourd or bitter melon, is a variety with medicinal value and belongs to the Cucurbitaceae family. In view of the lack of genomic information on other species, and to promote Mca research, we assembled a 295.6-Mb telomere-to-telomere (T2T) high-quality genome with six gap-free chromosomes after Hi-C correction. This genome is anchored to 11 chromosomes, which is consistent with the karyotype information, and comprises 98 contigs (N50 of 25.4 Mb) and 95 scaffolds (… Mb). The genome harbors 19 895 protein-coding genes, and 45.59% of the genome constitutes predicted repeat sequences. Synteny analysis revealed variations involved in fruit quality during the divergence of bitter gourd. In addition, assay for transposase-accessible chromatin by high-throughput sequencing and metabolic analysis showed that momordicosides and other substances are characteristic of Mca pulp. A combined transcriptomic and metabolomic analysis revealed the mechanisms of pigment accumulation and cucurbitacin biosynthesis in fruit peels, providing fundamental molecular information for further research on fruit ripening. This report provides a new genetic resource for Mca genomic studies and contributes additional insights into Cucurbitaceae phylogeny.

Language: English

NextDenovo: an efficient error correction and accurate assembly tool for noisy long reads
Jiang Hu, Zhuo Wang, Zongyi Sun

et al.

Genome biology, Journal Year: 2024, Volume and Issue: 25(1)

Published: April 26, 2024

Long-read sequencing data, particularly those derived from the Oxford Nanopore platform, tend to exhibit high error rates. Here, we present NextDenovo, an efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. We apply NextDenovo to assemble 35 diverse human genomes from around the world using long-read data. These genomes allow us to identify the landscape of segmental duplication and gene copy number variation in modern human populations. The use of NextDenovo should pave the way for population-scale

Language: English

Citations

148

A complete telomere-to-telomere assembly of the maize genome
Jian Chen, Zijian Wang, Kaiwen Tan

et al.

Nature Genetics, Journal Year: 2023, Volume and Issue: 55(7), P. 1221 - 1231

Published: June 15, 2023

A complete telomere-to-telomere (T2T) finished genome has been the long pursuit of genomic research. Through generating deep coverage ultralong Oxford Nanopore Technology (ONT) and PacBio HiFi reads, we report here a complete genome assembly of maize with each chromosome entirely traversed in a single contig. The 2,178.6 Mb T2T Mo17 genome, with a base accuracy of over 99.99%, unveiled the structural features of all repetitive regions of the genome. There were several super-long simple-sequence-repeat arrays having consecutive thymine–adenine–guanine (TAG) tri-nucleotide repeats up to 235 kb. The assembly of the entire nucleolar organizer region of the 26.8 Mb array with 2,974 45S rDNA copies revealed the enormously complex patterns of rDNA duplications and transposon insertions. Additionally, complete assemblies of all ten centromeres enabled us to precisely dissect the repeat compositions of both CentC-rich and CentC-poor centromeres. The complete Mo17 genome represents a major step forward in understanding the complexity of the highly recalcitrant repetitive regions of higher plant genomes.

Language: English

Citations

146

The complete reference genome for grapevine (Vitis vinifera L.) genetics and breeding
Xiaoya Shi, Shuo Cao, Xu Wang

et al.

Horticulture Research, Journal Year: 2023, Volume and Issue: 10(5)

Published: April 4, 2023

Grapevine is one of the most economically important crops worldwide. However, the previous versions of the grapevine reference genome typically consist of thousands of fragments with missing centromeres and telomeres, limiting the accessibility of the repetitive sequences, the centromeric and telomeric regions, and the study of inheritance of important agronomic traits in these regions. Here, we assembled a telomere-to-telomere (T2T) gap-free reference genome for the cultivar PN40024 using PacBio HiFi long reads. The T2T reference genome (PN_T2T) is 69 Mb longer, with 9018 more genes identified, than the 12X.v0 version. We annotated 67% repetitive sequences, 19 centromeres and 36 telomeres, and incorporated gene annotations of previous versions into the PN_T2T assembly. We detected a total of 377 gene clusters, which showed associations with complex traits, such as aroma and disease resistance. Even though PN40024 derives from nine generations of selfing, we still found nine genomic hotspots of heterozygous sites associated with biological processes, such as the oxidation-reduction process and protein phosphorylation. The fully annotated complete reference genome therefore constitutes an important resource for grapevine genetic studies and breeding programs.

Language: English

Citations

116

An efficient error correction and accurate assembly tool for noisy long reads
Jiang Hu, Zhuo Wang, Zongyi Sun

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: March 12, 2023

Long read sequencing data, particularly those derived from the Oxford Nanopore (ONT) platform, tend to exhibit a high error rate. Here, we present NextDenovo, a highly efficient error correction and assembly tool for noisy long reads, which achieves a high level of accuracy in genome assembly. NextDenovo can rapidly correct reads; these corrected reads contain fewer errors than those from other comparable tools and are characterized by fewer chimeric alignments. We applied NextDenovo to assemble high quality reference genomes of 35 diverse humans from across the world using ONT long read sequencing data. Based on these de novo genome assemblies, we were able to identify the landscape of segmental duplications and gene copy number variation in the modern human population. The use of the NextDenovo program should pave the way for population-scale long-read assembly, thereby facilitating the construction of human pan-genomes,

Language: English

Citations

103

A near‐complete genome assembly of Brassica rapa provides new insights into the evolution of centromeres
Lei Zhang, Jianli Liang, Haixu Chen

et al.

Plant Biotechnology Journal, Journal Year: 2023, Volume and Issue: 21(5), P. 1022 - 1032

Published: Jan. 23, 2023

Brassica rapa comprises many important cultivated vegetables and oil crops. However, Chiifu v3.0, the current B. rapa reference genome, still contains hundreds of gaps. Here, we presented a near-complete genome assembly of B. rapa Chiifu v4.0, which was 424.59 Mb with only two gaps, using Oxford Nanopore Technology (ONT) ultralong-read sequencing and Hi-C technologies. The new assembly contains 12 contigs, with a contig N50 of 38.26 Mb. Eight of the ten chromosomes were entirely reconstructed in a single contig from telomere to telomere. We found that the centromeres were mainly invaded by ALE and CRM long terminal repeats (LTRs). Moreover, there is a high divergence of centromere length and sequence among B. rapa genomes. We further found that the centromeres are enriched for Copia LTRs invaded at 0.14 MYA on average, while the pericentromeres are enriched for Gypsy LTRs invaded at 0.51 MYA on average. These results indicated different invasion mechanisms of LTRs between the two structures. In addition, a novel repetitive sequence, PCR630, was identified in B. rapa. Overall, the near-complete genome assembly, B. rapa Chiifu v4.0, offers valuable tools for genomic and genetic studies of Brassica species and provides new insights into the evolution of centromeres.

Language: English

Citations

54

The telomere-to-telomere genome of Fragaria vesca reveals the genomic evolution of Fragaria and the origin of cultivated octoploid strawberry
Yuhan Zhou, Jinsong Xiong, Ziqiang Shu

et al.

Horticulture Research, Journal Year: 2023, Volume and Issue: 10(4)

Published: Feb. 20, 2023

Fragaria vesca, commonly known as wild or woodland strawberry, is the most widely distributed diploid Fragaria species and is native to Europe and Asia. Because of its small plant size, low heterozygosity, and relative ease of genetic transformation, F. vesca has been a model plant for fruit research since the publication of its Illumina-based genome in 2011. However, its genomic contribution to octoploid cultivated strawberry remains a long-standing question. Here, we de novo assembled and annotated a telomere-to-telomere, gap-free genome of F. vesca 'Hawaii 4', with all seven chromosomes assembled into single contigs, providing the highest completeness and assembly quality to date. The gap-free genome is 220 785 082 bp in length and encodes 36 173 protein-coding gene models, including 1153 newly annotated genes. All 14 telomeres and seven centromeres were annotated within the seven chromosomes. Among the three previously recognized wild diploid strawberry ancestors, F. vesca, F. iinumae, and F. viridis, phylogenomic analysis showed that F. vesca and F. viridis are the ancestors of the cultivated octoploid strawberry F. × ananassa, and that F. vesca is its closest relative. Three subgenomes of F. × ananassa belong to the F. vesca group, and one is sister to F. viridis. We anticipate that this high-quality, telomere-to-telomere, gap-free F. vesca genome, combined with our phylogenomic inference of the origin of cultivated strawberry, will provide insight into the genomic evolution of Fragaria and facilitate strawberry genetics and molecular breeding.

Language: English

Citations

52

Near telomere-to-telomere genome of the model plant Physcomitrium patens
Guiqi Bi, Shijun Zhao, Jiawei Yao

et al.

Nature Plants, Journal Year: 2024, Volume and Issue: 10(2), P. 327 - 343

Published: Jan. 26, 2024

Language: English

Citations

36

Cepharanthine analogs mining and genomes of Stephania accelerate anti-coronavirus drug discovery
Liang Leng, Zhichao Xu, Bixia Hong

et al.

Nature Communications, Journal Year: 2024, Volume and Issue: 15(1)

Published: Feb. 20, 2024

Cepharanthine is a secondary metabolite isolated from Stephania. It has been reported to have anti-coronavirus activities, including against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Here, we assemble three Stephania genomes (S. japonica, S. yunnanensis, and S. cepharantha), propose the cepharanthine biosynthetic pathway, and assess the antiviral potential of compounds involved in the pathway. Among the three genomes, one has a near telomere-to-telomere assembly with one remaining gap, and the other two have chromosome-level assemblies. Following biosynthetic gene mining and metabolomics analysis, we identify seven cepharanthine analogs that have broad-spectrum anti-coronavirus activities, including against SARS-CoV-2, Guangxi pangolin-CoV (GX_P2V), swine acute diarrhoea syndrome coronavirus (SADS-CoV), and porcine epidemic diarrhea virus (PEDV). We also show that two other genera, Nelumbo and Thalictrum, can produce cepharanthine analogs, and thus have potential for antiviral compound discovery. Results generated from this study could accelerate anti-coronavirus drug discovery.

Language: English

Citations

35

Technology-enabled great leap in deciphering plant genomes
Lingjuan Xie, Xiaojiao Gong, Kun Yang

et al.

Nature Plants, Journal Year: 2024, Volume and Issue: 10(4), P. 551 - 566

Published: March 20, 2024

Language: English

Citations

33

NextPolish2: A Repeat-aware Polishing Tool for Genomes Assembled Using HiFi Long Reads
Jiang Hu, Zhuo Wang, Fan Liang

et al.

Genomics Proteomics & Bioinformatics, Journal Year: 2024, Volume and Issue: 22(1)

Published: Jan. 4, 2024

The high-fidelity (HiFi) long-read sequencing technology developed by PacBio has greatly improved the base-level accuracy of genome assemblies. However, these assemblies still contain base-level errors, particularly within the error-prone regions of HiFi long reads. Existing genome polishing tools usually introduce overcorrections and haplotype switch errors when correcting errors in genomes assembled from HiFi long reads. Here, we describe an upgraded genome polishing tool, NextPolish2, which can fix base errors remaining in those “highly accurate” genomes assembled from HiFi long reads without introducing excessive overcorrections and haplotype switch errors. We believe that NextPolish2 is of great significance for further improving the accuracy of telomere-to-telomere (T2T) genomes. NextPolish2 is freely available at https://github.com/Nextomics/NextPolish2.

Language: English

Citations

22