Journal of Genetics and Genomics, Journal Year: 2022, Volume and Issue: 49(5), P. 385 - 393
Published: March 8, 2022
Language: English
Trends in Plant Science, Journal Year: 2021, Volume and Issue: 27(4), P. 391 - 401
Published: Nov. 12, 2021
Language: English
Citations
212
Nature Genetics, Journal Year: 2023, Volume and Issue: 55(7), P. 1221 - 1231
Published: June 15, 2023
Abstract: A complete telomere-to-telomere (T2T) finished genome has been the long pursuit of genomic research. Through generating deep coverage ultralong Oxford Nanopore Technology (ONT) and PacBio HiFi reads, we report here a complete genome assembly of maize with each chromosome entirely traversed in a single contig. The 2,178.6 Mb T2T Mo17 genome, with a base accuracy of over 99.99%, unveiled the structural features of all repetitive regions of the genome. There were several super-long simple-sequence-repeat arrays having consecutive thymine–adenine–guanine (TAG) tri-nucleotide repeats up to 235 kb. The assembly of the entire nucleolar organizer region of the 26.8 Mb array with 2,974 45S rDNA copies revealed the enormously complex patterns of rDNA duplications and transposon insertions. Additionally, complete assemblies of all ten centromeres enabled us to precisely dissect the repeat compositions of both CentC-rich and CentC-poor centromeres. The complete Mo17 genome represents a major step forward in understanding the complexity of the highly recalcitrant repetitive regions of higher plant genomes.
Language: English
Citations
146
Genomics Proteomics & Bioinformatics, Journal Year: 2021, Volume and Issue: 20(1), P. 4 - 13
Published: Sept. 3, 2021
Abstract: Arabidopsis thaliana is an important and long-established model species for plant molecular biology, genetics, epigenetics, and genomics. However, the latest version of the reference genome still contains a significant number of missing segments. Here, we reported a high-quality and almost complete Col-0 genome assembly with two gaps (named Col-XJTU) by combining Oxford Nanopore Technologies ultra-long reads, Pacific Biosciences high-fidelity long reads, and Hi-C data. The total genome assembly size is 133,725,193 bp, introducing 14.6 Mb of novel sequences compared to the TAIR10.1 reference genome. All five chromosomes of the Col-XJTU assembly are highly accurate, with consensus quality (QV) scores > 60 (ranging from 62 to 68), which are higher than those of the TAIR10.1 reference (ranging from 45 to 52). We completely resolved chromosome (Chr) 3 and Chr5 in a telomere-to-telomere manner. Chr4 was completely resolved except for the nucleolar organizing regions, which comprise long repetitive DNA fragments. The Chr1 centromere (CEN1), reportedly around 9 Mb in length, is particularly challenging to assemble due to the presence of tens of thousands of CEN180 satellite repeats. Using cutting-edge sequencing data and novel computational approaches, we assembled a 3.8-Mb-long CEN1 and a 3.5-Mb-long CEN2. We also investigated the structure and epigenetics of centromeres. Four clusters of CEN180 monomers were detected, and the centromere-specific histone H3-like protein (CENH3) exhibited a strong preference for CEN180 Cluster 3. Moreover, we observed hypomethylation patterns in CENH3-enriched regions. We believe that this high-quality genome assembly, Col-XJTU, would serve as a valuable reference to better understand the global pattern of centromeric polymorphisms, as well as the genetic and epigenetic features in plants.
Language: English
Citations
134
Molecular Plant, Journal Year: 2022, Volume and Issue: 15(8), P. 1268 - 1284
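The consensus quality (QV) scores quoted in the abstract above are commonly reported on a Phred scale, where QV = -10 · log10(p) and p is the estimated per-base error probability. The following is a minimal sketch under that assumption; the function names are illustrative and are not taken from the cited work or from any assembly-evaluation tool.

```python
import math


def qv_to_error_rate(qv: float) -> float:
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate: float) -> float:
    """Convert a per-base error probability (0 < p <= 1) to a Phred-scaled QV."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


def bases_per_error(qv: float) -> float:
    """Expected number of bases per single error at a given QV."""
    return 1 / qv_to_error_rate(qv)


if __name__ == "__main__":
    # QV values in the range mentioned in the abstract, used only as inputs.
    for qv in (45, 52, 60, 62, 68):
        print(f"QV {qv}: about one error per {bases_per_error(qv):,.0f} bases")
```

Under this definition, every 10-point increase in QV corresponds to a tenfold drop in the estimated error rate, which is why a shift from the mid-40s to the 60s is a large change.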
Published: June 23, 2022
Watermelon, Citrullus lanatus, is the world's third largest fruit crop. Reference genomes with gaps and a narrow genetic base hinder functional genomics and genetic improvement of watermelon. Here, we report the assembly of a telomere-to-telomere gap-free genome of the elite watermelon inbred line G42 by incorporating high-coverage and accurate long-read sequencing data with multiple assembly strategies. All 11 chromosomes have been assembled into single-contig pseudomolecules without gaps, representing the highest completeness and assembly quality to date. The G42 reference genome is 369 321 829 bp in length and contains 24 205 predicted protein-coding genes, with all 22 telomeres and 11 centromeres characterized. Furthermore, we established a pollen-EMS mutagenesis protocol and obtained over 200 000 M1 seeds from G42. In a sampling pool, 48 monogenic phenotypic mutations, selected from 223 M1 and 78 M2 mutants with morphological changes, were confirmed. The average mutation density was 1 SNP/1.69 Mb and 1 indel/4.55 Mb per M1 plant, and 1 SNP/1.08 Mb and 1 indel/6.25 Mb per M2 plant. Taking advantage of the gap-free G42 genome, 8039 mutations from 32 plants sampled from M1 and M2 families were identified with 100% accuracy, whereas only 25% of the randomly selected mutations identified using the 97103v2 reference genome could be confirmed. Using this library and the gap-free genome, two genes responsible for elongated fruit shape and male sterility (ClMS1) were identified, both caused by a single base change from G to A. The validated gap-free genome and its EMS mutation library provide invaluable resources for functional genomics and genetic improvement of watermelon.
Language: English
Citations
123
Horticulture Research, Journal Year: 2023, Volume and Issue: 10(8)
Published: June 13, 2023
A high-quality genome is the basis for studies on functional, evolutionary, and comparative genomics. The majority of attention has been paid to the solution of complex chromosome structures and highly repetitive sequences, along with the emergence of a new 'telomere-to-telomere (T2T) assembly' era. However, the bioinformatic tools for the automatic construction and/or characterization of T2T genomes are limited. Here, we developed a user-friendly web toolkit, quarTeT, which currently includes four modules: AssemblyMapper, GapFiller, TeloExplorer, and CentroMiner. First, AssemblyMapper is designed to assemble phased contigs into a chromosome-level genome by referring to a closely related genome. Then, GapFiller would endeavor to fill all unclosed gaps in a given genome with the aid of additional ultra-long sequences. Finally, TeloExplorer and CentroMiner are applied to identify candidate telomere and centromere as well as their localizations on each chromosome. These four modules can be used alone or in combination with each other for T2T genome assembly and characterization. As a case study, by adopting the entire modular functions of quarTeT, we have achieved
Language: English
Citations
107
Cell, Journal Year: 2022, Volume and Issue: 185(15), P. 2828 - 2839
Published: May 27, 2022
Language: English
Citations
97
Molecular Plant, Journal Year: 2023, Volume and Issue: 16(8), P. 1232 - 1236
Published: Aug. 1, 2023
In 2005, the current commonly used rice reference genome (Oryza sativa ssp. japonica cv. Nipponbare) was initially released by the International Rice Genome Sequencing Project (International Rice Genome Sequencing Project, 2005). Thereafter, the reference genome was further updated in 2013 with an improved assembly (IRGSP-1.0) and gene annotations (MSU7, RAP-DB) (Kawahara et al., 2013; Sakai et al., 2013). In the past 10 years, this reference genome has been serving as one of the most important genetic resources for subsequent rice functional genomics efforts. Although several rice genomes have been assembled into gapless chromosomes with only 2–5 telomeres absent (Li et al., 2021; Song et al., 2021; Zhang et al., 2022), IRGSP-1.0 and its annotations still serve widely as the reference.
However, limitations sequencing technology intricate genomic organization led under-representation complex regions reference, leaving total 72 major gaps (including 19 telomeres), 167 minor gaps, 779 unknown bases estimated length ∼3% unsolved. To pursue complete foundational genome, we applied strategy that integrated Pacbio HiFi Oxford Nanopore Technology (ONT) ultra-long reads generate original contigs, which were then scaffolded onto chromosome-level support Hi-C dataset. Gap filling terminal extension conducted resolve remaining seven telomere region within scaffolds. All gap-closure supported uniform coverage ONT (Supplemental Figure 1). A large rDNA array identified beside short arm chromosome 9 nearly identical repeats 45S 2), artificially filled consecutive blocks reflecting their copy number (see supplemental materials methods). This captured 93.8% 93.9% containing full-length mapping, but should be treated model sequences. Following polishing employing Illumina PE (next-generation [NGS]) reads, produced T2T-NIP (version AGIS-1.0), all 12 24 resolved (Figure 1A). Multiple strategies evaluate accuracy completeness T2T-NIP. available primary data—including HiFi, ONT, NGS, Hi-C—were remapped high mapping rates >99.6% datasets except (93.1%). displayed across whole dataset because centromeres near two 1B). Chromatin immunoprecipitation (ChIP-seq) CENH3 antibody identify location 1A, Supplemental Table 1, 3). CentO-enriched also homology 155- 165-bp CentO satellite 1A 1), eight showed similar or consistent size previous report determined fluorescence situ hybridization (Cheng 2002Cheng Dong F. Langdon Buell C.R. Gu Blattner F.R. Functional are marked repeat centromere-specific retrotransposon.Plant Cell. 2002; 1691-1704https://doi.org/10.1105/tpc.003079Crossref (321) The consensus approximately error per 5 million (Q63), much higher 2). 
For content assessment, 99.88% BUSCO 1614 set 3), equal than previously reported 1747 ribosomal RNA (rRNA) genes T2T-NIP, whereas hundred IRGSP-1.0. 57 359 protein-coding 325 794 (51.1%) identified, both represent more Tables 4 5). array, 1022 annotated transcriptome data 6). Among 314 gap-filling excluding 142 confirmed expressed tissue-specific 4). With achieved 385.7 base pairs (Mbp), including abundant improvements compared prior 4–6). Compared IRGSP-1.0, contains 12.5 Mbp newly sequence, arrays (33.2%), pericentromeric centromeric (32.1%), (27.1%), subtelomeric (5.1%), necessary fundamental cellular processes 1C–1E). Some largest covered nine chromosomes, telomeric repetitive three represented unresolved sequences 7). addition these apparent other gap found artificial otherwise incorrect 8). We investigated possible 500 kb flanking adjacent far from (39/44) excellent synteny while almost close (11/12) contained additional extensive structural differences (e.g., deletions inversions lengths >20 kb) 1D). Additionally, could well resulting continuous 100–117 1D These results demonstrated significant update resolving misassembled structures probably caused removes long-standing barrier hidden 3% sequence-based analysis, regions. Therefore, it is describe initial analysis truly discuss potential applications. have rich collection omics models transposon (TEs), sequencing, methylation datasets, presented online (http://www.ricesuperpir.com/web/nip). highlight utility resources, demonstrate examples duplicated 11 associated gaps. AGIS_Os10g035850 (denoted LOC_Os10g43075 IRGSP-1.0/MSU7) traversed boundary at 10, incomplete annotation 76.3% entire some misannotated exons version. thus correction model, six new each splicing alternatives Most TE-related multiple copies (paralogs) sequences, always complicated analysis. When NGS absence paralogs causes incorrectly align LOC_Os11g12240 (AGIS_Os11g010790), many false-positive variants 1F). 
mapped show expected typical heterozygous variation pattern small region. Any paralogs, others like them, will overlooked when thereby promoting importance release investigate how affects short-read variant calling, collected 230 cultivated sativa) wild rufipogon) accessions our study (Shang 2022Shang L. He Yuan Q. Wei Hu Zhao al.A super pan-genomic landscape rice.Cell Res. 32: 878-896https://doi.org/10.1038/s41422-022-00685-zCrossref (39) consisted populations: Xian/indica (XI), Geng/japonica (GJ), Aus (cA). same pipeline calling based on eliminate interferences software parameters. On average, BWA-MEM 1.04 × 107 (6.9%) properly paired Interestingly, even though per-read mismatch rate 1.2%–8.2% lower populations 1G). Similarly, characteristics such reducing misoriented read 1H) improving uniformity 1I) Within regions, noted decrease 2.0%–4.3% standard deviation analogous among population groups 1I). From alignments, 741 895 221 high-quality single-nucleotide indel relative (per-sample mean, 3 225 631) 744 667 800 237 686), observing shared called individual 6 9). Along improvement rate, attribute reduction per-sample calls errors, especially resolution correct conclusion observation sample decreased largely homozygous slight increase GJ superiority accurate reads. Next, effects (SV) published long Alignment reduced observed 1J) 1K) populations. corrected errors facilitated alignment, what S10). results, (from −16.3% −4.6%) SVs different against instead Similar variations above, those 7), likely due rare supplement phenotype genome-wide association studies (GWASs) assess efficiency 101 SNPs five agronomic traits, detected example, pleiotropic locus related yield plant 1 (qYPP1) significantly grain height not 1L–1M Gene-editing experiments screening revealed between plants type function-loss mutation encoding subunit ADP-glucose pyrophosphorylase, OsAGPL2 1N favorable haplotype showing (44.7 ± 11.8 g) haplotypes 1O). 
T2T-NIP-specific width enhanced mining summary, assembly, addressing missing information, represents resource. introduced ∼12.5 1324 predictions, include arrays, subtelomeres, unlocking variational studies. raw deposited National Center Biotechnology Information under project accession PRJNA953663 Genomics Data PRJCA018610. browser can easily accessed website research Natural Science Foundation China (32188102, 32101718), Guangdong Basic Applied Research (2023B1515020053), Youth Innovation Chinese Academy Agricultural Sciences (Y20230C36), specific fund Platform Academicians Hainan Province (YSPTZX202303).
Language: English
Citations
88
Plant Biotechnology Journal, Journal Year: 2023, Volume and Issue: 21(5), P. 1022 - 1032
Published: Jan. 23, 2023
Summary: Brassica rapa comprises many important cultivated vegetables and oil crops. However, Chiifu v3.0, the current B. rapa reference genome, still contains hundreds of gaps. Here, we presented a near‐complete genome assembly of B. rapa Chiifu v4.0, which was 424.59 Mb with only two gaps, using Oxford Nanopore Technology (ONT) ultralong‐read sequencing and Hi‐C technologies. The new assembly contains 12 contigs, with a contig N50 of 38.26 Mb. Eight of the ten chromosomes were entirely reconstructed in a single contig from telomere to telomere. We found that the centromeres were mainly invaded by ALE and CRM long terminal repeats (LTRs). Moreover, there is a high divergence of centromere length and sequence among B. rapa genomes. We further found that the centromeres are enriched for Copia LTRs invaded at 0.14 MYA on average, while the pericentromeres are enriched for Gypsy LTRs invaded at 0.51 MYA on average. These results indicated different invasion mechanisms of LTRs between the two structures. In addition, a novel repetitive sequence, PCR630, was identified in B. rapa. Overall, the near‐complete genome assembly, B. rapa Chiifu v4.0, offers valuable tools for genomic and genetic studies of Brassica species and provides new insights into the evolution of centromeres.
Language: English
Citations
54
Horticulture Research, Journal Year: 2023, Volume and Issue: 10(4)
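The summary above reports a contig N50, a standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The following is a minimal sketch of that calculation; the function name and the toy contig lengths are hypothetical and are not data from the cited assembly.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches at least half of the total assembly length; the
    length of the contig that crosses that threshold is the N50.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running * 2 >= total:  # integer comparison avoids float rounding
            return length
    raise AssertionError("unreachable: the running total always reaches half")


if __name__ == "__main__":
    # Hypothetical contig lengths in Mb, for illustration only.
    toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
    print(n50(toy_contigs))
```

Because N50 depends only on the length distribution, an assembly with a few chromosome-scale contigs and very few gaps will have an N50 close to a typical chromosome length.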
Published: Feb. 20, 2023
Fragaria vesca, commonly known as wild or woodland strawberry, is the most widely distributed diploid Fragaria species and is native to Europe and Asia. Because of its small plant size, low heterozygosity, and relative ease of genetic transformation, F. vesca has been a model plant for fruit research since the publication of its Illumina-based genome in 2011. However, its genomic contribution to octoploid cultivated strawberry remains a long-standing question. Here, we de novo assembled and annotated a telomere-to-telomere, gap-free genome of F. vesca 'Hawaii 4', with all seven chromosomes assembled into single contigs, providing the highest completeness and assembly quality to date. The gap-free genome is 220 785 082 bp in length and encodes 36 173 protein-coding gene models, including 1153 newly annotated genes. All 14 telomeres and seven centromeres were annotated within the seven chromosomes. Among the three previously recognized wild diploid strawberry ancestors, F. vesca, F. iinumae, and F. viridis, phylogenomic analysis showed that F. vesca and F. viridis are the ancestors of the cultivated octoploid strawberry F. × ananassa, and F. vesca is its closest relative. Three subgenomes of F. × ananassa belong to the F. vesca group, and one is sister to F. viridis. We anticipate that this high-quality, telomere-to-telomere, gap-free F. vesca genome, combined with our phylogenomic inference of the origin of cultivated strawberry, will provide insight into the genomic evolution of Fragaria and facilitate strawberry genetics and molecular breeding.
Language: English
Citations
52
Nature Plants, Journal Year: 2024, Volume and Issue: 10(4), P. 551 - 566
Published: March 20, 2024
Language: English
Citations
32