Introgressions lead to reference bias in wheat RNA-seq analysis DOI Creative Commons
Benedict Coombes, Thomas Lux, Eduard Akhunov

et al.

BMC Biology, Year: 2024, Number 22(1)

Published: March 7, 2024

Abstract. Background: RNA-seq is a fundamental technique in genomics, yet reference bias, where transcripts derived from non-reference alleles are quantified less accurately, can undermine the accuracy of RNA-seq quantification and thus the conclusions made downstream. Reference bias in RNA-seq analysis has yet to be explored in complex polyploid genomes despite evidence that they are often a complex mosaic of wild relative introgressions, which introduce blocks of highly divergent genes. Results: Here we use hexaploid wheat as a model complex polyploid, using both simulated and experimental data to show that RNA-seq alignment in wheat suffers from widespread reference bias, which is largely driven by divergent introgressed genes. This leads to underestimation of gene expression and incorrect assessment of homoeologue expression balance. By incorporating gene models from ten wheat genome assemblies into a pantranscriptome reference, we present a novel method to reduce reference bias, which can be readily scaled to capture more variation as new genome and transcriptome data becomes available. Conclusions: This study shows that the presence of introgressions can lead to reference bias in wheat RNA-seq analysis. Caution should be exercised by researchers using non-sample reference genomes for RNA-seq alignment, and novel methods, such as the one presented here, should be considered.

Language: English

g:Profiler—interoperable web service for functional enrichment analysis and gene identifier mapping (2023 update) DOI Creative Commons
Liis Kolberg, Uku Raudvere, Ivan Kuzmin

et al.

Nucleic Acids Research, Year: 2023, Number 51(W1), pp. W207 - W212

Published: May 5, 2023

g:Profiler is a reliable and up-to-date functional enrichment analysis tool that supports various evidence types, identifier types and organisms. The toolset integrates many databases, including Gene Ontology, KEGG and TRANSFAC, to provide a comprehensive and in-depth analysis of gene lists. It also provides interactive and intuitive user interfaces and supports ordered queries and custom statistical backgrounds, among other settings. g:Profiler provides multiple programmatic interfaces to access its functionality. These can be easily integrated into custom workflows and external tools, making them valuable resources for researchers who want to develop their own solutions. g:Profiler has been available since 2007 and is used to analyse millions of queries. Research reproducibility and transparency are achieved by maintaining working versions of all past database releases since 2015. g:Profiler supports 849 species, including vertebrates, plants, fungi, insects and parasites, and can analyse any organism through user-uploaded custom annotation files. In this update article, we introduce a novel filtering method highlighting Gene Ontology driver terms, accompanied by new graph visualizations providing a broader context for significant Gene Ontology terms. As a leading enrichment analysis and gene list interoperability service, g:Profiler offers a valuable resource for genetics, biology and medical researchers. It is freely accessible at https://biit.cs.ut.ee/gprofiler.

Language: English

Cited by

587

The gut mycobiome in health, disease, and clinical applications in association with the gut bacterial microbiome assembly DOI Creative Commons
Fen Zhang, Dominik Aschenbrenner, Ji Youn Yoo

et al.

The Lancet Microbe, Year: 2022, Number 3(12), pp. e969 - e983

Published: Sep. 29, 2022

Language: English

Cited by

211

Update on the proposed minimal standards for the use of genome data for the taxonomy of prokaryotes DOI
Raúl Riesco, Martha E. Trujillo

International Journal of Systematic and Evolutionary Microbiology, Year: 2024, Number 74(3)

Published: March 21, 2024

The field of microbial taxonomy is dynamic, aiming to provide a stable and contemporary classification system for prokaryotes. Traditionally, reliance on phenotypic characteristics limited the comprehensive understanding of microbial diversity and evolution. The introduction of molecular techniques, particularly DNA sequencing and genomics, has transformed our perception of prokaryotic diversity. In the past two decades, advancements in genome sequencing have transitioned from traditional methods to a genome-based taxonomic framework, not only to define species, but also higher taxonomic ranks. As technology and databases rapidly expand, maintaining updated standards is crucial. This work seeks to revise the 2018 guidelines for applying genome sequencing data in microbial taxonomy, adapting minimal standards and recommendations to reflect technological progress during this period.

Language: English

Cited by

143

Prediction of effector protein structures from fungal phytopathogens enables evolutionary analyses DOI Creative Commons
Kyungyong Seong, Ksenia V. Krasileva

Nature Microbiology, Year: 2023, Number 8(1), pp. 174 - 187

Published: Jan. 5, 2023

Elucidating the similarity and diversity of pathogen effectors is critical to understand their evolution across fungal phytopathogens. However, rapid divergence that diminishes sequence similarities between putatively homologous effectors has largely concealed the roots of effector evolution. Here we modelled the structures of 26,653 secreted proteins from 14 agriculturally important fungal phytopathogens, six non-pathogenic fungi and one oomycete with AlphaFold 2. With 18,000 successfully predicted folds, we performed structure-guided comparative analyses on two aspects of effector evolution: uniquely expanded sequence-unrelated structurally similar (SUSS) effector families and common folds present across the fungal species. Extreme expansion of lineage-specific SUSS effector families was found only in several obligate biotrophs, Blumeria graminis and Puccinia graminis. The highly expanded effector families were the source of conserved sequence motifs, such as the Y/F/WxC motif. We identified new classes of SUSS effector families that include known virulence factors, such as AvrSr35, AvrSr50 and Tin2. Structural comparisons revealed that the expanded structural folds further diversify through domain duplications and fusion with disordered stretches. Putatively sub- and neo-functionalized SUSS effectors could reconverge on regulation, expanding the functional pools of effectors in the pathogen infection cycle. We also found evidence that many effector families could have originated from ancestral folds conserved in fungi. Collectively, our study highlights diverse effector evolution mechanisms and supports divergent evolution as a major force driving SUSS effector evolution from ancestral proteins.

Language: English

Cited by

124

RSAT 2022: regulatory sequence analysis tools DOI Creative Commons
Walter Santana-Garcia, Jaime A. Castro-Mondragón, Mónica Padilla-Gálvez

et al.

Nucleic Acids Research, Year: 2022, Number 50(W1), pp. W670 - W676

Published: April 20, 2022

RSAT (Regulatory Sequence Analysis Tools) enables the detection and analysis of cis-regulatory elements in genomic sequences. This software suite performs (i) de novo motif discovery (including from genome-wide datasets like ChIP-seq/ATAC-seq), (ii) scanning of genomic sequences with known motifs, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations and (v) comparative genomics. RSAT comprises 50 tools. Six public Web servers (including a teaching server) are offered to meet the needs of different biological communities. The philosophy and originality of RSAT are: multi-modal access depending on user needs, through web forms, command-line for local installation and programmatic web services, and support for virtually any genome (animals, bacteria, plants, totalizing over 10 000 genomes directly accessible). Since the 2018 NAR Web Software Issue, we have developed a large REST API, extended support for additional genomes and external motif collections, enhanced some tools and developed a novel tool that builds or refines gene regulatory networks using motif scanning (network-interactions). The RSAT website provides extensive documentation, tutorials and published protocols. The RSAT code is under an open-source license and is now hosted on GitHub. RSAT is available at http://www.rsat.eu/.

Language: English

Cited by

95

Quality assessment of gene repertoire annotations with OMArk DOI Creative Commons
Yannis Nevers, Alex Warwick Vesztrocy, Victor Rossier

et al.

Nature Biotechnology, Year: 2024, Number unknown

Published: Feb. 21, 2024

Abstract. In the era of biodiversity genomics, it is crucial to ensure that annotations of protein-coding gene repertoires are accurate. State-of-the-art tools to assess genome annotations measure the completeness of a gene repertoire but are blind to other errors, such as gene overprediction or contamination. We introduce OMArk, a software package that relies on fast, alignment-free sequence comparisons between a query proteome and precomputed gene families across the tree of life. OMArk assesses not only the completeness but also the consistency of the gene repertoire as a whole relative to closely related species, and reports likely contamination events. Analysis of 1,805 UniProt Eukaryotic Reference Proteomes with OMArk demonstrated strong evidence of contamination in 73 proteomes and identified error propagation in avian gene annotation resulting from the use of a fragmented zebra finch proteome as a reference. This study illustrates the importance of comparing and prioritizing proteomes based on their quality measures.

Language: English

Cited by

46

PlantPAN 4.0: updated database for identifying conserved non-coding sequences and exploring dynamic transcriptional regulation in plant promoters DOI Creative Commons

Chi-Nga Chow, Chien-Wen Yang, Nai-Yun Wu

et al.

Nucleic Acids Research, Year: 2023, Number 52(D1), pp. D1569 - D1578

Published: Oct. 28, 2023

Abstract. PlantPAN 4.0 (http://PlantPAN.itps.ncku.edu.tw/) is an integrative resource for constructing transcriptional regulatory networks for diverse plant species. In this release, the gene annotation and promoter sequences were expanded to cover 115 species. PlantPAN 4.0 can help users characterize the evolutionary differences and similarities among cis-regulatory elements; furthermore, the system can now help in the identification of conserved non-coding sequences among homologous genes. The updated transcription factor binding site repository contains 3428 nonredundant matrices for 18305 transcription factors; this expansion helps in the exploration of combinational and nucleotide variants of cis-regulatory elements in conserved non-coding sequences. Additionally, the genomic landscapes of regulatory factors were manually updated, and ChIP-seq data sets derived from a single-cell green alga (Chlamydomonas reinhardtii) were added. Furthermore, the statistical review and graphical analysis components were improved to offer intelligible information through ChIP-seq data analysis. These improvements included easy-to-read experimental condition clusters, searchable gene-centered interfaces for the identification of promoter regions’ binding preferences by considering experimental condition clusters and peak visualization for all regulatory factors, and the 20 most significantly enriched gene ontology functions for regulatory factors. Thus, PlantPAN 4.0 can effectively reconstruct gene regulatory networks and help compare genomic cis-regulatory elements across plant species and experiments.

Language: English

Cited by

42

Plant pangenomes for crop improvement, biodiversity and evolution DOI
Mona Schreiber, Murukarthick Jayakodi, Nils Stein

et al.

Nature Reviews Genetics, Year: 2024, Number 25(8), pp. 563 - 577

Published: Feb. 20, 2024

Language: English

Cited by

28

InterPro: the protein sequence classification resource in 2025 DOI Creative Commons
Matthias Blum, Antonina Andreeva, Laise Cavalcanti Florentino

et al.

Nucleic Acids Research, Year: 2024, Number 53(D1), pp. D444 - D456

Published: Nov. 20, 2024

Abstract. InterPro (https://www.ebi.ac.uk/interpro) is a freely accessible resource for the classification of protein sequences into families. It integrates predictive models, known as signatures, from multiple member databases to classify sequences into families and predict the presence of domains and significant sites. The InterPro database provides annotations for over 200 million sequences, ensuring extensive coverage of UniProtKB, the standard repository of protein sequences, and includes mappings to several other major resources, such as Gene Ontology (GO), Protein Data Bank in Europe (PDBe) and the AlphaFold Protein Structure Database. In this publication, we report on the status of InterPro (version 101.0), detailing new developments in the database, associated web interface and software. Notable updates include the increased integration of structures predicted by AlphaFold and the enhanced description of InterPro entries using artificial intelligence. Over the past two years, more than 5000 new InterPro entries have been created. The InterPro website now offers access to 85 000 protein families and domains from its member databases and serves as a long-term archive for retired databases. InterPro data, software and tools are freely available.

Language: English

Cited by

18

The European Bioinformatics Institute (EMBL-EBI) in 2021 DOI Creative Commons
Gaia Cantelli, Alex Bateman, Cath Brooksbank

et al.

Nucleic Acids Research, Year: 2021, Number 50(D1), pp. D11 - D19

Published: Nov. 23, 2021

Abstract. The European Bioinformatics Institute (EMBL-EBI) maintains a comprehensive range of freely available and up-to-date molecular data resources, which includes over 40 resources covering every major data type in the life sciences. This year's service update for EMBL-EBI includes new resources, PGS Catalog and AlphaFold DB, and updates on existing resources, including the COVID-19 Data Platform, trRosetta and RoseTTAfold models introduced in Pfam and InterPro, and the launch of Genome Integrations with Function and Sequence by UniProt and Ensembl. Furthermore, we highlight projects through which EMBL-EBI has contributed to the development of community-driven data standards and guidelines, including the Recommended Metadata for Biological Images (REMBI) and the BioModels Reproducibility Scorecard. Training is one of EMBL-EBI’s core missions and a key component of the provision of bioinformatics services to users: this year's update includes many of the improvements that have been developed to EMBL-EBI’s online training offering.

Language: English

Cited by

63