Cited by The Ontology of Biological Attributes (OBA)—computational traits for the life sciences

The Galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update

Linelle Ann L Abueg, Enis Afgan, Olivier Allart

et al.

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 52(W1), P. W83 - W94

Published: May 20, 2024

Abstract: Galaxy (https://galaxyproject.org) is deployed globally, predominantly through free-to-use services, supporting user-driven research that broadens in scope each year. Users are attracted to public Galaxy services by platform stability, tool and reference dataset diversity, training, support, and integration, which enables complex, reproducible, shareable data analysis. Applying the principles of user experience design (UXD) has driven improvements in accessibility, tool discoverability through Galaxy Labs/subdomains, and a redesigned Galaxy ToolShed. Galaxy tool capabilities are progressing in two strategic directions: integrating general purpose graphical processing units (GPGPU) access for cutting-edge methods, and licensed tool support. Engagement with global research consortia is being increased by developing more workflows in Galaxy and by resourcing the public Galaxy services to run them. The Galaxy Training Network (GTN) portfolio has grown in both size and accessibility, through learning paths and direct integration with Galaxy tools that feature in training courses. Code development continues in line with the Galaxy Project roadmap, with improvements to job scheduling and the user interface. Environmental impact assessment is also helping engage users and developers, reminding them of their role in sustainability, by displaying estimated CO2 emissions generated by each Galaxy job.

Language: English

Citations

177

Packaging research artefacts with RO-Crate

Stian Soiland‐Reyes, Peter Sefton, Mercè Crosas

et al.

Data Science, Journal Year: 2022, Volume and Issue: 5(2), P. 97 - 138

Published: Jan. 4, 2022

An increasing number of researchers support reproducibility by including pointers to and descriptions of datasets, software and methods in their publications. However, scientific articles may be ambiguous, incomplete and difficult to process by automated systems. In this paper we introduce RO-Crate, an open, community-driven, and lightweight approach to packaging research artefacts along with their metadata in a machine readable manner. RO-Crate is based on Schema.org annotations in JSON-LD, aiming to establish best practices to formally describe metadata in an accessible and practical way for their use in a wide variety of situations. An RO-Crate is a structured archive of all the items that contributed to a research outcome, including their identifiers, provenance, relations and annotations. As a general purpose packaging approach for data and their metadata, RO-Crate is used across multiple areas, including bioinformatics, digital humanities and regulatory sciences. By applying "just enough" Linked Data standards, RO-Crate simplifies the process of making research outputs FAIR while also enhancing research reproducibility. An RO-Crate for this article is available at https://w3id.org/ro/doi/10.5281/zenodo.5146227
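To make the packaging idea in the abstract above concrete, here is a minimal sketch of an RO-Crate metadata document built as a plain Python dictionary. It assumes the RO-Crate 1.1 layout (a metadata descriptor, a root dataset, and one entity per file); the helper name `build_crate`, the crate name, and the file names are hypothetical and chosen only for illustration.

```python
import json


def build_crate(name, description, files):
    """Return a dict shaped like an RO-Crate 1.1 metadata document.

    `files` maps a relative file path to a human-readable label.
    """
    graph = [
        # Metadata descriptor: identifies this JSON-LD file and the root dataset.
        {
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        # Root dataset: the crate itself, listing its parts by identifier.
        {
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "hasPart": [{"@id": path} for path in files],
        },
    ]
    # One data entity per packaged file.
    for path, label in files.items():
        graph.append({"@id": path, "@type": "File", "name": label})
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": graph,
    }


# Hypothetical example content.
crate = build_crate(
    name="Example analysis",
    description="Inputs and outputs of a small illustrative analysis.",
    files={"results.csv": "Result table", "analysis.py": "Analysis script"},
)
crate_json = json.dumps(crate, indent=2)
```

Writing `crate_json` to a file named `ro-crate-metadata.json` next to the listed files would give a directory laid out like a crate; a real crate would normally carry more metadata (licence, authors, dates) than this sketch includes.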

Language: English

Citations

122

Inverting the model of genomics data sharing with the NHGRI Genomic Data Science Analysis, Visualization, and Informatics Lab-space

Michael C. Schatz, Anthony Philippakis, Enis Afgan

et al.

Cell Genomics, Journal Year: 2022, Volume and Issue: 2(1), P. 100085 - 100085

Published: Jan. 1, 2022

Language: English

Citations

104

Genomic newborn screening for rare diseases

Zornitza Stark, Richard H. Scott

Nature Reviews Genetics, Journal Year: 2023, Volume and Issue: 24(11), P. 755 - 766

Published: June 29, 2023

Language: English

Citations

102

The GA4GH Phenopacket schema defines a computable representation of clinical data

Julius O.B. Jacobsen, Michael Baudis, Gareth Baynam

et al.

Nature Biotechnology, Journal Year: 2022, Volume and Issue: 40(6), P. 817 - 820

Published: June 1, 2022

Language: English

Citations

Sequence modeling and design from molecular to genome scale with Evo

Eric Nguyen, Michael Poli, Matthew G. Durrant

et al.

Science, Journal Year: 2024, Volume and Issue: 386(6723)

Published: Nov. 14, 2024

The genome is a sequence that encodes the DNA, RNA, and proteins that orchestrate an organism’s function. We present Evo, a long-context genomic foundation model with a frontier architecture trained on millions of prokaryotic and phage genomes, and report scaling laws on DNA to complement observations in language and vision. Evo generalizes across DNA, RNA, and proteins, enabling zero-shot function prediction competitive with domain-specific language models and the generation of functional CRISPR-Cas and transposon systems, representing the first examples of protein-RNA and protein-DNA codesign with a language model. Evo also learns how small mutations affect whole-organism fitness and generates megabase-scale sequences with plausible genomic architecture. These prediction and generation capabilities span molecular to genomic scales of complexity, advancing our understanding and control of biology.

Language: English

Citations

Sequence modeling and design from molecular to genome scale with Evo

Éric Nguyen, Michael Poli, Matthew G. Durrant

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Feb. 27, 2024

The genome is a sequence that completely encodes the DNA, RNA, and proteins that orchestrate the function of a whole organism. Advances in machine learning combined with massive datasets of whole genomes could enable a biological foundation model that accelerates the mechanistic understanding and generative design of complex molecular interactions. We report Evo, a genomic foundation model that enables prediction and generation tasks from the molecular to the genome scale. Using an architecture based on advances in deep signal processing, we scale Evo to 7 billion parameters with a context length of 131 kilobases (kb) at single-nucleotide, byte resolution. Trained on whole prokaryotic genomes, Evo can generalize across the three fundamental modalities of the central dogma of molecular biology to perform zero-shot function prediction that is competitive with, or outperforms, leading domain-specific language models. Evo also excels at multi-element generation tasks, which we demonstrate by generating synthetic CRISPR-Cas molecular complexes and entire transposable systems for the first time. Using information learned over whole genomes, Evo can also predict gene essentiality at nucleotide resolution and can generate coding-rich sequences up to 650 kb in length, orders of magnitude longer than previous methods. Advances in multi-modal and multi-scale learning with Evo provide a promising path toward improving our understanding and control of biology across multiple levels of complexity.

Language: English

Citations

Australian Genomics: Outcomes of a 5-year national program to accelerate the integration of genomics in healthcare

Zornitza Stark, Tiffany Boughtwood, Matilda Haas

et al.

The American Journal of Human Genetics, Journal Year: 2023, Volume and Issue: 110(3), P. 419 - 426

Published: March 1, 2023

Language: English

Citations

PRECISION MEDICINE AND GENOMICS: A COMPREHENSIVE REVIEW OF IT-ENABLED APPROACHES

Francisca Chibugo Udegbe, Ogochukwu Roseline Ebulue, Charles Chukwudalu Ebulue

et al.

International Medical Science Research Journal, Journal Year: 2024, Volume and Issue: 4(4), P. 509 - 520

Published: April 20, 2024

This review delves into Information Technology's (IT) transformative impact on precision medicine and genomics, spotlighting the pivotal role of bioinformatics, data mining, machine learning, and blockchain technologies in advancing personalized healthcare. A comprehensive analysis outlines how these IT-enabled approaches facilitate the analysis, interpretation, and application of vast genomic data sets, thereby enhancing disease prediction, diagnosis, and treatment on an individual level. Despite promising advancements, the review also addresses significant challenges, including data complexity, interoperability, ethical considerations, and the digital divide, underscoring the necessity for multidisciplinary collaboration and innovation to overcome these hurdles. The paper concludes by emphasizing the potential of emerging technologies in advancing patient-centred care and realizing the vision of precision medicine, which promises improved healthcare outcomes through these strategies. Keywords: Precision Medicine, Genomics, Bioinformatics, Machine Learning, Data Security.

Language: English

Citations

Modeling methyl-sensitive transcription factor motifs with an expanded epigenetic alphabet

Coby Viner, Charles A. Ishak, James Johnson

et al.

Genome Biology, Journal Year: 2024, Volume and Issue: 25(1)

Published: Jan. 8, 2024

Abstract. Background: Transcription factors bind DNA in specific sequence contexts. In addition to distinguishing one nucleobase from another, some transcription factors can distinguish between unmodified and modified bases. Current models of transcription factor binding tend not to take DNA modifications into account, while the recent few that do often have limitations. This makes a comprehensive and accurate profiling of transcription factor affinities difficult. Results: Here, we develop methods to identify transcription factor binding sites in modified DNA. Our models expand the standard A/C/G/T DNA alphabet to include cytosine modifications. We develop Cytomod to create modified genomic sequences, and we also enhance the MEME Suite, adding the capacity to handle custom alphabets. We adapt the well-established position weight matrix (PWM) model of transcription factor binding affinity to this expanded DNA alphabet. Using these methods, we identify modification-sensitive transcription factor binding motifs. We confirm established binding preferences, such as the preference of ZFP57 and C/EBPβ for methylated motifs and the preference of c-Myc for unmethylated E-box motifs. Conclusions: Using known binding preferences to tune model parameters, we discover novel modified motifs for a wide array of transcription factors. Finally, we validate our binding preference predictions for OCT4 using cleavage under targets and release using nuclease (CUT&RUN) experiments across conventional, methylation-, and hydroxymethylation-enriched sequences. Our approach readily extends to other DNA modifications. As more genome-wide single-base resolution modification data becomes available, we expect that our method will yield insights into altered transcription factor binding affinities across many different modifications.
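The position weight matrix idea mentioned in the abstract above can be illustrated with a short sketch. This is not the authors' implementation: it assumes a five-symbol alphabet in which `m` stands for a methylated cytosine, a uniform background, and a made-up three-position probability matrix, and it simply scores sequence windows by summed log-odds.

```python
import math

# Illustrative expanded alphabet: "m" denotes a methylated cytosine (assumed symbol).
ALPHABET = "ACGTm"

# Hypothetical position probability matrix, one dict per motif position.
# The numbers are invented for illustration; each column sums to 1.
PPM = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.80, "m": 0.05},
    {"A": 0.05, "C": 0.10, "G": 0.75, "T": 0.05, "m": 0.05},
    {"A": 0.05, "C": 0.10, "G": 0.05, "T": 0.05, "m": 0.75},
]

# Uniform background over the expanded alphabet (an assumption).
BACKGROUND = {symbol: 1.0 / len(ALPHABET) for symbol in ALPHABET}


def to_pwm(ppm, background):
    """Convert probabilities to log2-odds weights against the background."""
    return [
        {symbol: math.log2(column[symbol] / background[symbol]) for symbol in column}
        for column in ppm
    ]


def score(pwm, site):
    """Sum the per-position weights for a site of the same width as the PWM."""
    if len(site) != len(pwm):
        raise ValueError("site length must equal motif width")
    return sum(column[symbol] for column, symbol in zip(pwm, site))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    windows = [
        (score(pwm, sequence[start:start + width]), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(windows)


PWM = to_pwm(PPM, BACKGROUND)
methylated = score(PWM, "TGm")    # site ending in a methylated cytosine
unmethylated = score(PWM, "TGC")  # same site with an unmodified cytosine
```

With this toy matrix the methylated site scores higher than the unmethylated one, which is the kind of modification-sensitive preference the abstract describes; real motifs would be estimated from data rather than written by hand.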

Language: English

Citations