An ecosystem for producing and sharing metadata within the web of FAIR Data
Daniel Jacob, François Ehrenmann, Romain David

et al.

GigaScience, Journal Year: 2025, Volume and Issue: 14

Published: Jan. 1, 2025

Abstract

Background: Descriptive metadata are vital for reporting, discovering, leveraging, and mobilizing research datasets. However, resolving metadata issues as part of a data management plan can be complex for data producers. To organize and document data, various descriptive metadata must be created. Furthermore, when sharing data it is important to ensure interoperability in line with the FAIR (Findable, Accessible, Interoperable, Reusable) principles. Given the practical nature of these challenges, there is a need for tools that assist data managers effectively. Additionally, such tools should meet the needs of data producers, be user-friendly, and require minimal training.

Results: We developed Maggot (Metadata Aggregation on Data Storage), a web-based tool to locally manage and catalog datasets using high-level metadata. The main goal was to facilitate easy dissemination of metadata and its deposition in repositories. With Maggot, users can easily generate metadata and attach it to datasets, allowing seamless sharing within a collaborative environment. This approach aligns with many data management plans and effectively addresses challenges related to data organization, documentation, and storage based on FAIR principles, within and beyond the research group. Maggot also enables metadata crosswalks (i.e., the generated metadata can be converted into the schema used by a specific repository or exported in a format suitable for collection by third-party applications).

Conclusion: Maggot's primary purpose is to streamline metadata management based on carefully chosen schemas and standards. It simplifies data accessibility via metadata, typically a requirement of publicly funded projects. As a result, it can be utilized to promote effective local data management and facilitate sharing while adhering to FAIR principles, and it can contribute to the preparation of the future EOSC Web of FAIR Data within the European Open Science Cloud framework.
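A minimal sketch of the crosswalk idea described above, assuming illustrative field names only (this is not Maggot's actual code or schema): a locally captured metadata record is mapped onto a Zenodo-style deposit payload that a repository could accept.

    import json

    # Locally captured, high-level metadata (illustrative fields only).
    local_metadata = {
        "title": "Soil microbiome survey 2024",
        "authors": ["Jane Doe", "John Smith"],
        "description": "16S rRNA amplicon sequencing of soil samples.",
        "keywords": ["soil", "microbiome", "16S"],
        "license": "CC-BY-4.0",
    }

    def to_zenodo_deposit(meta):
        """Map the local schema onto a Zenodo-style deposition payload (field
        names are indicative of the kind of mapping a crosswalk performs)."""
        return {
            "metadata": {
                "title": meta["title"],
                "upload_type": "dataset",
                "description": meta["description"],
                "creators": [{"name": name} for name in meta["authors"]],
                "keywords": meta["keywords"],
                "license": meta["license"].lower(),
            }
        }

    print(json.dumps(to_zenodo_deposit(local_metadata), indent=2))

The same local record could be mapped to other targets (for example, a DataCite-style record) by adding further conversion functions, which is the essence of the crosswalk mechanism.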

Language: English

The Galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update

Linelle Ann L. Abueg, Enis Afgan, Olivier Allart

et al.

Nucleic Acids Research, Journal Year: 2024, Volume and Issue: 52(W1), P. W83 - W94

Published: May 20, 2024

Abstract Galaxy (https://galaxyproject.org) is deployed globally, predominantly through free-to-use services, supporting user-driven research that broadens in scope each year. Users are attracted to public services by platform stability, tool and reference dataset diversity, training, support, and integration, which together enable complex, reproducible, shareable data analysis. Applying the principles of user experience design (UXD) has driven improvements in accessibility, tool discoverability through Labs/subdomains, and a redesigned ToolShed. Tool capabilities are progressing in two strategic directions: integrating general purpose graphical processing unit (GPGPU) access for cutting-edge methods, and licensed tool support. Engagement with global consortia is being increased by developing more workflows and by resourcing the public services to run them. The Training Network (GTN) portfolio has grown in both size and learning paths, with direct integration of the tools that feature in training courses. Code development continues in line with the Project roadmap, with improvements to job scheduling and the user interface. Environmental impact assessment is also helping to engage users and developers, reminding them of their role in sustainability by displaying the estimated CO2 emissions generated by each job.
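As a concrete illustration of programmatic access to a public Galaxy service, the sketch below uses BioBlend, the community-maintained Python client for the Galaxy API; the server URL and API key are placeholders, and the dictionary keys reflect the fields the API typically returns.

    from bioblend.galaxy import GalaxyInstance

    # Connect to a public Galaxy server (replace the key with your own).
    gi = GalaxyInstance(url="https://usegalaxy.org", key="YOUR_API_KEY")

    # List a few of the tools available on the server.
    for tool in gi.tools.get_tools()[:5]:
        print(tool["id"], "-", tool["name"])

    # List workflows visible to this account.
    for wf in gi.workflows.get_workflows():
        print("workflow:", wf["name"])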

Language: English

Citations

177

Challenges and opportunities in sharing microbiome data and analyses
Curtis Huttenhower, Robert Finn, Alice C. McHardy

et al.

Nature Microbiology, Journal Year: 2023, Volume and Issue: 8(11), P. 1960 - 1970

Published: Oct. 2, 2023

Language: English

Citations

26

Applying the FAIR Principles to computational workflows
Sean Wilkinson, Meznah Aloqalaa, Khalid Belhajjame

et al.

Scientific Data, Journal Year: 2025, Volume and Issue: 12(1)

Published: Feb. 24, 2025

Recent trends within computational and data sciences show an increasing recognition and adoption of workflows as tools for productivity and reproducibility that also democratize access to platforms and processing know-how. As digital objects to be shared, discovered, and reused, workflows benefit from the FAIR principles, which stand for Findable, Accessible, Interoperable, and Reusable. The Workflows Community Initiative's FAIR Workflows Working Group (WCI-FW), a global and open community of researchers and developers working with computational workflows across disciplines and domains, has systematically addressed the application of both the FAIR data and software principles to workflows. We present our recommendations with commentary that reflects our discussions and justifies our choices and adaptations. These recommendations are offered to workflow users and authors, workflow management system developers, and providers of workflow services as guidelines for adoption and fodder for discussion. The recommendations we propose in this paper will maximize the value of workflows as research assets and facilitate their adoption by the wider community.

Language: English

Citations

2

Combining hypothesis- and data-driven neuroscience modeling in FAIR workflows
Olivia Eriksson, Upinder S. Bhalla, Kim T. Blackwell

et al.

eLife, Journal Year: 2022, Volume and Issue: 11

Published: July 6, 2022

Modeling in neuroscience occurs at the intersection of different points of view and different approaches. Typically, hypothesis-driven modeling brings a question into focus so that a model is constructed to investigate a specific hypothesis about how the system works or why certain phenomena are observed. Data-driven modeling, on the other hand, follows a more unbiased approach, with model construction informed by the computationally intensive use of data. At the same time, researchers employ models at different biological scales and at different levels of abstraction. Combining these models while validating them against experimental data increases our understanding of the multiscale brain. However, a lack of interoperability, transparency, and reusability of both the models and the workflows used to construct them creates barriers for the integration of models representing different scales and built using different philosophies. We argue that the imperatives that drive data resources and policy - such as the FAIR (Findable, Accessible, Interoperable, Reusable) principles - also support the integration of models and modeling workflows. The principles require that models and workflows be shared in formats that are Findable, Accessible, Interoperable, and Reusable. Applying them to the workflows used to construct models, as well as to the data used to constrain and validate them, would allow researchers to find, reuse, question, validate, and extend published models, regardless of whether they are implemented phenomenologically or mechanistically, as a few equations or as a multiscale, hierarchical system. To illustrate these ideas, we use a classical synaptic plasticity model, the Bienenstock-Cooper-Munro rule, as an example due to its long history, levels of abstraction, and implementation at many scales.
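For readers unfamiliar with the example model, the sketch below implements the textbook form of the Bienenstock-Cooper-Munro (BCM) rule, in which the weight change is proportional to x*y*(y - theta) and the threshold theta slides with a running average of y squared; this is a generic illustration, not the specific implementation used in the paper.

    import numpy as np

    rng = np.random.default_rng(0)
    n_inputs, eta, tau_theta = 10, 1e-3, 100.0
    w = rng.uniform(0.1, 0.5, n_inputs)    # synaptic weights
    theta = 1.0                            # sliding modification threshold

    for step in range(10_000):
        x = rng.poisson(2.0, n_inputs).astype(float)  # presynaptic activity
        y = w @ x                                     # postsynaptic response (linear unit)
        w += eta * x * y * (y - theta)                # BCM weight update
        w = np.clip(w, 0.0, None)                     # keep weights non-negative
        theta += (y**2 - theta) / tau_theta           # threshold tracks <y^2>

    print("final weights:", np.round(w, 3))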

Language: English

Citations

30

Ten quick tips for building FAIR workflows
Casper de Visser, Lennart Johansson, Purva Kulkarni

et al.

PLoS Computational Biology, Journal Year: 2023, Volume and Issue: 19(9), P. e1011369 - e1011369

Published: Sept. 28, 2023

Research data is accumulating rapidly, and with it the challenge of fully reproducible science. As a consequence, implementation of high-quality management of scientific data has become a global priority. The FAIR (Findable, Accessible, Interoperable, Reusable) principles provide practical guidelines for maximizing the value of research data; however, the processing of data using workflows - systematic executions of a series of computational tools - is equally important for good data management. The FAIR principles have recently been adapted to research software (the FAIR4RS Principles) to promote the reproducibility and reusability of any type of software. Here, we propose a set of 10 quick tips, drafted by experienced workflow developers, that will help researchers apply the FAIR4RS principles to workflows. The tips are arranged according to the FAIR acronym, clarifying the purpose of each tip with respect to the principles. Altogether, these tips can be seen as a guide for researchers who aim to contribute to more sustainable science, aiming to positively impact the open science community.

Language: English

Citations

19

Croissant: A Metadata Format for ML-Ready Datasets
Mubashara Akhtar, Omar Benjelloun, Costanza Conforti

et al.

Published: May 29, 2024

Data is a critical resource for Machine Learning (ML), yet working with data remains a key friction point. This paper introduces Croissant, a metadata format for datasets that simplifies how data is used by ML tools and frameworks. Croissant makes datasets more discoverable, portable, and interoperable, thereby addressing significant challenges in data management and responsible AI. Croissant is already supported by several popular dataset repositories, spanning hundreds of thousands of datasets, ready to be loaded into the most popular ML frameworks.
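To make the format more tangible, here is a rough, hand-written sketch of the kind of schema.org-based JSON-LD description Croissant builds on; the property names are indicative rather than a validated Croissant record.

    import json

    croissant_like = {
        "@context": {"@vocab": "https://schema.org/"},
        "@type": "Dataset",
        "name": "toy-image-dataset",
        "description": "A small labelled image dataset for demonstration.",
        "license": "https://creativecommons.org/licenses/by/4.0/",
        # Where the data files live and how they are encoded.
        "distribution": [
            {"@type": "FileObject", "name": "images.zip",
             "contentUrl": "https://example.org/images.zip",
             "encodingFormat": "application/zip"}
        ],
        # How records and their fields are structured for ML consumption.
        "recordSet": [
            {"name": "images",
             "field": [{"name": "image_path"}, {"name": "label"}]}
        ],
    }

    print(json.dumps(croissant_like, indent=2))

Tooling such as the mlcroissant Python package can reportedly load conformant descriptions and iterate over the declared record sets, which is what makes such datasets directly consumable by ML frameworks.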

Language: English

Citations

8

Recording provenance of workflow runs with RO-Crate
Simone Leo, Michael R. Crusoe, Laura Rodríguez‐Navas

et al.

PLoS ONE, Journal Year: 2024, Volume and Issue: 19(9), P. e0309210 - e0309210

Published: Sept. 10, 2024

Recording the provenance of scientific computation results is key to supporting traceability, reproducibility, and quality assessment of data products. Several models have been explored to address this need, providing representations of workflow plans and their executions as well as means of packaging the resulting information for archiving and sharing. However, existing approaches tend to lack interoperable adoption across workflow management systems. In this work we present Workflow Run RO-Crate, an extension of RO-Crate (Research Object Crate) and Schema.org to capture the execution of computational workflows at different levels of granularity and bundle together all associated objects (inputs, outputs, code, etc.). The model is supported by a diverse, open community that runs regular meetings, discussing development, maintenance, and adoption aspects. It is already implemented by several workflow management systems, allowing comparisons between runs from heterogeneous systems. We describe the model, its alignment with standards such as W3C PROV, and its implementation in six workflow systems. Finally, we illustrate its application in two use cases of machine learning in the digital image analysis domain.
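A minimal sketch of the packaging step, assuming the ro-crate-py library (Python package rocrate) and placeholder file paths; the Workflow Run RO-Crate profile layers dedicated run and provenance entities on top of a crate like this, which the sketch does not attempt to reproduce.

    from rocrate.rocrate import ROCrate

    # Bundle a workflow run's inputs and outputs into a basic RO-Crate.
    # The paths are placeholders and must exist before the crate is written.
    crate = ROCrate()
    crate.add_file("inputs/samples.csv",
                   properties={"name": "Input sample sheet"})
    crate.add_file("results/predictions.csv",
                   properties={"name": "Workflow output"})
    crate.write("my_run_crate")  # writes ro-crate-metadata.json plus the files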

Language: English

Citations

8

Playbook workflow builder: Interactive construction of bioinformatics workflows
Daniel Clarke, John Erol Evangelista, Zhuorui Xie

et al.

PLoS Computational Biology, Journal Year: 2025, Volume and Issue: 21(4), P. e1012901 - e1012901

Published: April 3, 2025

The Playbook Workflow Builder (PWB) is a web-based platform for dynamically constructing and executing bioinformatics workflows by utilizing a growing network of input datasets, semantically annotated API endpoints, and data visualization tools contributed by an ecosystem of collaborators. Via a user-friendly interface, workflows can be constructed from these building-blocks without technical expertise. The output of each step of the workflow is added to reports containing textual descriptions, figures, tables, and references. To construct workflows, users can click on cards that represent each step in the workflow, or use a chat interface assisted by a large language model (LLM). Completed workflows are compatible with the Common Workflow Language (CWL) and can be published as research publications, slideshows, and posters. To demonstrate how the PWB generates meaningful hypotheses that draw on knowledge from across multiple resources, we present several use cases. For example, one of these use cases prioritizes drug targets for individual cancer patients using NIH Common Fund programs including GTEx, LINCS, Metabolomics, GlyGen, and ExRNA. The workflows created with the PWB can be repurposed to tackle similar use cases with different inputs. The PWB is available from: https://playbook-workflow-builder.cloud/.

Language: English

Citations

1

WorkflowHub: a registry for computational workflows
Ove Gustafsson, Sean Wilkinson, Finn Bacall

et al.

Scientific Data, Journal Year: 2025, Volume and Issue: 12(1)

Published: May 21, 2025

Abstract The rising popularity of computational workflows is driven by the need for repetitive and scalable data processing, sharing of processing know-how, and transparent methods. As both combined records of analysis and descriptions of processing steps, workflows should be reproducible, reusable, adaptable, and available. Workflow sharing presents opportunities to reduce unnecessary reinvention, promote reuse, increase access to best-practice analyses for non-experts, and increase productivity. In reality, workflows are scattered and difficult to find, in part due to the diversity of available workflow engines and ecosystems, and because workflow sharing is not yet part of research practice. WorkflowHub provides a unified registry for all computational workflows that links to community repositories and supports the workflow lifecycle, making workflows findable, accessible, interoperable, and reusable (FAIR). By interoperating with diverse platforms, services, and external registries, WorkflowHub adds value by supporting workflow sharing, explicitly assigning credit, enhancing FAIRness, and promoting workflows as scholarly artefacts. It has a global reach, with hundreds of organisations involved and more than 800 workflows registered.
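As an illustration of how such a registry can be queried programmatically, the sketch below requests the public workflow listing from workflowhub.eu; WorkflowHub is built on the SEEK platform, so a JSON:API-style response with a "data" array is assumed here, and the exact fields may differ.

    import requests

    # Ask WorkflowHub for its registered workflows in JSON form.
    resp = requests.get("https://workflowhub.eu/workflows",
                        headers={"Accept": "application/json"},
                        timeout=30)
    resp.raise_for_status()

    # Print the first few entries (field names assumed from JSON:API conventions).
    for entry in resp.json().get("data", [])[:10]:
        title = entry.get("attributes", {}).get("title")
        print(entry.get("id"), "-", title)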

Language: English

Citations

1

“Be sustainable”: EOSC‐Life recommendations for implementation of FAIR principles in life science data handling
Romain David, Arina Rybina, Jean‐Marie Burel

et al.

The EMBO Journal, Journal Year: 2023, Volume and Issue: 42(23)

Published: Nov. 15, 2023

The main goals and challenges for the life science communities in the Open Science framework are to increase the reuse and sustainability of data resources, software tools, and workflows, especially in large-scale data-driven research and computational analyses. Here, we present key findings, procedures, effective measures, and recommendations for generating and establishing sustainable resources based on the collaborative, cross-disciplinary work done within the EOSC-Life (European Open Science Cloud for Life Sciences) consortium. Bringing together 13 European life science research infrastructures, it has laid the foundation for an open, digital space to support biological and medical research. Using lessons learned from 27 selected projects, we describe the organisational, technical, financial, and legal/ethical challenges that represent barriers in the life sciences. We show how EOSC-Life provides a model for data management according to the FAIR (findability, accessibility, interoperability, reusability) principles, including solutions for sensitive and industry-related resources, by means of training and the sharing of best practices. Finally, we illustrate how harmonisation and collaborative work facilitate interoperability of data and lead to a better understanding of concepts, semantics, and functionalities in the life sciences.

Language: English

Citations

16