A teaching and training framework to promote findable, accessible, interoperable, and reusable data generation in agriculture
Annarita Marrano, Leyla Cabugos, Alenka Hafner

et al.

Database, Journal Year: 2025, Volume and Issue: 2025

Published: Jan. 1, 2025

Abstract Advances in agricultural genetic, genomic, and breeding (GGB) technologies generate increasingly large and complex datasets that need to be adequately managed and shared. While several biological databases maintain and curate GGB data, not all scientists are aware of them and how they can be used to access and share data. In addition, there is the need to increase scientists’ awareness that appropriate data archiving and curation increases data longevity and value and bolsters scientific discoveries’ reproducibility and transparency. The AgBioData Education working group aims to address these unmet needs and developed a modular curriculum for educators teaching the basics of the findable, accessible, interoperable, and reusable (FAIR) principles to undergraduate and graduate students (https://www.agbiodata.org/). The present paper provides an overview of the topics covered within the curriculum, called ‘AgBioData Curriculum for Ag FAIR Data,’ its audience and modalities, and how it will positively impact different stakeholders of the database ecosystem. We hope the curriculum presented here will help scientists understand and support the use of FAIR data in aspects of improving our global food system. Database URL: https://zenodo.org/records/14278084

Language: English

Future-proofing and maximizing the utility of metadata: The PHA4GE SARS-CoV-2 contextual data specification package
Emma Griffiths, Ruth Timme, Catarina Inês Mendes

et al.

GigaScience, Journal Year: 2022, Volume and Issue: 11

Published: Jan. 1, 2022

Abstract Background: The Public Health Alliance for Genomic Epidemiology (PHA4GE) (https://pha4ge.org) is a global coalition that is actively working to establish consensus standards, document and share best practices, improve the availability of critical bioinformatics tools and resources, and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. In the face of the current pandemic, PHA4GE has identified the need for a fit-for-purpose, open-source SARS-CoV-2 contextual data standard. Results: As such, we have developed a SARS-CoV-2 contextual data specification package based on harmonizable, publicly available community standards. The specification can be implemented via a collection template, as well as an array of protocols and tools to support both the harmonization and submission of sequence data and contextual information to public biorepositories. Conclusions: Well-structured, rich contextual data add value, promote reuse, and enable aggregation and integration of disparate datasets. Adoption of the proposed standard and practices will better enable interoperability between datasets and systems, improve the consistency and utility of generated data, and ultimately facilitate novel insights and discoveries in SARS-CoV-2 and COVID-19. The package is now supported by NCBI’s BioSample database.

Language: English

Citations

35

Challenges and Opportunities for Global Genomic Surveillance Strategies in the COVID-19 Era
Ted Ling-Hu, Estefany Rios-Guzman, Ramon Lorenzo‐Redondo

et al.

Viruses, Journal Year: 2022, Volume and Issue: 14(11), P. 2532 - 2532

Published: Nov. 16, 2022

Global SARS-CoV-2 genomic surveillance efforts have provided critical data on the ongoing evolution of the virus to inform best practices in clinical care and public health throughout the pandemic. Impactful genomic surveillance strategies generally follow a multi-disciplinary pipeline involving sample collection, viral genotyping, metadata linkage, data reporting, and public health responses. Unfortunately, current limitations in each of these steps have compromised the overall effectiveness of these strategies. Biases from convenience-based sampling methods can obfuscate the true distribution of circulating variants. The lack of standardization in genotyping strategies and of bioinformatic expertise can create bottlenecks in data processing and complicate interpretation. Limitations and inconsistencies in clinical and demographic data collection and sharing can slow the compilation and limit the utility of comprehensive datasets. This likewise restricts the availability of timely data. Finally, gaps and delays in the implementation of genomic surveillance data in the public health sphere can prevent officials from formulating effective mitigation strategies to prevent outbreaks. In this review, we outline current global SARS-CoV-2 genomic surveillance methods and assess roadblocks at each step of the pipeline to identify potential solutions. Evaluating the current obstacles that impede effective surveillance can improve both global coordination efforts and pandemic preparedness for future outbreaks.

Language: English

Citations

34

A focus groups study on data sharing and research data management
Devan Ray Donaldson, Joshua Wolfgang Koepke

Scientific Data, Journal Year: 2022, Volume and Issue: 9(1)

Published: June 17, 2022

Abstract Data sharing can accelerate scientific discovery while increasing return on investment beyond the researcher or group that produced them. Data repositories enable data sharing and preservation over the long term, but little is known about scientists’ perceptions of them and their perspectives on data management and sharing practices. Using focus groups with scientists from five disciplines (atmospheric and earth science, computer science, chemistry, ecology, and neuroscience), we asked questions about data management to lead into a discussion of what features they think are necessary to include in data repository systems and services to help them implement the data sharing and preservation parts of their data management plans. Participants identified metadata quality control and training as problem areas in data management. Additionally, participants discussed several desired repository features, including: metadata control, data traceability, security, stable infrastructure, and data use restrictions. We present their desired repository features as a rubric for the research community to encourage repository utilization. Future directions for research are discussed.

Language: English

Citations

30

Artificial intelligence‐based prediction of pathogen emergence and evolution in the world of synthetic biology
Antoine Danchin

Microbial Biotechnology, Journal Year: 2024, Volume and Issue: 17(10)

Published: Oct. 1, 2024

The emergence of new techniques in both microbial biotechnology and artificial intelligence (AI) is opening up a completely new field for monitoring and sometimes even controlling the evolution of pathogens. However, the now famous generative AI extracts and reorganizes prior knowledge from large datasets, making it poorly suited to making predictions in an unreliable future. In contrast, an unfamiliar perspective can help us identify key issues related to the emergence of new technologies, such as those arising from synthetic biology, whilst revisiting old views of AI or including the generator of abduction as a resource. This could enable us to identify dangerous situations that are bound to emerge in the not-too-distant future, and prepare ourselves to anticipate when and where they will occur. Here, we emphasize the fact that amongst the many causes of pathogen outbreaks, often driven by the explosion of the human population, laboratory accidents are a major cause of epidemics. This review, limited to animal pathogens, concludes with a discussion of potential epidemic origins based on unusual organisms or associations of organisms that have rarely been highlighted or studied.

Language: English

Citations

5

A review on viral data sources and search systems for perspective mitigation of COVID-19
Anna Bernasconi, Arif Canakoglu, Marco Masseroli

et al.

Briefings in Bioinformatics, Journal Year: 2020, Volume and Issue: 22(2), P. 664 - 675

Published: Nov. 10, 2020

Abstract With the outbreak of the COVID-19 disease, the research community is producing unprecedented efforts dedicated to better understand and mitigate the effects of the pandemic. In this context, we review the data integration efforts required for accessing and searching genome sequences and metadata of SARS-CoV2, the virus responsible for the COVID-19 disease, which have been deposited into the most important repositories of viral sequences. Organizations that were already present in the virus domain are now dedicating special interest to the emergence of COVID-19 pandemics, by emphasizing specific SARS-CoV2 data and services. At the same time, novel organizations and resources were born in this critical period to serve specifically the purposes of COVID-19 mitigation while setting the research ground for contrasting possible future pandemics. Accessibility and integration of viral sequence data, possibly in conjunction with the human host genotype and clinical data, are paramount to better understand the COVID-19 disease and mitigate its effects. Few examples of host-pathogen integrated datasets exist so far, but we expect them to grow together with the knowledge of the COVID-19 disease; once such datasets will be available, useful integrative surveillance mechanisms can be put in place by observing how common variants distribute in time and space, and relating them to the phenotypic impact evidenced in the literature.

Language: English

Citations

33

Representing COVID-19 information in collaborative knowledge graphs: The case of Wikidata
Houcemeddine Turki, Mohamed Ali Hadj Taieb, Thomas Shafee

et al.

Semantic Web, Journal Year: 2021, Volume and Issue: 13(2), P. 233 - 264

Published: Sept. 28, 2021

Information related to the COVID-19 pandemic ranges from biological to bibliographic, from geographical to genetic and beyond. The structure of the raw data is highly complex, so converting it to meaningful insight requires data curation, integration, extraction and visualization, the global crowdsourcing of which provides both additional challenges and opportunities. Wikidata is an interdisciplinary, multilingual, open collaborative knowledge base of more than 90 million entities connected by well over a billion relationships. It acts as a web-scale platform for broader computer-supported cooperative work and linked open data, since it can be written to and queried in multiple ways in near real time by specialists, automated tools and the public. The main query language, SPARQL, is a semantic language used to retrieve and process information from databases saved in Resource Description Framework (RDF) format. Here, we introduce four aspects of Wikidata that enable it to serve as a knowledge base for general information on the COVID-19 pandemic: its flexible data model, its multilingual features, its alignment to multiple external databases, and its multidisciplinary organization. The rich knowledge graph created for COVID-19 in Wikidata can be visualized, explored, and analyzed for purposes like decision support as well as educational and scholarly research.

Language: English

Citations

32

Improving the completeness of public metadata accompanying omics studies
Anushka Rajesh, Yutong Chang, Malak Abedalthagafi

et al.

Genome biology, Journal Year: 2021, Volume and Issue: 22(1)

Published: April 15, 2021

Language: English

Citations

28

A standards perspective on genomic data reusability and reproducibility
Ishi Keenum, Scott A. Jackson, Emiley A. Eloe‐Fadrosh

et al.

Frontiers in Bioinformatics, Journal Year: 2025, Volume and Issue: 5

Published: March 10, 2025

Genomic and metagenomic sequence data provides an unprecedented ability to re-examine findings, offering a transformative potential for advancing research, developing computational tools, enhancing clinical applications, and fostering scientific collaboration. However, effective and ethical reuse of genomics data is hampered by numerous technical and social challenges. The International Microbiome and Multi’Omics Standards Alliance (IMMSA, https://www.microbialstandards.org/) and the Genomic Standards Consortium (GSC, https://gensc.org) hosted a 5-part seminar series “A Year of Data Reuse” in 2024 to explore challenges and opportunities of data reuse and reproducibility across disparate domains of the genomic sciences. Addressing these challenges will require a multifaceted approach, including common metadata reporting, clear communication, standardized protocols, improved data management infrastructure, ethical guidelines, and collaborative policies that prioritize transparency and accessibility. We offer strategies to enable responsible and technically feasible data reuse in recognition of these challenges, emphasizing the importance of cross-disciplinary efforts in the pursuit of open science and data-driven innovation.

Language: English

Citations

0

Integrating patient metadata and pathogen genomic data: advancing pandemic preparedness with a multi-parametric simulator

Bonjean Maxime, Jérôme Ambroise, Francisco Orchard

et al.

BMC Research Notes, Journal Year: 2025, Volume and Issue: 18(1)

Published: April 15, 2025

Stakeholder training is essential for handling unexpected crises swiftly, safely, and effectively. Functional and tabletop exercises simulate potential public health crises using complex scenarios with realistic data. These scenarios are designed by integrating datasets that represent populations exposed to a pandemic pathogen, combining pathogen genomic data generated through high-throughput sequencing (HTS) together with patient epidemiological, clinical, and demographic information. However, data sharing between EU member states faces challenges due to disparities in data collection practices, standardisation, legal frameworks, privacy, security regulations, and resource allocation. In the Horizon 2020 PANDEM-2 project, we developed a multi-parametric training tool that links pathogen genomic data and metadata, enabling training managers to enhance datasets and customise scenarios for more accurate simulations. The tool is available as an R package: https://github.com/maous1/Pandem2simulator and as a Shiny application: https://uclouvain-ctma.Shinyapps.io/Multi-parametricSimulator/, facilitating rapid scenario simulations. A structured training procedure, complete with video tutorials and exercises, was shown to be effective and user-friendly during a training session with twenty participants. In conclusion, this tool enhances pandemic preparedness by integrating pathogen genomic data and contextual metadata into training scenarios. The increased realism of these scenarios significantly improves emergency responder readiness, regardless of the biological incident's nature, whether natural, accidental, or intentional.

Language: English

Citations

0

Shortcomings of SARS-CoV-2 genomic metadata
Landen Gozashti, Russell Corbett‐Detig

BMC Research Notes, Journal Year: 2021, Volume and Issue: 14(1)

Published: May 17, 2021

Abstract Objective: The SARS-CoV-2 pandemic has prompted one of the most extensive and expeditious genomic sequencing efforts in history. Each viral genome is accompanied by a set of metadata which supplies important information such as the geographic origin of the sample, the age of the host, and the lab at which the sample was sequenced, and is integral to epidemiological efforts and public health direction. Here, we interrogate some shortcomings of metadata within the GISAID database to raise awareness of common errors and inconsistencies that may affect data-driven analyses and to provide possible avenues for resolutions. Results: Our analysis reveals a startling prevalence of spelling errors and inconsistent naming conventions, which together occur in an estimated ~9.8% and ~11.6% of “originating lab” and “submitting lab” metadata entries respectively. We also find numerous ambiguous entries which provide very little information about the actual source of a sample and could easily associate with multiple sources worldwide. Importantly, all of these issues can impair the ability and accuracy of association studies by deceptively causing a group of samples to identify with the same source when they truly originate from different sources, or vice versa.

Language: English

Citations

27