Database, Journal Year: 2025, Volume and Issue: 2025
Published: Jan. 1, 2025
Abstract
Advances in agricultural genetic, genomic, and breeding (GGB) technologies generate increasingly large and complex datasets that need to be adequately managed and shared. While several biological databases maintain and curate GGB data, not all scientists are aware of them and how they can be used to access and share data. In addition, there is the need to increase scientists’ awareness that appropriate data archiving and curation increases data longevity and value and bolsters scientific discoveries’ reproducibility and transparency. The AgBioData Education working group aims to address these unmet needs and developed a modular curriculum for educators teaching the basics of GGB databases and findable, accessible, interoperable, and reusable (FAIR) data principles to undergraduate and graduate students (https://www.agbiodata.org/). The present paper provides an overview of the topics covered within the curriculum, called the ‘AgBioData Curriculum for Ag FAIR Data,’ its audience and modalities, and how it will positively impact different stakeholders of the GGB database ecosystem. We hope the curriculum presented here will help scientists understand and support the use of GGB databases and aspects of FAIR data for improving our global food system.
Database URL: https://zenodo.org/records/14278084
Briefings in Bioinformatics, Journal Year: 2020, Volume and Issue: 22(2), P. 616 - 630
Published: Oct. 8, 2020
Various next generation sequencing (NGS) based strategies have been successfully used in the recent past for tracing origins and understanding the evolution of infectious agents, investigating the spread and transmission chains of outbreaks, as well as facilitating the development of effective and rapid molecular diagnostic tests and contributing to the hunt for treatments and vaccines. The ongoing COVID-19 pandemic poses one of the greatest global threats in modern history and has already caused severe social and economic costs. The development of efficient and rapid sequencing methods to reconstruct the genomic sequence of SARS-CoV-2, the etiological agent of COVID-19, has been fundamental for the design of diagnostic molecular tests and to devise effective measures and strategies to mitigate the diffusion of the pandemic. Diverse approaches and sequencing methods can, as testified by the number of available sequences, be applied to SARS-CoV-2 genomes. However, each technology and sequencing approach has its own advantages and limitations. In the current review, we will provide a brief, but hopefully comprehensive, account of currently available platforms and methodological approaches for the sequencing of SARS-CoV-2 genomes. We also present an outline of current repositories and databases that provide access to SARS-CoV-2 genomic data and associated metadata. Finally, we offer general advice and guidelines for the appropriate sharing and deposition of SARS-CoV-2 data and metadata, and suggest that more efficient and standardized integration of current and future SARS-CoV-2-related data would greatly facilitate the struggle against this new pathogen. We hope that our 'vademecum' for the production and handling of SARS-CoV-2-related sequencing data will contribute to this objective.
Nucleic Acids Research, Journal Year: 2020, Volume and Issue: 49(D1), P. D723 - D733
Published: Oct. 19, 2020
Abstract
The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) is a manually curated, daily updated collection of genome projects and their metadata accumulated from around the world. The current version of the database includes over 1.17 million entries organized broadly into Studies (45 770), Organisms (387 382) or Biosamples (101 207), Sequencing Projects (355 364) and Analysis Projects (283 481). These four levels contain over 600 metadata fields, of which 76 are controlled vocabulary (CV) tables containing 3873 terms. GOLD provides an interactive web user interface for browsing and searching by a wide range of project and metadata fields. Users can enter details about their own projects in GOLD, which acts as a gatekeeper to ensure that metadata is accurately documented before submitting sequence information to the Integrated Microbial Genomes (IMG) system for analysis. In order to maintain a reference dataset for use by members of the scientific community, GOLD also imports projects from public repositories such as GenBank and SRA. The current status of the database, along with recent updates and improvements, are described in this manuscript.
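The entry counts quoted in this abstract can be checked against the "over 1.17 million" total with a few lines of arithmetic. The sketch below is only a consistency check on the published figures; the dictionary keys are labels chosen here for illustration and are not GOLD field names.

```python
# Entry counts as quoted in the abstract (keys are illustrative labels only).
gold_counts = {
    "Studies": 45_770,
    "Organisms": 387_382,
    "Biosamples": 101_207,
    "Sequencing Projects": 355_364,
    "Analysis Projects": 283_481,
}

# Sum all categories to compare with the stated "over 1.17 million entries".
total_entries = sum(gold_counts.values())
print(f"{total_entries:,}")  # 1,173,204

# The total should exceed 1.17 million if the quoted figures are consistent.
is_consistent = total_entries > 1_170_000
print(is_consistent)  # True
```

Note that the abstract groups Organisms and Biosamples into a single level, which is why five counts correspond to four levels.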
Data Science, Journal Year: 2022, Volume and Issue: 5(2), P. 97 - 138
Published: Jan. 4, 2022
An increasing number of researchers support reproducibility by including pointers to and descriptions of datasets, software and methods in their publications. However, scientific articles may be ambiguous, incomplete and difficult to process by automated systems. In this paper we introduce RO-Crate, an open, community-driven, and lightweight approach to packaging research artefacts along with their metadata in a machine readable manner. RO-Crate is based on Schema.org annotations in JSON-LD, aiming to establish best practices to formally describe metadata in an accessible and practical way for their use in a wide variety of situations. An RO-Crate is a structured archive of all the items that contributed to a research outcome, including their identifiers, provenance, relations and annotations. As a general purpose packaging approach for data and their metadata, RO-Crate is used across multiple areas, including bioinformatics, digital humanities and regulatory sciences. By applying "just enough" Linked Data standards, RO-Crate simplifies the process of making research outputs FAIR while also enhancing research reproducibility. An RO-Crate for this article is available at https://w3id.org/ro/doi/10.5281/zenodo.5146227
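To make the idea of Schema.org annotations in JSON-LD more concrete, the following is a minimal sketch of what a crate's metadata document might look like when built by hand in Python. It follows the RO-Crate 1.1 conventions as we understand them (a metadata descriptor, a root dataset, and one entity per file); the crate name, date, licence, and file paths are hypothetical placeholders, and `build_crate_metadata` is a helper invented here, not part of any RO-Crate library. The result should be checked against the RO-Crate specification before real use.

```python
import json


def build_crate_metadata(name, description, date_published, license_url, files):
    """Return a minimal RO-Crate-style metadata document as a dict.

    `files` is a list of (relative_path, human_readable_name) pairs.
    All values passed in below are hypothetical examples.
    """
    graph = [
        {
            # Metadata descriptor: identifies this file and the root dataset.
            "@id": "ro-crate-metadata.json",
            "@type": "CreativeWork",
            "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
            "about": {"@id": "./"},
        },
        {
            # Root data entity: the crate itself, described as a Dataset.
            "@id": "./",
            "@type": "Dataset",
            "name": name,
            "description": description,
            "datePublished": date_published,
            "license": {"@id": license_url},
            "hasPart": [{"@id": path} for path, _ in files],
        },
    ]
    # One data entity per packaged file.
    for path, label in files:
        graph.append({"@id": path, "@type": "File", "name": label})
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": graph,
    }


crate = build_crate_metadata(
    name="Example analysis crate",
    description="Hypothetical crate bundling a data table and a script.",
    date_published="2022-01-04",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    files=[
        ("data/samples.csv", "Sample table"),
        ("scripts/analyse.py", "Analysis script"),
    ],
)

# Serialise as it would be written to ro-crate-metadata.json in the crate root.
crate_json = json.dumps(crate, indent=2)
print(crate_json)
```

Writing `crate_json` to a file named `ro-crate-metadata.json` alongside the listed files would give a directory that tools can, in principle, interpret as a crate; dedicated libraries handle validation and richer provenance.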
Frontiers in Medicine, Journal Year: 2022, Volume and Issue: 8
Published: Jan. 25, 2022
A main goal of Precision Medicine is to incorporate and integrate the vast corpora held in different databases about the molecular and environmental origins of disease into analytic frameworks, allowing the development of individualized, context-dependent diagnostics and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at prediction of personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize, and integrate large datasets combining structured and unstructured formats. This needs to be done while constrained by different levels of confidentiality, ideally doing so within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful approaches to medical data analytics under the currently demanding conditions of performance in personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we will review some of these constraints and discuss possible avenues to overcome current challenges.
Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(D1), P. D957 - D963
Published: Oct. 16, 2022
Abstract
The Genomes OnLine Database (GOLD) (https://gold.jgi.doe.gov/) at the Department of Energy Joint Genome Institute (DOE-JGI) continues to maintain its role as one of the flagship genomic metadata repositories of the world. The ever-increasing number of projects and metadata are freely available to the user community world-wide. GOLD’s metadata is consumed by scientists and remains an important source for large-scale comparative genomics analysis initiatives. Encouraged by this active user engagement and growth, GOLD has continued to add new components and capabilities. The new features, such as a public Application Programming Interface (API) and Ecosystem landing page, as well as the growth of different entities in the current GOLD v.9 edition, are described in detail in this manuscript.
Briefings in Bioinformatics, Journal Year: 2020, Volume and Issue: 22(2), P. 642 - 663
Published: Oct. 30, 2020
Abstract
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories.
Conservation Biology, Journal Year: 2023, Volume and Issue: 37(4)
Published: Jan. 27, 2023
Abstract
Genetic diversity within species represents a fundamental yet underappreciated level of biodiversity. Because genetic diversity can indicate species resilience to changing climate, its measurement is relevant to many national and global conservation policy targets. Many studies produce large amounts of genome‐scale genetic diversity data for wild populations, but most (87%) do not include the associated spatial and temporal metadata necessary for them to be reused in monitoring programs or for acknowledging the sovereignty of nations or Indigenous peoples. We undertook a distributed datathon to quantify the availability of these missing metadata and to test the hypothesis that their availability decays with time. We also worked to remediate missing metadata by extracting them from associated published papers, online repositories, and direct communication with authors. Starting with 848 candidate genomic data sets (reduced representation and whole genome) from the International Nucleotide Sequence Database Collaboration, we determined that 561 contained mostly samples from wild populations. We successfully restored spatiotemporal metadata for 78% of these 561 data sets (n = 440 data sets with data on 45,105 individuals from 762 species in 17 phyla). Examining papers and online repositories was much more fruitful than contacting 351 authors, who replied to our email requests 45% of the time. Overall, 23% of our email queries to authors unearthed useful metadata. The probability of retrieving spatiotemporal metadata declined significantly as age of the data set increased. There was a 13.5% yearly decrease in metadata associated with published papers or online repositories and up to a 22% yearly decrease in metadata that were only available from authors. This rapid decay in metadata availability, mirrored in studies of other types of biological data, should motivate swift updates to data‐sharing policies and researcher practices to ensure that the valuable context provided by metadata is not lost to conservation science forever.
Journal of Biomedical Semantics, Journal Year: 2021, Volume and Issue: 12(1)
Published: July 18, 2021
Abstract
Background
Effective response to public health emergencies, such as we are now experiencing with COVID-19, requires data sharing across multiple disciplines and data systems. Ontologies offer a powerful data sharing tool, and this holds especially for those ontologies built on the design principles of the Open Biomedical Ontologies Foundry. These principles are exemplified by the Infectious Disease Ontology (IDO), a suite of interoperable ontology modules aiming to provide coverage of all aspects of the infectious disease domain. At its center is IDO Core, a disease- and pathogen-neutral ontology covering just those types of entities and relations that are relevant to infectious diseases generally. IDO Core is extended by disease- and pathogen-specific ontology modules.
Results
To assist the integration and analysis of COVID-19 data, and viral infectious disease data more generally, we have recently developed three new IDO extensions: IDO Virus (VIDO); the Coronavirus Infectious Disease Ontology (CIDO); and an extension of CIDO focusing on COVID-19 (IDO-COVID-19). Reflecting the fact that viruses lack cellular parts, we have introduced into IDO Core the term acellular structure to cover viruses and other acellular entities studied by virologists. We now distinguish between infectious agents (organisms with an infectious disposition) and infectious structures (acellular structures with an infectious disposition). This in turn has led to various updates and refinements of IDO Core’s content. We believe that our work on VIDO, CIDO, and IDO-COVID-19 can serve as a model for yielding greater conformance with ontology building best practices.
Conclusions
IDO provides a simple recipe for building new pathogen-specific ontologies in a way that allows data about novel diseases to be easily compared, along multiple dimensions, with data represented by existing disease ontologies. The IDO strategy, moreover, supports ontology coordination, providing a powerful method of data integration and sharing that allows physicians, researchers, and public health organizations to respond rapidly and efficiently to current and future public health crises.