BMC Bioinformatics, Journal Year: 2023, Volume and Issue: 24(1), Published: Dec. 14, 2023
Abstract
Background
The standardization of biological data using unique identifiers is vital for seamless data integration, comprehensive interpretation, and reproducibility of research findings, contributing to advancements in bioinformatics and systems biology. Despite being widely accepted as a universal identifier, scientific names of species have inherent limitations, including a lack of stability, uniqueness, and convertibility. These limitations hinder their effective use as identifiers in databases, particularly natural product (NP) occurrence databases, and pose a substantial obstacle to utilizing this valuable data in large-scale applications.
Result
To address these challenges and facilitate high-throughput analysis of biological data involving scientific names, we developed PhyloSophos, a Python package that considers the properties of scientific names and taxonomic systems to accurately map name inputs to entries within a chosen reference database. We illustrate the importance of assessing multiple taxonomic databases and considering syntax-based pre-processing, using NP occurrence databases as an example, with the ultimate goal of integrating heterogeneous information into a single, unified dataset.
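As an illustration of what syntax-based pre-processing followed by lookup in more than one reference database might look like, here is a minimal sketch. It is not the PhyloSophos API: the function names `normalize_name` and `map_name`, the table names `db_a` and `db_b`, and every identifier value are hypothetical placeholders. Keeping only the genus and species epithet is a deliberate simplification that discards infraspecific ranks.

```python
import re

# Hypothetical reference tables: normalized name -> identifier.
# The identifiers are illustrative placeholders, not real database records.
REFERENCE_DBS = {
    "db_a": {"panax ginseng": "A:0001", "glycyrrhiza uralensis": "A:0002"},
    "db_b": {"panax ginseng": "B:9001"},
}

# Abbreviated qualifiers and rank markers that often appear inside name strings.
QUALIFIERS = re.compile(r"\b(cf|aff|sp|spp|var|subsp|ssp|f)\.", re.IGNORECASE)


def normalize_name(raw: str) -> str:
    """Apply simple syntax-based clean-up to a scientific name string."""
    name = raw.replace("_", " ")                # underscores used as separators
    name = re.sub(r"\(.*?\)", " ", name)        # drop parenthesised text
    name = QUALIFIERS.sub(" ", name)            # drop qualifiers such as "cf."
    name = re.sub(r"[^A-Za-z\s-]", " ", name)   # drop digits and punctuation
    tokens = name.split()
    # Assumption: the first two tokens are the genus and the species epithet.
    return " ".join(tokens[:2]).lower()


def map_name(raw: str) -> dict:
    """Return the identifier found in each reference table, if any."""
    key = normalize_name(raw)
    return {db: table[key] for db, table in REFERENCE_DBS.items() if key in table}


if __name__ == "__main__":
    for query in ["Panax ginseng C.A.Mey.", "Glycyrrhiza_uralensis Fisch.", "Panax cf. ginseng"]:
        print(query, "->", map_name(query))
```

Comparing the per-database results shows why checking several references matters: in this toy data the second name resolves only in `db_a`.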
Conclusions
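We anticipate that PhyloSophos will significantly aid the systematic processing of poorly digitized and curated biological data, such as biodiversity information and ethnopharmacological resources, enabling full-scale bioinformatics analysis using these valuable data resources.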
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown, Published: March 20, 2023
Abstract
Summary
The nature of taxonomic science and the scientific nomenclature system makes it difficult to use scientific names as identifiers without running into complications. To facilitate high-throughput analysis of biological data involving scientific names, we designed PhyloSophos, a Python package that takes into account the properties of scientific names and taxonomic systems to map name inputs to entries within the reference database of choice. We present three case studies that demonstrate how our implementations, including rule-based pre-processing and recursive mapping, could improve mapping performance and information availability. We expect PhyloSophos to help with the systematic processing of poorly digitized and curated biological data, such as biodiversity information and ethnopharmacological resources, thus enabling full-scale bioinformatics analysis using these data.
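The abstract does not define recursive mapping. The sketch below assumes one plausible reading: when a name is absent from the chosen reference, follow synonym links gathered from other sources and retry until an entry in the reference is reached. The tables `REFERENCE` and `SYNONYMS`, the function `resolve`, and all names and identifiers are hypothetical and do not come from the package.

```python
from typing import Optional, Set

# Hypothetical data: names present in the chosen reference, and synonym links
# collected from other sources. All entries are illustrative placeholders.
REFERENCE = {"accepted name one": "REF:1"}
SYNONYMS = {
    "old name": "intermediate name",
    "intermediate name": "accepted name one",
    "loop a": "loop b",
    "loop b": "loop a",
}


def resolve(name: str, seen: Optional[Set[str]] = None) -> Optional[str]:
    """Follow synonym links recursively until the reference contains the name."""
    seen = set() if seen is None else seen
    if name in REFERENCE:
        return REFERENCE[name]
    # Stop on circular synonym chains or when no further link is known.
    if name in seen or name not in SYNONYMS:
        return None
    seen.add(name)
    return resolve(SYNONYMS[name], seen)


if __name__ == "__main__":
    for query in ["old name", "loop a", "missing name"]:
        print(query, "->", resolve(query))
```

Tracking visited names is the one design choice that matters here: synonym tables from different sources can point at each other, and without the `seen` set a circular chain would recurse indefinitely.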
Availability and implementation
PhyloSophos is available on GitHub at https://github.com/mhcho4096/phylosophos
Supplementary information
Supplementary data are available at Bioinformatics online.