Microbial Genomics,
Journal Year:
2024,
Volume and Issue:
10(2)
Published: Feb. 20, 2024
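Viral metagenomics has fuelled a rapid change in our understanding of global viral diversity and ecology. Long-read sequencing and hybrid assembly approaches that combine long- and short-read technologies are now being widely implemented in bacterial genomics and metagenomics. However, the use of long-read sequencing to investigate viral communities is still in its infancy. While Nanopore and PacBio technologies have been applied to viral metagenomics, it is not known to what extent different technologies will impact the reconstruction of the viral community. Thus, we constructed a mock bacteriophage community of previously sequenced phage genomes, sequenced them using Illumina, Nanopore and PacBio sequencing technologies, and tested a number of different assembly approaches. When using a single sequencing technology, Illumina assemblies were the best at recovering phage genomes. Nanopore- and PacBio-only assemblies performed poorly in comparison to Illumina in both genome recovery and error rates, which both varied with the assembler used. The best Nanopore assembly had errors that manifested as SNPs and INDELs at frequencies 41 and 157 % higher than found in Illumina-only assemblies, respectively. The best PacBio-only assemblies had errors that manifested as SNPs and INDELs at frequencies 12 and 78 % higher than found in Illumina-only assemblies, respectively. Despite high-read coverage, long-read-only assemblies recovered a maximum of one complete genome from any assembly, unless reads were down-sampled prior to assembly. Overall, the best approach was assembly by a combination of Illumina and Nanopore reads, which reduced error rates to levels comparable with short-read-only assemblies. When using a single technology, Illumina only was the best approach. The differences in genome recovery and error rates between technology and assembler had downstream impacts on gene prediction, viral prediction, and subsequent estimates of diversity within a sample. These findings will provide a starting point for others in the choice of reads and assembly algorithms for the analysis of viromes.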
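The abstract above reports that long-read-only assemblies recovered more than one complete genome only when reads were down-sampled prior to assembly. The abstract does not say which tool or sampling scheme was used, so the following is only a minimal sketch of one common way to down-sample, uniform random sampling with a fixed seed (reservoir sampling). The `Read` type, the function name `downsample_reads`, and the mock reads are hypothetical and chosen for illustration.

```python
import random
from typing import Iterable, List, Tuple

# Simplified stand-in for a sequencing read: (read_id, sequence).
# A real pipeline would parse FASTQ records instead (assumption).
Read = Tuple[str, str]


def downsample_reads(reads: Iterable[Read], target: int, seed: int = 0) -> List[Read]:
    """Return up to `target` reads chosen uniformly at random (reservoir sampling).

    The input is streamed once, so the full read set never has to be
    held in memory apart from the reservoir itself.
    """
    if target < 0:
        raise ValueError("target must be non-negative")
    rng = random.Random(seed)  # fixed seed makes the subset reproducible
    reservoir: List[Read] = []
    for i, read in enumerate(reads):
        if i < target:
            # Fill the reservoir with the first `target` reads.
            reservoir.append(read)
        else:
            # Replace an existing entry with probability target / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < target:
                reservoir[j] = read
    return reservoir


if __name__ == "__main__":
    mock_reads = [(f"read_{i}", "ACGT" * 10) for i in range(1000)]
    subset = downsample_reads(mock_reads, target=100, seed=42)
    print(len(subset))  # prints 100
```

Sampling by read count is the simplest option. Sampling to a target depth of coverage per genome would need read lengths and genome sizes, which this sketch does not model.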
PHAGE,
Journal Year:
2021,
Volume and Issue:
2(4), P. 214 - 223
Published: Oct. 6, 2021
Background: With advances in sequencing technology and decreasing costs, the number of phage genomes that have been sequenced has increased markedly in the past decade. Materials and Methods: We developed an automated retrieval and analysis system for phage genomes (https://github.com/RyanCook94/inphared) to produce the INfrastructure for a PHAge REference Database (INPHARED) of phage genomes and associated metadata. Results: As of January 2021, 14,244 complete phage genomes have been sequenced. The INPHARED data set is dominated by phages that infect a small number of bacterial genera, with 75% of phages isolated on only 30 bacterial genera. There is further bias, with significantly more lytic phage genomes (∼70%) than temperate (∼30%) within our database. Collectively, this results in ∼54% of temperate phage genomes originating from just three host genera. With much debate on the carriage of antibiotic resistance genes and their potential safety in phage therapy, we searched for putative antibiotic resistance genes. Frequency of antibiotic resistance gene carriage was found to be higher in temperate phages than in lytic phages and again varied with host. Conclusions: Given the bias of currently sequenced phage genomes, we suggest that to fully understand phage diversity, efforts should be made to isolate and sequence a larger number of phages, in particular temperate phages, from a greater diversity of hosts.
Nucleic Acids Research,
Journal Year:
2022,
Volume and Issue:
51(D1), P. D733 - D743
Published: Nov. 18, 2022
Viruses are widely recognized as critical members of all microbiomes. Metagenomics enables large-scale exploration of the global virosphere, progressively revealing the extensive genomic diversity of viruses on Earth and highlighting the myriad ways by which viruses impact biological processes. IMG/VR provides access to the largest collection of viral sequences obtained from (meta)genomes, along with functional annotation and rich metadata. A web interface enables users to efficiently browse and search viruses based on genome features and/or sequence similarity. Here, we present the fourth version of IMG/VR, composed of >15 million virus genomes and genome fragments, a ≈6-fold increase in size compared to the previous version. These are clustered into 8.7 million viral operational taxonomic units, including 231 408 with at least one high-quality representative. Viral sequences are now systematically identified from genomes, metagenomes, and metatranscriptomes using a new detection approach (geNomad), and IMG standard annotation is complemented with genome quality estimation using CheckV, taxonomic classification reflecting the latest taxonomic standards, and microbial host taxonomy prediction. IMG/VR v4 is available at https://img.jgi.doe.gov/vr, and the underlying data are available to download at https://genome.jgi.doe.gov/portal/IMG_VR.
Cell,
Journal Year:
2022,
Volume and Issue:
185(21), P. 4023 - 4037.e18
Published: Sept. 28, 2022
High-throughput RNA sequencing offers broad opportunities to explore the Earth RNA virome. Mining 5,150 diverse metatranscriptomes uncovered >2.5 million RNA virus contigs. Analysis of >330,000 RNA-dependent RNA polymerases (RdRPs) shows that this expansion corresponds to a 5-fold increase of the known RNA virus diversity. Gene content analysis revealed multiple protein domains previously not found in RNA viruses and implicated in virus-host interactions. Extended RdRP phylogeny supports the monophyly of the five established phyla and reveals two putative additional bacteriophage phyla and numerous putative additional classes and orders. The dramatically expanded phylum Lenarviricota, consisting of bacterial and related eukaryotic viruses, now accounts for a third of the RNA virome. Identification of CRISPR spacer matches and bacteriolytic proteins suggests that subsets of picobirnaviruses and partitiviruses, previously associated with eukaryotes, infect prokaryotic hosts.
PLoS Biology,
Journal Year:
2023,
Volume and Issue:
21(4), P. e3002083 - e3002083
Published: April 21, 2023
The extraordinary diversity of viruses infecting bacteria and archaea is now primarily studied through metagenomics. While metagenomes enable high-throughput exploration of the viral sequence space, metagenome-derived sequences lack key information compared to isolated viruses, in particular host association. Different computational approaches are available to predict the host(s) of uncultivated viruses based on their genome sequences, but thus far individual approaches are limited either in precision or in recall, i.e., for a number of viruses they yield erroneous predictions or no prediction at all. Here, we describe iPHoP, a two-step framework that integrates multiple methods to reliably predict host taxonomy at the genus rank for a broad range of viruses infecting bacteria and archaea, while retaining a low false discovery rate. Based on a large dataset of metagenome-derived virus genomes from the IMG/VR database, we illustrate how iPHoP can provide extensive host prediction and guide further characterization of uncultivated viruses.
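The abstract above frames host prediction in terms of precision, recall, and false discovery rate: a tool can be wrong (an erroneous prediction) or silent (no prediction at all). As a minimal sketch of how these three quantities relate, assuming the usual benchmarking definitions rather than iPHoP's exact evaluation protocol, the snippet below computes them from three counts. The class name, field names, and example counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostPredictionCounts:
    """Outcome counts for a benchmark set of viruses with known hosts."""
    correct: int      # predicted host genus matches the known host
    incorrect: int    # a host was predicted, but it is the wrong genus
    unpredicted: int  # the tool returned no prediction for this virus


def precision(c: HostPredictionCounts) -> float:
    """Fraction of emitted predictions that are correct."""
    made = c.correct + c.incorrect
    return c.correct / made if made else 0.0


def recall(c: HostPredictionCounts) -> float:
    """Fraction of all benchmark viruses that receive a correct prediction."""
    total = c.correct + c.incorrect + c.unpredicted
    return c.correct / total if total else 0.0


def false_discovery_rate(c: HostPredictionCounts) -> float:
    """Fraction of emitted predictions that are wrong (1 - precision)."""
    made = c.correct + c.incorrect
    return c.incorrect / made if made else 0.0


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the paper.
    counts = HostPredictionCounts(correct=80, incorrect=5, unpredicted=15)
    print(f"precision={precision(counts):.3f}")
    print(f"recall={recall(counts):.3f}")
    print(f"FDR={false_discovery_rate(counts):.3f}")
```

Under these definitions, a conservative tool that stays silent when unsure keeps the false discovery rate low at the cost of recall, which is the trade-off the abstract describes.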
Nature Biotechnology,
Journal Year:
2023,
Volume and Issue:
42(8), P. 1303 - 1312
Published: Sept. 21, 2023
Identifying and characterizing mobile genetic elements in sequencing data is essential for understanding their diversity, ecology, biotechnological applications and impact on public health. Here we introduce geNomad, a classification and annotation framework that combines information from gene content and a deep neural network to identify sequences of plasmids and viruses. geNomad uses a dataset of more than 200,000 marker protein profiles to provide functional gene annotation and taxonomic assignment of viral genomes. Using a conditional random field model, geNomad also detects proviruses integrated into host genomes with high precision. In benchmarks, geNomad achieved high classification performance for diverse plasmids and viruses (Matthews correlation coefficient of 77.8% and 95.3%, respectively), substantially outperforming other tools. Leveraging geNomad's speed and scalability, we processed over 2.7 trillion base pairs of sequencing data, leading to the discovery of millions of viruses and plasmids that are available through the IMG/VR and IMG/PR databases. geNomad is available at https://portal.nersc.gov/genomad.
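The benchmark figures above are Matthews correlation coefficients (MCC) expressed as percentages. For readers unfamiliar with the metric, the following is a minimal sketch of the standard binary MCC formula computed from a confusion matrix. It is not taken from geNomad's code, and the function name and example counts are hypothetical.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect classification).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Conventional fallback when any row or column of the matrix is empty.
        return 0.0
    return numerator / denominator


if __name__ == "__main__":
    # Made-up counts for illustration only; not results from the paper.
    mcc = matthews_corrcoef(tp=90, tn=95, fp=5, fn=10)
    print(f"MCC = {mcc:.3f} ({mcc * 100:.1f}%)")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when one class (for example, non-viral sequence) heavily outnumbers the other.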
PeerJ,
Journal Year:
2021,
Volume and Issue:
9, P. e11396 - e11396
Published: May 6, 2021
Bacteriophages are broadly classified into two distinct lifestyles: temperate and virulent. Temperate phages are capable of a latent phase of infection within a host cell (lysogenic cycle), whereas virulent phages directly replicate and lyse host cells upon infection (lytic cycle). Accurate lifestyle identification is critical for determining the role of individual phage species within ecosystems and their effect on host evolution. Here, we present BACPHLIP, a BACterioPHage LIfestyle Predictor. BACPHLIP detects the presence of a set of conserved protein domains within an input genome and uses this data to predict lifestyle via a Random Forest classifier that was trained on a dataset of 634 phage genomes. On an independent test set of 423 phages, BACPHLIP has an accuracy of 98%, greatly exceeding that of the previously existing tools (79%). BACPHLIP is freely available on GitHub (https://github.com/adamhockenberry/bacphlip) and the code used to build and test the model is provided in a separate repository (https://github.com/adamhockenberry/bacphlip-model-dev) for users wishing to interrogate and re-train the underlying classification model.
The ISME Journal,
Journal Year:
2021,
Volume and Issue:
15(8), P. 2366 - 2378
Published: March 1, 2021
Abstract
In marine ecosystems, viruses exert control on the composition and metabolism of microbial communities, influencing overall biogeochemical cycling. Deep sea sediments associated with cold seeps are known to host taxonomically diverse microbial communities, but little is known about viruses infecting these microorganisms. Here, we probed metagenomes from seven geographically diverse cold seeps across global oceans to assess viral diversity, virus–host interaction, and virus-encoded auxiliary metabolic genes (AMGs). Gene-sharing network comparisons with viruses inhabiting other ecosystems reveal that cold seep sediments harbour considerable unexplored viral diversity. Most cold seep viruses display high degrees of endemism, with seep fluid flux being one of the main drivers of viral community composition. In silico predictions linked 14.2% of the viruses to microbial host populations, with many belonging to poorly understood candidate bacterial and archaeal phyla. Lysis was predicted to be a predominant viral lifestyle based on lineage-specific virus/host abundance ratios. Metabolic predictions of prokaryotic host genomes and viral AMGs suggest that viruses influence microbial hydrocarbon biodegradation at cold seeps, as well as other carbon, sulfur and nitrogen cycling via virus-induced mortality and/or metabolic augmentation. Overall, these findings reveal the global diversity and biogeography of cold seep viruses and indicate how viruses may manipulate seep microbial ecology and biogeochemistry.