Genome Research,
Journal Year:
2023,
Volume and Issue:
33(9), P. 1622 - 1637
Published: Aug. 24, 2023
Bacterial genomes differ in both gene content and sequence mutations, which underlie extensive phenotypic diversity, including variation in susceptibility to antimicrobials or vaccine-induced immunity. To identify and quantify important variants, all genes within a population must be predicted, functionally annotated, and clustered, representing the “pangenome.” Despite the volume of genome data available, gene prediction and annotation are currently conducted in isolation on individual genomes, which is computationally inefficient and frequently inconsistent across genomes. Here, we introduce the open-source software graph-gene-caller (ggCaller). ggCaller combines gene prediction, functional annotation, and clustering into a single workflow using population-wide de Bruijn graphs, removing redundancy in gene annotation and resulting in more accurate gene predictions and orthologue clustering. We applied ggCaller to simulated and real-world bacterial data sets containing hundreds or thousands of genomes, comparing it to current state-of-the-art tools. ggCaller has considerable speed-ups with equivalent or greater accuracy, particularly with data sets containing complex sources of error, such as assembly contamination or fragmentation. ggCaller is also an important extension to bacterial genome-wide association studies, enabling querying of annotated graphs for functional analyses. We highlight this application by functionally annotating DNA sequences with significant associations to tetracycline and macrolide resistance in Streptococcus pneumoniae, identifying key resistance determinants that were missed when using only a single reference genome. ggCaller is a novel bacterial genome analysis tool with applications in bacterial evolution and epidemiology.
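As an illustration of the population-wide de Bruijn graph idea mentioned in this abstract, the following is a minimal sketch only. It is not ggCaller's implementation; the function name `build_de_bruijn`, the toy sequences, and the choice of k are all assumptions made for the example. It shows how k-mers shared between genomes collapse into shared nodes, so that shared sequence is stored once.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Build a toy de Bruijn graph from several sequences.

    Nodes are (k-1)-mers; each k-mer contributes one edge from its
    prefix to its suffix. Hypothetical helper, for illustration only.
    """
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Two toy "genomes" that differ only in their final base.
genomes = ["ATGGCGT", "ATGGCGA"]
graph = build_de_bruijn(genomes, k=4)

# Shared sequence is represented once; the variant shows up as a branch.
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

In this toy case the two sequences share every node up to `GCG`, which then branches to two successors. Real tools work on compacted, coloured graphs over both strands, which this sketch does not attempt.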
Wellcome Open Research,
Journal Year:
2018,
Volume and Issue:
3, P. 33 - 33
Published: May 29, 2018
Background: Phylogenetic reconstruction is a necessary first step in many analyses which use whole genome sequence data from bacterial populations. There are many available methods to infer phylogenies, and these have various advantages and disadvantages, but few unbiased comparisons of the range of approaches have been made.
Methods: We simulated data from a defined 'true tree' using a realistic evolutionary model. We built phylogenies from this data using a range of methods, and compared reconstructed trees to the true tree using two measures, noting the computational time needed for different phylogenetic reconstructions. We also used real data from Streptococcus pneumoniae alignments to compare individual core gene trees to a core genome tree.
Results: We found that, as expected, maximum likelihood trees from good quality alignments were the most accurate, but also the most computationally intensive. Using less accurate phylogenetic reconstruction methods, we were able to obtain results of comparable accuracy; we found that approximate results can rapidly be obtained using genetic distance based methods. In real data we found that highly conserved core genes, such as those involved in translation, gave an inaccurate tree topology, whereas genes involved in recombination events gave inaccurate branch lengths. We also show a tree-of-trees, relating the results of different phylogenetic reconstructions to each other.
Conclusions: We recommend three approaches, depending on requirements for accuracy and computational time. For the most accurate tree, use of either RAxML or IQ-TREE with an alignment of variable sites produced by mapping to a reference genome was best. Quicker approaches that do not perform full maximum likelihood optimisation may be useful for many analyses requiring a phylogeny, as generating a high quality input alignment is likely to be the major limiting factor of accurate tree topology. We have publicly released our simulated data and code to enable further comparisons.
Annual Review of Microbiology,
Journal Year:
2018,
Volume and Issue:
72(1), P. 521 - 549
Published: Sept. 8, 2018
Streptococcus pneumoniae (the pneumococcus) is a nasopharyngeal commensal and respiratory pathogen. Most isolates express a capsule, the species-wide diversity of which has been immunologically classified into ∼100 serotypes. Capsule polysaccharides have been combined into multivalent vaccines widely used in adults, but the T cell independence of the antibody response means they are not protective in infants. Polysaccharide conjugate vaccines (PCVs) trigger a T cell–dependent response through attaching a carrier protein to capsular polysaccharides. The immune response stimulated by PCVs in infants inhibits carriage of vaccine serotypes (VTs), resulting in population-wide herd immunity. These were replaced by non-VTs. Nevertheless, PCVs drove reductions in infant pneumococcal disease, due to the lower mean invasiveness of the postvaccination bacterial population; age-varying serotype invasiveness resulted in a smaller reduction in adult disease. Alternative vaccines being tested in trials are designed to provide protection by stimulating innate and cellular immune responses, alongside antibodies to conserved antigens.
Microbial Genomics,
Journal Year:
2020,
Volume and Issue:
6(5)
Published: May 1, 2020
Knowledge of pneumococcal lineages, their geographic distribution and antibiotic resistance patterns, can give insights into global pneumococcal disease. We provide interactive bioinformatic outputs to explore such topics, aiming to increase dissemination of genomic insights to the wider community, without the need for specialist training. We prepared 12 country-specific phylogenetic snapshots, and international snapshots of 73 common Global Pneumococcal Sequence Clusters (GPSCs) previously defined using PopPUNK, and present them in Microreact. Gene presence and absence defined using Roary, and recombination profiles derived from Gubbins, are presented in Phandango for each GPSC. Temporal signal was assessed for each GPSC using BactDating. We provide examples of how such resources can be used. In our example use of a country-specific phylogenetic snapshot we determined that serotype 14 was observed in nine unrelated genetic backgrounds in South Africa. The international phylogenetic snapshot of GPSC9, in which most serotype 14 isolates from South Africa were observed, highlights that there were three independent sub-clusters represented by South African serotype 14 isolates. We estimated from the GPSC9-dated tree that the sub-clusters were each established in South Africa during the 1980s. We show how recombination plots allowed the identification of a 20 kb recombination spanning the capsular polysaccharide locus within GPSC97. This was consistent with a switch from serotype 6A to 19A estimated to have occurred in the 1990s from the GPSC97-dated tree. Plots of gene presence/absence of resistance genes (tet, erm and cat) across the GPSC23 phylogeny were consistent with acquisition of a composite transposon. We estimated from the GPSC23-dated tree that the acquisition occurred between 1953 and 1975. Finally, we demonstrate the assignment of GPSC31 to 17 externally generated pneumococcal serotype 1 assemblies from Utah via Pathogenwatch. Most of the Utah isolates clustered in a USA-specific clade with the most recent common ancestor estimated between 1958 and 1981. We have provided resources that can be used to explore data, test hypotheses and generate new hypotheses. The accessible assignment of GPSCs allows others to contextualize their own collections beyond the data presented here.
Microbial Genomics,
Journal Year:
2023,
Volume and Issue:
9(5)
Published: May 25, 2023
Horizontal gene transfer (HGT) and the resulting patterns of gene gain and loss are a fundamental part of bacterial evolution. Investigating these patterns can help us to understand the role of selection in the evolution of bacterial pangenomes and how bacteria adapt to a new niche. Predicting the presence or absence of genes can be a highly error-prone process that can confound efforts to understand the dynamics of horizontal gene transfer. This review discusses both the challenges in accurately constructing a pangenome and the potential consequences errors can have on downstream analyses. We hope that by summarizing these issues researchers will be able to avoid potential pitfalls, leading to improved bacterial pangenome analyses.