bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2024,
Volume and Issue:
unknown
Published: Feb. 8, 2024
Abstract
Public health researchers and practitioners commonly infer phylogenies from viral genome sequences to understand transmission dynamics and identify clusters of genetically-related samples. However, viruses that reassort or recombine violate phylogenetic assumptions and require more sophisticated methods. Even when phylogenies are appropriate, they can be unnecessary or difficult to interpret without specialty knowledge. For example, pairwise distances between sequences can be enough to identify clusters of related samples or assign new samples to existing phylogenetic clusters. In this work, we tested whether dimensionality reduction methods could capture known genetic groups within two human pathogenic viruses that cause substantial human morbidity and mortality and frequently reassort or recombine, respectively: seasonal influenza A/H3N2 and SARS-CoV-2. We applied principal component analysis (PCA), multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP) to sequences with well-defined phylogenetic clades and either reassortment (H3N2) or recombination (SARS-CoV-2). For each low-dimensional embedding of sequences, we calculated the correlation between pairwise genetic and Euclidean distances in the embedding and applied a hierarchical clustering method to identify clusters in the embedding. We measured the accuracy of clusters compared to previously defined phylogenetic clades, reassortment clusters, or recombinant lineages. We found that MDS embeddings accurately represented pairwise genetic distances including the intermediate placement of recombinant SARS-CoV-2 lineages between parental lineages. Clusters from t-SNE embeddings accurately recapitulated known phylogenetic clades, H3N2 reassortment groups, and SARS-CoV-2 recombinant lineages. We show that simple statistical methods without a biological model can accurately represent known genetic relationships for relevant human pathogenic viruses. Our open source implementation of these methods for analysis of viral genome sequences can be easily applied when phylogenetic methods are either unnecessary or inappropriate.
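The embedding-and-correlation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' open source implementation: it assumes Hamming distance as the pairwise genetic distance, uses classical (Torgerson) MDS rather than whichever MDS variant the paper used, and runs on hypothetical toy sequences. The function names `hamming_matrix`, `classical_mds`, and `distance_correlation` are illustrative only.

```python
import numpy as np


def hamming_matrix(seqs):
    """Pairwise number of mismatched positions between equal-length aligned sequences."""
    arr = np.array([list(s) for s in seqs])
    return (arr[:, None, :] != arr[None, :, :]).sum(axis=2).astype(float)


def classical_mds(dist, n_components=2):
    """Classical MDS: double-center the squared distances, then eigendecompose."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the largest eigenvalues; clip small negatives caused by non-Euclidean input.
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


def distance_correlation(genetic, embedding):
    """Pearson correlation between genetic and Euclidean embedding distances."""
    diffs = embedding[:, None, :] - embedding[None, :, :]
    euclid = np.sqrt((diffs ** 2).sum(axis=2))
    upper = np.triu_indices_from(genetic, k=1)  # each unordered pair once
    return float(np.corrcoef(genetic[upper], euclid[upper])[0, 1])


# Hypothetical toy alignment: two groups that differ at the first four sites.
toy_seqs = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT", "CCCCAAAA", "CCCCAAAT"]
genetic = hamming_matrix(toy_seqs)
embedding = classical_mds(genetic)
r = distance_correlation(genetic, embedding)
```

A correlation close to 1 indicates that distances in the two-dimensional embedding preserve the ordering of pairwise genetic distances; clustering of the embedding (for example hierarchical clustering, as the abstract mentions) would be a separate step applied to `embedding`.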
Cell,
Journal Year:
2024,
Volume and Issue:
187(6), P. 1374 - 1386.e13
Published: Feb. 29, 2024
The World Health Organization declared mpox a public health emergency of international concern in July 2022. To investigate global mpox transmission and population-level changes associated with controlling spread, we built phylogeographic and phylodynamic models to analyze MPXV genomes from five global regions together with air traffic and epidemiological data. Our models reveal community transmission prior to detection, changes in case reporting throughout the epidemic, and a large degree of transmission heterogeneity. We find that viral introductions played a limited role in prolonging spread after initial dissemination, suggesting that travel bans would have had only a minor impact. We find that mpox transmission in North America began declining before more than 10% of high-risk individuals in the USA had vaccine-induced immunity. Our findings highlight the importance of broader routine specimen screening surveillance for emerging infectious diseases and of joint integration of genomic and epidemiological information for early outbreak control.
Clinical Chemistry,
Journal Year:
2025,
Volume and Issue:
71(1), P. 192 - 202
Published: Jan. 1, 2025
Institutions of higher education (IHE) have been a focus of SARS-CoV-2 transmission studies, but there is limited information on how viral diversity and transmission at IHE changed as the pandemic progressed.
Nature,
Journal Year:
2025,
Volume and Issue:
unknown
Published: March 5, 2025
Pathogen genomics can provide insights into underlying infectious disease transmission patterns [1,2], but new methods are needed to handle modern large-scale pathogen genome datasets and realize this full potential [3-5]. In particular, genetically proximal viruses should be highly informative about transmission events as genetic proximity indicates epidemiological linkage. Here we use pairs of identical sequences to characterize fine-scale transmission patterns using 114,298 SARS-CoV-2 genomes collected through Washington State (USA) genomic sentinel surveillance with associated age and residence location information between March 2021 and December 2022. This corresponds to 59,660 sequences with another identical sequence in the dataset. We find that the location of pairs of identical sequences is highly consistent with expectations from mobility and social contact data. Outliers in the relationship between genetic and mobility data can be explained by SARS-CoV-2 transmission between postcodes with male prisons, consistent with transmission between prison facilities. We find that transmission patterns between age groups vary across spatial scales. Finally, we use the timing of sequence collection to understand the age groups driving transmission. Overall, this study improves our ability to use large pathogen genome datasets to understand the determinants of infectious disease spread.
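The idea of tallying pairs of identical sequences by location can be sketched as follows. This is only an illustration under simple assumptions (exact string identity between sequences, one location label per sequence); the record layout, the `loc_*` labels, and the function names are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict
from itertools import combinations


def group_locations_by_sequence(records):
    """Map each distinct sequence to the list of locations where it was sampled."""
    groups = defaultdict(list)
    for sequence, location in records:
        groups[sequence].append(location)
    return groups


def identical_pair_counts(records):
    """Count pairs of identical sequences for each unordered pair of locations."""
    counts = Counter()
    for locations in group_locations_by_sequence(records).values():
        for loc_a, loc_b in combinations(locations, 2):
            counts[tuple(sorted((loc_a, loc_b)))] += 1
    return counts


def sequences_with_identical_partner(records):
    """Number of sequences that share their exact sequence with at least one other record."""
    groups = group_locations_by_sequence(records)
    return sum(len(locs) for locs in groups.values() if len(locs) > 1)


# Hypothetical toy records: (sequence, residence location).
toy_records = [
    ("ACGT", "loc_A"),
    ("ACGT", "loc_A"),
    ("ACGT", "loc_B"),
    ("ACGA", "loc_B"),
    ("ACGA", "loc_C"),
    ("TTTT", "loc_A"),
]
pair_counts = identical_pair_counts(toy_records)
n_linked = sequences_with_identical_partner(toy_records)
```

Enumerating pairs is quadratic in cluster size, so for a dataset on the scale described in the abstract one would more likely count pairs from per-location multiplicities within each cluster instead of listing them explicitly.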
PLoS Computational Biology,
Journal Year:
2025,
Volume and Issue:
21(4), P. e1012960 - e1012960
Published: April 15, 2025
The wealth of genomic data that was generated during the COVID-19 pandemic provides an exceptional opportunity to obtain information on the transmission of SARS-CoV-2. Specifically, there is great interest to better understand how the effective reproduction number R_e and the overdispersion of secondary cases, which can be quantified by the negative binomial dispersion parameter k, changed over time and across regions and viral variants. The aim of our study was to develop a Bayesian framework to infer R_e and k from viral sequence data. First, we developed a mathematical model for the distribution of the size of identical sequence clusters, in which we integrated viral transmission, the mutation rate of the virus, and incomplete case-detection. Second, we implemented this model within a Bayesian inference framework, allowing the estimation of R_e and k from genomic data only. We validated this model in a simulation study. Third, we identified clusters of identical sequences in all SARS-CoV-2 sequences in 2021 from Switzerland, Denmark, and Germany that were available on GISAID. We obtained monthly estimates of the posterior distribution of R_e and k, with the resulting R_e estimates slightly lower than estimates obtained by other methods, and k comparable with previous results. We found comparatively higher estimates of k in Denmark, which suggests fewer opportunities for superspreading and more controlled transmission compared to the other countries in 2021. Our model included an estimation of the case detection and sampling probability, but the estimates obtained had large uncertainty, reflecting the difficulty of estimating these parameters simultaneously. Our study presents a novel method to infer information on the transmission of infectious diseases and its heterogeneity using genomic data. With increasing availability of sequences of pathogens in the future, we expect that our method has the potential to provide new insights into the transmission and the overdispersion in secondary cases of other pathogens.
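To make the roles of the effective reproduction number and the dispersion parameter k concrete, the sketch below evaluates a negative binomial offspring distribution in its standard mean-dispersion parameterization. It covers only the distribution of secondary cases named in the abstract, not the authors' cluster-size model or their Bayesian inference; the function names and the parameter values are assumptions chosen for illustration.

```python
import math


def nb_offspring_pmf(x, r_e, k):
    """Probability of x secondary cases under a negative binomial with mean r_e and dispersion k."""
    log_p = (
        math.lgamma(x + k) - math.lgamma(k) - math.lgamma(x + 1)
        + k * math.log(k / (k + r_e))
        + x * math.log(r_e / (k + r_e))
    )
    return math.exp(log_p)


def nb_offspring_variance(r_e, k):
    """Variance of the offspring distribution: r_e + r_e**2 / k."""
    return r_e + r_e ** 2 / k


# Illustrative values only: same mean, strong versus weak overdispersion.
p_zero_low_k = nb_offspring_pmf(0, r_e=1.0, k=0.1)
p_zero_high_k = nb_offspring_pmf(0, r_e=1.0, k=10.0)
total_mass = sum(nb_offspring_pmf(x, 1.0, 0.1) for x in range(2000))
mean_offspring = sum(x * nb_offspring_pmf(x, 1.0, 0.1) for x in range(2000))
```

With the mean held fixed, a smaller k puts more probability on zero secondary cases and more on rare large outbreaks, which is why a comparatively higher k is read in the abstract as fewer opportunities for superspreading.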
medRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2024,
Volume and Issue:
unknown
Published: May 25, 2024
Abstract
Pathogen genomics can provide insights into underlying infectious disease transmission patterns, but new methods are needed to handle modern large-scale pathogen genome datasets and realize this full potential. In particular, genetically proximal viruses should be highly informative about transmission events as genetic proximity indicates epidemiological linkage. Here, we leverage pairs of identical sequences to characterise fine-scale transmission patterns using 114,298 SARS-CoV-2 genomes collected through Washington State (USA) genomic sentinel surveillance with associated age and residence location information between March 2021 and December 2022. This corresponds to 59,660 sequences with another identical sequence in the dataset. We find that the location of pairs of identical sequences is highly consistent with expectations from mobility and social contact data. Outliers in the relationship between genetic and mobility data can be explained by SARS-CoV-2 transmission between postal codes with male prisons, consistent with transmission between prison facilities. We find that transmission patterns between age groups vary across spatial scales. Finally, we use the timing of sequence collection to understand the age groups driving transmission. Overall, this work improves our ability to leverage large pathogen genome datasets to understand the determinants of infectious disease spread.
medRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2024,
Volume and Issue:
unknown
Published: May 27, 2024
Abstract
The wealth of genomic data that was generated during the COVID-19 pandemic provides an exceptional opportunity to obtain information on the transmission of SARS-CoV-2. Specifically, there is great interest to better understand how the effective reproduction number R_e and the overdispersion of secondary cases, which can be quantified by the negative binomial dispersion parameter k, changed over time and across regions and viral variants. The aim of our study was to develop a Bayesian framework to infer R_e and k from viral sequence data. First, we developed a mathematical model for the distribution of the size of identical sequence clusters, in which we integrated viral transmission, the mutation rate of the virus, and incomplete case-detection. Second, we implemented this model within a Bayesian inference framework, allowing the estimation of R_e and k from genomic data only. We validated this model in a simulation study. Third, we identified clusters of identical sequences in all SARS-CoV-2 sequences in 2021 from Switzerland, Denmark, and Germany that were available on GISAID. We obtained monthly estimates of the posterior distribution of R_e and k, with the resulting R_e estimates slightly lower than estimates obtained by other methods, and k comparable with previous results. We found comparatively higher estimates of k in Denmark, which suggests fewer opportunities for superspreading and more controlled transmission compared to the other countries in 2021. Our model included an estimation of the case detection and sampling probability, but the estimates obtained had large uncertainty, reflecting the difficulty of estimating these parameters simultaneously. Our study presents a novel method to infer information on the transmission of infectious diseases and its heterogeneity using genomic data. With increasing availability of sequences of pathogens in the future, we expect that our method has the potential to provide new insights into the transmission and the overdispersion in secondary cases of other pathogens.
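Author summary
Pathogen transmission is a stochastic process that can be characterized by two parameters: the effective reproduction number R_e relates to the average number of secondary cases per infectious case in the current conditions of transmission and immunity, and the overdispersion parameter k captures the variability in the number of secondary cases. While R_e can be estimated well from case data, k is more difficult to quantify since detailed information about who infected whom is required. Here, we took advantage of the enormous number of sequences available of SARS-CoV-2 to identify clusters of identical sequences, providing indirect information about the size of transmission chains at different times in the pandemic, and thus about epidemic parameters. We then extended a previously defined method to estimate R_e, k, and the probability of detection from this sequence data. We validated our approach on simulated and real data from three countries, with our resulting estimates compatible with previous estimates. In a future with increased pathogen sequence availability, we believe this method will pave the way for the estimation of epidemic parameters in the absence of detailed contact tracing data.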
Molecular Biology and Evolution,
Journal Year:
2024,
Volume and Issue:
41(11)
Published: Nov. 1, 2024
Abstract
Phylodynamics is central to understanding infectious disease dynamics through the integration of genomic and epidemiological data. Despite advancements, including the application of deep learning to overcome computational limitations, significant challenges persist due to data inadequacies and statistical unidentifiability of key parameters. These issues are particularly pronounced in poorly resolved phylogenies, commonly observed in outbreaks such as SARS-CoV-2. In this study, we conducted a thorough evaluation of PhyloDeep, a deep learning inference tool for phylodynamics, assessing its performance on poorly resolved phylogenies. Our findings reveal the limited predictive accuracy of PhyloDeep (and other state-of-the-art approaches) in these scenarios. However, models trained on poorly resolved, realistically simulated trees demonstrate improved predictive power, despite not being infallible, especially in scenarios with superspreading dynamics, whose parameters are challenging to capture accurately. Notably, we observe markedly improved performance through the integration of minimal contact tracing data, which refines poorly resolved trees. Applying this approach to a sample of SARS-CoV-2 sequences partially matched to contact tracing data from Hong Kong yields informative estimates of superspreading potential, extending beyond the scope of contact tracing data alone. Our findings demonstrate the potential for enhancing phylodynamic analysis through complementary data integration, ultimately increasing the precision of epidemiological predictions crucial for public health decision-making and outbreak control.
medRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2023,
Volume and Issue:
unknown
Published: Aug. 2, 2023
Abstract
The World Health Organization declared mpox a public health emergency of international concern in July 2022. To investigate global mpox transmission and population-level changes associated with controlling spread, we built phylogeographic and phylodynamic models to analyze MPXV genomes from five global regions together with air traffic and epidemiological data. Our models reveal community transmission prior to detection, changes in case-reporting throughout the epidemic, and a large degree of transmission heterogeneity. We find that viral introductions played a limited role in prolonging spread after initial dissemination, suggesting that travel bans would have had only a minor impact. We find that mpox transmission in North America began declining before more than 10% of high-risk individuals in the USA had vaccine-induced immunity. Our findings highlight the importance of broader routine specimen screening surveillance for emerging infectious diseases and of joint integration of genomic and epidemiological information for early outbreak control.
medRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2024,
Volume and Issue:
unknown
Published: June 11, 2024
Abstract
Deep learning has emerged as a powerful tool for phylodynamic analysis, addressing common computational limitations affecting existing methods. However, notable disparities exist between simulated phylogenetic trees used for training existing deep learning models and those derived from real-world sequence data, necessitating a thorough examination of their practicality. We conducted a comprehensive evaluation of model performance by assessing an existing deep learning inference tool for phylodynamics, PhyloDeep, against realistic phylogenetic trees characterized from SARS-CoV-2. Our study reveals the poor predictive accuracy of PhyloDeep models trained on simulated trees when applied to realistic data. Conversely, models trained on realistic trees demonstrate improved predictions, despite not being infallible, especially in scenarios where superspreading dynamics are challenging to capture accurately. Consequently, we find markedly improved performance through the integration of minimal contact tracing data. Applying this approach to a sample of SARS-CoV-2 sequences partially matched to contact tracing from Hong Kong yields informative estimates of SARS-CoV-2 superspreading potential beyond the scope of contact tracing data alone. Our findings demonstrate the potential for enhancing phylodynamic analysis by processing low resolution trees through complementary data integration, ultimately increasing the precision of epidemiological predictions crucial for public health decision making and outbreak control.