bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2023,
Volume and Issue:
unknown
Published: June 11, 2023
Abstract
In a previous paper [CSIAM Trans. Appl. Math. 2 (2021), 1-55], the authors proposed a theoretical framework for the analysis of RNA velocity, which is a promising concept in scRNA-seq data analysis to reveal the cell state-transition dynamical processes underlying snapshot data. The current paper is devoted to the algorithmic study of some key components in the RNA velocity workflow. Four important points are addressed in this paper: (1) We construct a rational time-scale fixation method which can determine the global gene-shared latent time for cells. (2) We present an uncertainty quantification strategy for the inferred parameters obtained through the EM algorithm. (3) We establish the optimal criterion for the choice of velocity kernel bandwidth with respect to the sample size in the downstream analysis and discuss its implications. (4) We propose a temporal distance estimation approach between two cell clusters along the cellular development path. Some illustrative numerical tests are also carried out to verify our analysis. These results are intended to provide tools and insights for the further development of RNA velocity type methods in the future.
PLoS Computational Biology,
Journal Year:
2022,
Volume and Issue:
18(9), P. e1010492 - e1010492
Published: Sept. 12, 2022
We perform a thorough analysis of RNA velocity methods, with a view towards understanding the suitability of the various assumptions underlying popular implementations. In addition to providing a self-contained exposition of the underlying mathematics, we undertake simulations and perform controlled experiments on biological datasets to assess workflow sensitivity to parameter choices and underlying biology. Finally, we argue for a more rigorous approach to RNA velocity, and present a framework for Markovian analysis that points to directions for improvement and mitigation of current problems.
Nature Communications,
Journal Year:
2022,
Volume and Issue:
13(1)
Published: Dec. 9, 2022
The question of how cell-to-cell differences in transcription rate affect RNA count distributions is fundamental for understanding biological processes underlying transcription. Answering this question requires quantitative models that are both interpretable (describing concrete biophysical phenomena) and tractable (amenable to mathematical analysis). This enables the identification of experiments which best discriminate between competing hypotheses. As a proof of principle, we introduce a simple but flexible class of models involving a continuous stochastic transcription rate driving a discrete RNA transcription and splicing process, and compare and contrast two biologically plausible hypotheses about transcription rate variation. One assumes variation is due to DNA experiencing mechanical strain, while the other assumes it is due to regulator number fluctuations. We introduce a framework for numerically and analytically studying such models, and apply Bayesian model selection to identify candidate genes that show signatures of each model in single-cell transcriptomic data from mouse glutamatergic neurons.
Nucleic Acids Research,
Journal Year:
2022,
Volume and Issue:
51(1), P. 68 - 83
Published: Dec. 30, 2022
Gene expression in mammalian cells is highly variable and episodic, resulting in a series of discontinuous bursts of mRNAs. A challenge is to understand how static promoter architecture and dynamic feedback regulations dictate bursting on a genome-wide scale. Although single-cell RNA sequencing (scRNA-seq) provides an opportunity to address this challenge, effective analytical methods are scarce. We developed an interpretable and scalable inference framework, which combined experimental data with a mechanistic model to infer transcriptional burst kinetics (sizes and frequencies) and feedback regulations. Applying this framework to scRNA-seq data generated from embryonic mouse fibroblast cells, we found Simpson's paradoxes, i.e. genome-wide burst kinetics exhibit different characteristics in two cases without and with distinguishing feedback regulations. We also showed that feedbacks differently modulate burst frequencies and sizes and conceal the effects of transcription start site distributions on burst kinetics. Notably, only in the presence of positive feedback, TATA genes are expressed with high burst frequencies and enhancer-promoter interactions mainly modulate burst frequencies. The developed inference method provided a flexible and efficient way to investigate transcriptional burst kinetics, and the obtained results would be helpful for understanding cell development and fate decision.
Computational and Structural Biotechnology Journal,
Journal Year:
2023,
Volume and Issue:
21, P. 2373 - 2380
Published: Jan. 1, 2023
Single-cell sequencing technologies have revolutionised the life sciences and biomedical research. Single-cell sequencing provides high-resolution data on cell heterogeneity, allowing high-fidelity cell type identification, and lineage tracking. Computational algorithms and mathematical models have been developed to make sense of the data, compensate for errors and simulate the biological processes, which has led to breakthroughs in our understanding of cell differentiation, cell-fate determination and tissue composition. The development of long-read (a.k.a. third-generation) sequencing technologies has produced powerful tools for investigating alternative splicing, isoform expression (at the RNA level), genome assembly and the detection of complex structural variants (at the DNA level). In this review, we provide an overview of the recent advancements in single-cell and long-read sequencing technologies, with a particular focus on the computational algorithms that help in correcting, analysing, and interpreting the resulting data. Additionally, we review some mathematical models that use single-cell and long-read sequencing data to study cell-fate determination and alternative splicing, respectively. Moreover, we highlight emerging opportunities in modelling cell-fate determination that result from the combination of single-cell and long-read sequencing technologies.
Royal Society Open Science,
Journal Year:
2023,
Volume and Issue:
10(4)
Published: April 1, 2023
Gene expression has inherent stochasticity resulting from transcription's burst manners. Single-cell snapshot data can be exploited to rigorously infer transcriptional burst kinetics, using mathematical models as blueprints. The classical telegraph model (CTM) has been widely used to explain transcriptional bursting with Markovian assumptions. However, growing evidence suggests that the gene-state dwell times are generally non-exponential, as gene-state switching is a multi-step process in organisms. Therefore, interpretable non-Markovian mathematical models and efficient statistical inference methods are urgently required in investigating transcriptional burst kinetics. We develop an interpretable and tractable model, the generalized telegraph model (GTM), to characterize transcriptional bursting that allows arbitrary dwell-time distributions, rather than exponential distributions, to be incorporated into the ON and OFF switching process. Based on the GTM, we propose an inference method for transcriptional burst kinetics using an approximate Bayesian computation framework. This method demonstrates an efficient and scalable estimation of burst frequency and burst size on synthetic data. Further, the application of inference to genome-wide data from mouse embryonic fibroblasts reveals that the GTM would estimate lower burst frequency and higher burst size than those estimated by the CTM. In conclusion, the GTM and the corresponding inference method are effective tools to infer dynamic transcriptional bursting from static single-cell snapshot data.
Biophysical Reports,
Journal Year:
2022,
Volume and Issue:
3(1), P. 100097 - 100097
Published: Dec. 27, 2022
Single-cell RNA sequencing data can be modeled using Markov chains to yield genome-wide insights into transcriptional physics. However, quantitative inference with such data requires careful assessment of noise sources. We find that long pre-mRNA transcripts are over-represented in sequencing data. To explain this trend, we propose a length-based model of capture bias, which may produce false-positive observations. We solve this model and use it to find concordant parameter trends as well as systematic, mechanistically interpretable technical and biological differences in paired data sets.
Biophysical Journal,
Journal Year:
2023,
Volume and Issue:
123(1), P. 4 - 30
Published: Oct. 27, 2023
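The snapshot distribution of mRNA counts per cell can be measured using single-molecule fluorescence in situ hybridization or single-cell RNA sequencing. These distributions are often fit to the steady-state distribution of the two-state telegraph model to estimate the three transcriptional parameters for a gene of interest: mRNA synthesis rate, the switching on rate (the on state being the active transcriptional state), and the switching off rate. This model assumes no extrinsic noise, i.e., parameters do not vary between cells, and thus estimated parameters are to be understood as approximating the average values in a population. The accuracy of this approximation is currently unclear. Here, we develop a theory that explains the size and sign of estimation bias when inferring parameters from single-cell data using the standard telegraph model. We find specific bias signatures depending on the source of extrinsic noise (which parameter is most variable across cells) and the mode of transcriptional activity. If gene expression is not bursty then the population averages of all three parameters are overestimated if extrinsic noise is in the synthesis rate; underestimation occurs if extrinsic noise is in the switching on rate; both underestimation and overestimation can occur if extrinsic noise is in the switching off rate. We find that some estimated parameters tend to infinity as the size of extrinsic noise approaches a critical threshold. In contrast, when gene expression is bursty, we find that in all cases the mean burst size (ratio of the synthesis rate to the switching off rate) is overestimated while the mean burst frequency is underestimated. We estimate the size of extrinsic noise from the covariance matrix of sequencing data and use this together with our theory to correct published estimates of transcriptional parameters for mammalian genes.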
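The telegraph-model quantities named in the abstract above can be illustrated with a minimal Python sketch. It is not taken from the cited paper: the class and function names are hypothetical, the mRNA degradation rate `d` is an added assumption (the abstract lists only the synthesis and switching rates), and the formulas are the standard steady-state mean and Fano factor of the two-state telegraph model without extrinsic noise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TelegraphParams:
    """Parameters of the two-state telegraph model (names are illustrative)."""
    k_on: float     # switching on rate (inactive -> active state)
    k_off: float    # switching off rate (active -> inactive state)
    rho: float      # mRNA synthesis rate while the gene is active
    d: float = 1.0  # mRNA degradation rate (assumption; not named in the abstract)


def burst_size(p: TelegraphParams) -> float:
    """Mean burst size: ratio of the synthesis rate to the switching off rate."""
    return p.rho / p.k_off


def burst_frequency(p: TelegraphParams) -> float:
    """Burst frequency in the bursty regime, approximated by the switching on rate."""
    return p.k_on


def mean_count(p: TelegraphParams) -> float:
    """Steady-state mean mRNA count per cell."""
    return (p.rho / p.d) * p.k_on / (p.k_on + p.k_off)


def fano_factor(p: TelegraphParams) -> float:
    """Steady-state Fano factor (variance divided by mean) of the mRNA count."""
    s = p.k_on + p.k_off
    return 1.0 + p.rho * p.k_off / (s * (s + p.d))


if __name__ == "__main__":
    # Illustrative values only: a gene that is mostly off and switches off quickly.
    p = TelegraphParams(k_on=0.5, k_off=10.0, rho=100.0)
    print("burst size:", burst_size(p))
    print("burst frequency:", burst_frequency(p))
    print("mean count:", mean_count(p))
    print("Fano factor:", fano_factor(p))
```

When the switching off rate is zero the gene stays active and the Fano factor reduces to 1, the Poisson value, which gives a quick sanity check on the formulas.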
Biophysical Journal,
Journal Year:
2024,
Volume and Issue:
123(9), P. 1034 - 1057
Published: April 9, 2024
Stochastic models of gene expression are typically formulated using the chemical master equation, which can be solved exactly or approximately using a repertoire of analytical methods. Here, we provide a tutorial review of an alternative approach based on queueing theory that has rarely been used in the literature of gene expression. We discuss the interpretation of six types of infinite-server queues from the angle of stochastic single-cell biology and provide analytical expressions for the stationary and nonstationary distributions and/or moments of mRNA/protein numbers and bounds on the Fano factor. This approach may enable the solution of complex models that have hitherto evaded analytical solution.
bioRxiv (Cold Spring Harbor Laboratory),
Journal Year:
2023,
Volume and Issue:
unknown
Published: Jan. 14, 2023
Abstract
We motivate and present biVI, which combines the variational autoencoder framework of scVI with biophysically motivated, bivariate models for nascent and mature RNA distributions. While previous approaches to integrate bimodal data via the variational autoencoder framework ignore the causal relationship between measurements, biVI models the biophysical processes that give rise to observations. We demonstrate through simulated benchmarking that biVI captures cell type structure in a low-dimensional space and accurately recapitulates parameter values and copy number distributions. On biological data, biVI provides a scalable route for identifying the biophysical mechanisms underlying gene expression. This analytical approach outlines a generalizable strategy for treating multimodal datasets generated by high-throughput, single-cell genomic assays.