bioRxiv (Cold Spring Harbor Laboratory), 2023
Published: Sept. 17, 2023
Multimodal, single-cell genomics technologies enable simultaneous capture of multiple facets of DNA and RNA processing in the cell. This creates opportunities for transcriptome-wide, mechanistic studies of cellular processing in heterogeneous cell types, with applications ranging from inferring kinetic differences between cells, to the role of stochasticity in driving heterogeneity. However, current methods for determining cell types or 'clusters' present in multimodal data often rely on ad hoc or independent treatment of modalities, and assumptions ignoring inherent properties of the count data. To enable interpretable and consistent cluster determination from multimodal data, we present meK-Means (mechanistic K-Means) which integrates modalities and learns underlying, shared biophysical states through a unifying model of transcription. In particular, we demonstrate how meK-Means can be used to cluster cells from unspliced and spliced mRNA modalities. By utilizing the causal, physical relationships underlying these modalities, we identify shared transcriptional kinetics across cells, which induce the observed gene expression profiles, and provide an alternative definition for 'clusters' through the governing parameters of cellular processes.
PLoS Computational Biology, 2023, 19(8), e1011288
Published: Aug. 17, 2023
Dimensionality reduction is standard practice for filtering noise and identifying relevant features in large-scale data analyses. In biology, single-cell genomics studies typically begin with reduction to 2 or 3 dimensions to produce "all-in-one" visuals of the data that are amenable to the human eye, and these are subsequently used for qualitative and quantitative exploratory analysis. However, there is little theoretical support for this practice, and we show that extreme dimension reduction, from hundreds or thousands of dimensions to 2, inevitably induces significant distortion of high-dimensional datasets. We therefore examine the practical implications of low-dimensional embedding of single-cell data and find that extensive distortions and inconsistent practices make such embeddings counter-productive for exploratory, biological analyses. In lieu of this, we discuss alternative approaches for conducting targeted embedding and feature exploration to enable hypothesis-driven biological discovery.
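The distortion claim can be illustrated with a small numerical experiment. The following is a minimal sketch, not the authors' analysis: it assumes synthetic Gaussian data (200 hypothetical "cells" by 500 "genes"), uses PCA as the 2-dimensional embedding, and measures distortion as the ratio between the largest and smallest factor by which pairwise distances shrink. All names and sizes are illustrative assumptions.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distance between every pair of rows, as a flat vector."""
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    upper = np.triu_indices(x.shape[0], k=1)
    return np.sqrt(np.maximum(d2[upper], 0.0))

def pca_project(x, k):
    """Project the rows of x onto the top-k principal components."""
    centered = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

rng = np.random.default_rng(0)
points = rng.normal(size=(200, 500))  # assumed synthetic data, not real counts
embedded = pca_project(points, 2)

d_high = pairwise_distances(points)
d_low = pairwise_distances(embedded)

# A distortion-free embedding (up to a global scale) would give a constant ratio.
ratio = d_low / d_high
distortion = ratio.max() / ratio.min()
print(f"distance ratios span a factor of {distortion:.1f}")
```

In this toy setting the high-dimensional distances are all nearly equal, while the 2-dimensional distances vary widely, so the ratio is far from constant. Other embedding methods would need their own check.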
PLoS Computational Biology, 2022, 18(9), e1010492
Published: Sept. 12, 2022
We perform a thorough analysis of RNA velocity methods, with a view towards understanding the suitability of the various assumptions underlying popular implementations. In addition to providing a self-contained exposition of the underlying mathematics, we undertake simulations and perform controlled experiments on biological datasets to assess workflow sensitivity to parameter choices and underlying biology. Finally, we argue for a more rigorous approach to RNA velocity, and present a framework for Markovian analysis that points to directions for improvement and mitigation of current problems.
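For orientation, the quantity usually called RNA velocity can be sketched as follows. This assumes the common deterministic two-species model (unspliced mRNA produced at rate alpha, spliced at rate beta, degraded at rate gamma), which is not stated in the abstract itself; the function and parameter names are illustrative.

```python
def rna_velocity(unspliced, spliced, beta, gamma):
    """Rate of change of spliced mRNA under the assumed model: ds/dt = beta*u - gamma*s."""
    return beta * unspliced - gamma * spliced

def steady_state(alpha, beta, gamma):
    """Steady-state (unspliced, spliced) levels for constant transcription rate alpha."""
    return alpha / beta, alpha / gamma

u_ss, s_ss = steady_state(alpha=2.0, beta=1.0, gamma=0.5)
print(rna_velocity(u_ss, s_ss, beta=1.0, gamma=0.5))  # velocity vanishes at steady state
```

A positive value indicates that spliced mRNA is increasing under these assumptions, and a negative value that it is decreasing.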
Nature Communications, 2022, 13(1)
Published: Dec. 9, 2022
The question of how cell-to-cell differences in transcription rate affect RNA count distributions is fundamental for understanding biological processes underlying transcription. Answering this question requires quantitative models that are both interpretable (describing concrete biophysical phenomena) and tractable (amenable to mathematical analysis). This enables the identification of experiments which best discriminate between competing hypotheses. As a proof of principle, we introduce a simple but flexible class of models involving a continuous stochastic transcription rate driving a discrete RNA transcription and splicing process, and compare and contrast two biologically plausible hypotheses about transcription rate variation. One assumes variation is due to DNA experiencing mechanical strain, while the other assumes it is due to regulator number fluctuations. We introduce a framework for numerically and analytically studying such models, and apply Bayesian model selection to identify candidate genes that show signatures of each model in single-cell transcriptomic data from mouse glutamatergic neurons.
PLoS ONE, 2023, 18(5), e0285674
Published: May 11, 2023
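Metabarcoding is a powerful molecular tool for simultaneously surveying hundreds to thousands of species from a single sample, underpinning microbiome and environmental DNA (eDNA) methods. Deriving quantitative estimates of underlying biological communities from metabarcoding is critical for enhancing the utility of such approaches for health and conservation. Recent work has demonstrated that correcting for amplification biases in genetic metabarcoding data can yield quantitative estimates of template DNA concentrations. However, a major source of uncertainty in metabarcoding data stems from non-detections across technical PCR replicates where one replicate fails to detect a species observed in other replicates. Such non-detections are a special case of variability among technical replicates in metabarcoding data. While many sampling and amplification processes underlie observed variation in metabarcoding data, understanding the causes of non-detections is an important step in distinguishing signal from noise in metabarcoding studies. Here, we use both simulated and empirical data to 1) suggest how non-detections may arise in metabarcoding data, 2) outline steps to recognize uninformative data in practice, and 3) identify the conditions under which amplicon sequence data can reliably detect underlying biological signals. We show with both simulations and empirical data that, for a given species, the rate of non-detections among technical replicates is a function of both the template DNA concentration and species-specific amplification efficiency. Consequently, we conclude metabarcoding datasets are strongly affected by (1) deterministic amplification biases during PCR and (2) stochastic sampling of amplicons during sequencing (both of which we can model), but also by (3) stochastic sampling of rare molecules prior to PCR, which remains a frontier for quantitative metabarcoding. Our results highlight the importance of estimating species-specific amplification efficiencies and critically evaluating patterns of non-detection in metabarcoding datasets to better distinguish environmental signal from the noise inherent in molecular detections of rare targets.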
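The dependence of non-detection on template concentration and amplification efficiency can be explored with a toy simulation. The sketch below is an assumption-laden illustration, not the authors' model: it assumes Poisson sampling of template molecules before PCR, exponential amplification by a factor (1 + efficiency) per cycle, a single abundant background species, and binomial sampling of reads. Every parameter value and name is hypothetical.

```python
import numpy as np

def nondetection_rate(conc, eff, n_reps=2000, cycles=30, depth=1000,
                      bg_conc=1000.0, bg_eff=0.9, seed=0):
    """Fraction of simulated technical replicates with zero reads for the target."""
    rng = np.random.default_rng(seed)
    # Stochastic sampling of (possibly rare) template molecules prior to PCR.
    target = rng.poisson(conc, size=n_reps)
    background = rng.poisson(bg_conc, size=n_reps)
    # Deterministic amplification with a species-specific efficiency.
    amp_target = target * (1.0 + eff) ** cycles
    amp_background = background * (1.0 + bg_eff) ** cycles
    # Stochastic sampling of amplicons during sequencing.
    frac = amp_target / (amp_target + amp_background)
    reads = rng.binomial(depth, frac)
    return float(np.mean(reads == 0))

for conc, eff in [(0.5, 0.9), (20.0, 0.9), (20.0, 0.6)]:
    print(conc, eff, nondetection_rate(conc, eff))
```

Under these assumptions, non-detections become more frequent both when the target is rare and when it amplifies less efficiently than the background, which matches the qualitative statement in the abstract.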
bioRxiv (Cold Spring Harbor Laboratory), 2023
Published: Nov. 22, 2023
The term “RNA-seq” refers to a collection of assays based on sequencing experiments that involve quantifying RNA species from bulk tissue, single cells, or single nuclei. The kallisto, bustools, and kb-python programs are free, open-source software tools for performing this analysis that together can produce gene expression quantification from raw sequencing reads. The quantifications can be individualized for multiple cells, multiple samples, or both. Additionally, these tools allow gene expression values to be classified as originating from nascent RNA species or mature RNA species, making this workflow amenable to both cell-based and nucleus-based assays. This protocol describes in detail how to use kallisto and bustools in conjunction with a wrapper, kb-python, to preprocess RNA-seq data.
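As a rough illustration of what such a preprocessing call looks like, the sketch below assembles a standard `kb count` command line in Python without running it. The file names, output directory, and the `10xv3` technology string are placeholder assumptions, and the helper `kb_count_command` is hypothetical; the protocol itself should be consulted for the options that apply to a given assay.

```python
import shlex

def kb_count_command(index, t2g, technology, out_dir, fastqs):
    """Build the argument list for a standard `kb count` run (nothing is executed)."""
    return ["kb", "count",
            "-i", index,        # kallisto index
            "-g", t2g,          # transcript-to-gene mapping
            "-x", technology,   # sequencing technology string
            "-o", out_dir,      # output directory
            *fastqs]

cmd = kb_count_command("index.idx", "t2g.txt", "10xv3", "kb_out",
                       ["R1.fastq.gz", "R2.fastq.gz"])
command_line = shlex.join(cmd)
print(command_line)
```

The resulting list could be passed to `subprocess.run` on a machine where kb-python is installed and the placeholder files exist.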