Nature Methods, Journal Year: 2023, Volume and Issue: 20(8), pp. 1222-1231
Published: June 29, 2023
Jointly profiling the transcriptome, chromatin accessibility and other molecular properties of single cells offers a powerful way to study cellular diversity. Here we present MultiVI, a probabilistic model to analyze such multiomic data and leverage it to enhance single-modality datasets. MultiVI creates a joint representation that allows an analysis of all modalities included in the multiomic input data, even for cells for which one or more modalities are missing. It is available at scvi-tools.org.
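The idea of a joint representation that still works when a modality is missing can be pictured with a small sketch. This is a toy illustration under stated assumptions, not MultiVI's actual model: it assumes each modality has already been encoded into an embedding of the same size, and the helper name `joint_embedding` and the plain averaging rule are hypothetical choices made here for clarity.

```python
from typing import List, Optional, Sequence


def joint_embedding(
    z_rna: Optional[Sequence[float]],
    z_atac: Optional[Sequence[float]],
) -> List[float]:
    """Combine per-modality embeddings for one cell (hypothetical helper).

    Averages the embeddings that are present; if only one modality was
    measured, that modality's embedding is returned unchanged.
    """
    available = [z for z in (z_rna, z_atac) if z is not None]
    if not available:
        raise ValueError("at least one modality is required")
    dim = len(available[0])
    if any(len(z) != dim for z in available):
        raise ValueError("embeddings must share the same dimension")
    # Element-wise mean over whichever modalities are available.
    return [sum(z[i] for z in available) / len(available) for i in range(dim)]


# A paired cell, an RNA-only cell and an ATAC-only cell all land in one space.
paired = joint_embedding([1.0, 3.0], [3.0, 5.0])
rna_only = joint_embedding([1.0, 3.0], None)
atac_only = joint_embedding(None, [3.0, 5.0])
```

The point of the sketch is only that every cell receives a vector in the same space regardless of which assays were run on it, which is what makes a joint downstream analysis possible.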
Genome Biology, Journal Year: 2021, Volume and Issue: 22(1)
Published: Jan. 4, 2021
Distinguishing biological from technical variation is crucial when integrating and comparing single-cell genomics datasets across different experiments. Existing methods lack the capability to explicitly distinguish these two variations, often leading to the removal of both. Here, we present an integration method, scMC, to remove the technical variation while preserving the intrinsic biological variation. scMC learns biological variation via variance analysis to subtract technical variation inferred in an unsupervised manner. Application of scMC to both simulated and real datasets from single-cell RNA-seq and ATAC-seq experiments demonstrates its capability of detecting context-shared and context-specific biological signals via accurate alignment.
Nature, Journal Year: 2023, Volume and Issue: 624(7991), pp. 317-332
Published: Dec. 13, 2023
The mammalian brain consists of millions to billions of cells that are organized into many cell types with specific spatial distribution patterns and structural and functional properties (refs. 1-3). Here we report a comprehensive and high-resolution transcriptomic and spatial cell-type atlas for the whole adult mouse brain. The cell-type atlas was created by combining a single-cell RNA-sequencing (scRNA-seq) dataset of around 7 million cells profiled (approximately 4.0 million cells passing quality control), and a spatial transcriptomic dataset of approximately 4.3 million cells using multiplexed error-robust fluorescence in situ hybridization (MERFISH). The atlas is hierarchically organized into 4 nested levels of classification: 34 classes, 338 subclasses, 1,201 supertypes and 5,322 clusters. We present an online platform, Allen Brain Cell Atlas, to visualize the mouse whole-brain cell-type atlas along with the single-cell RNA-sequencing and MERFISH datasets. We systematically analysed the neuronal and non-neuronal cell types across the brain and identified a high degree of correspondence between transcriptomic identity and spatial specificity for each cell type. The results reveal unique features of cell-type organization in different brain regions, in particular a dichotomy between the dorsal and ventral parts of the brain. The dorsal part contains relatively fewer yet highly divergent neuronal types, whereas the ventral part contains more numerous neuronal types that are more closely related to each other. Our study also uncovered extraordinary diversity and heterogeneity in neurotransmitter and neuropeptide expression and co-expression patterns in different cell types. Finally, we found that transcription factors are major determinants of cell-type classification and identified a combinatorial transcription factor code that defines cell types across all parts of the brain. The whole mouse brain transcriptomic and spatial cell-type atlas establishes a benchmark reference atlas and a foundational resource for integrative investigations of cellular and circuit function, development and evolution of the mammalian brain.
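To get a feel for the four nested classification levels reported in this abstract (34 classes, 338 subclasses, 1,201 supertypes, 5,322 clusters), the short calculation below derives the average number of child groups per parent group between adjacent levels. Only the four counts come from the abstract; the function name `mean_branching` is a hypothetical helper, and the averages say nothing about how unevenly the groups are actually split.

```python
from typing import Dict, List, Tuple

# Group counts per level, as stated in the abstract (coarsest to finest).
LEVELS: List[Tuple[str, int]] = [
    ("classes", 34),
    ("subclasses", 338),
    ("supertypes", 1201),
    ("clusters", 5322),
]


def mean_branching(levels: List[Tuple[str, int]]) -> Dict[Tuple[str, str], float]:
    """Average number of child groups per parent group for adjacent levels."""
    ratios = {}
    for (parent, n_parent), (child, n_child) in zip(levels, levels[1:]):
        ratios[(parent, child)] = n_child / n_parent
    return ratios


branching = mean_branching(LEVELS)
for (parent, child), ratio in branching.items():
    print(f"{child} per {parent}: {ratio:.1f} on average")
```

On average, each class splits into roughly ten subclasses, while the two finer levels each branch by a factor of about four.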
Molecular Systems Biology, Journal Year: 2021, Volume and Issue: 17(1)
Published: Jan. 1, 2021
As the number of single-cell transcriptomics datasets grows, the natural next step is to integrate the accumulating data to achieve a common ontology of cell types and states. However, it is not straightforward to compare gene expression levels across datasets and to automatically assign cell type labels in a new dataset based on existing annotations. In this manuscript, we demonstrate that our previously developed method, scVI, provides an effective and fully probabilistic approach for joint representation and analysis of scRNA-seq data, while accounting for uncertainty caused by biological and measurement noise. We also introduce single-cell ANnotation using Variational Inference (scANVI), a semi-supervised variant of scVI designed to leverage existing cell state annotations. We demonstrate that scVI and scANVI compare favorably to state-of-the-art methods for data integration and cell state annotation in terms of accuracy, scalability, and adaptability to challenging settings. In contrast to existing methods, scVI and scANVI integrate multiple datasets with a single generative model that can be directly used for downstream tasks, such as differential expression. Both methods are easily accessible through scvi-tools.
Nature Biotechnology, Journal Year: 2021, Volume and Issue: 40(1), pp. 121-130
Published: Aug. 30, 2021
Large single-cell atlases are now routinely generated to serve as references for analysis of smaller-scale studies. Yet learning from reference data is complicated by batch effects between datasets, limited availability of computational resources and sharing restrictions on raw data. Here we introduce a deep learning strategy for mapping query datasets on top of a reference called single-cell architectural surgery (scArches). scArches uses transfer learning and parameter optimization to enable efficient, decentralized, iterative reference building and contextualization of new datasets with existing references without sharing raw data. Using examples from mouse brain, pancreas, immune and whole-organism atlases, we show that scArches preserves biological state information while removing batch effects, despite using four orders of magnitude fewer parameters than de novo integration. scArches generalizes to multimodal reference mapping, allowing imputation of missing modalities. Finally, scArches retains coronavirus disease 2019 (COVID-19) disease variation when mapping to a healthy reference, enabling the discovery of disease-specific cell states. scArches will facilitate collaborative projects by enabling iterative construction, updating, sharing and efficient use of reference atlases.
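The abstract's combination of transfer learning with far fewer trainable parameters can be illustrated with a deliberately tiny example. This is a sketch under stated assumptions, not the scArches architecture: it assumes a one-dimensional model `y = weight * x + offset[batch]` that was already fitted on reference batches, and it assumes that mapping a query batch means keeping every reference parameter frozen while fitting a single new batch-specific offset. The function `map_query_batch` and all parameter names are hypothetical.

```python
from typing import Dict, List, Tuple


def map_query_batch(
    frozen_weight: float,
    frozen_offsets: Dict[str, float],
    query_name: str,
    query_data: List[Tuple[float, float]],
    lr: float = 0.1,
    steps: int = 200,
) -> Dict[str, float]:
    """Fit one new per-batch offset; reference parameters stay untouched."""
    new_offset = 0.0
    n = len(query_data)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to the new offset only.
        grad = sum(2 * (frozen_weight * x + new_offset - y) for x, y in query_data) / n
        new_offset -= lr * grad
    updated = dict(frozen_offsets)  # copy, so the reference model is not mutated
    updated[query_name] = new_offset
    return updated


# Toy reference model: shared weight 2.0, two reference batches.
reference_offsets = {"ref_a": 0.0, "ref_b": 1.0}
# Toy query batch generated as y = 2x + 5, i.e. a batch shift of 5.
query = [(float(x), 2.0 * x + 5.0) for x in range(5)]
mapped = map_query_batch(2.0, reference_offsets, "query", query)
```

Only one number is optimized here, and nothing but the fitted reference parameters needs to be shared with whoever holds the query data, which mirrors the decentralized, no-raw-data-sharing setting the abstract describes.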
Nature Medicine, Journal Year: 2023, Volume and Issue: 29(6), pp. 1563-1577
Published: June 1, 2023
Single-cell technologies have transformed our understanding of human tissues. Yet, studies typically capture only a limited number of donors and disagree on cell type definitions. Integrating many single-cell datasets can address these limitations of individual studies and capture the variability present in the population. Here we present the integrated Human Lung Cell Atlas (HLCA), combining 49 datasets of the human respiratory system into a single atlas spanning over 2.4 million cells from 486 individuals. The HLCA presents a consensus cell type re-annotation with matching marker genes, including annotations of rare and previously undescribed cell types. Leveraging the number and diversity of individuals in the HLCA, we identify gene modules that are associated with demographic covariates such as age, sex and body mass index, as well as gene modules changing expression along the proximal-to-distal axis of the bronchial tree. Mapping new data to the HLCA enables rapid data annotation and interpretation. Using the HLCA as a reference for the study of disease, we identify shared cell states across multiple lung diseases, including SPP1 ...
Nature, Journal Year: 2023, Volume and Issue: 619(7970), pp. 585-594
Published: July 19, 2023
Understanding kidney disease relies on defining the complexity of cell types and states, their associated molecular profiles and interactions within tissue neighbourhoods (ref. 1). Here we applied multiple single-cell and single-nucleus assays (>400,000 nuclei or cells) and spatial imaging technologies to a broad spectrum of healthy reference kidneys (45 donors) and diseased kidneys (48 patients). This has provided a high-resolution cellular atlas of 51 main cell types, which include rare and previously undescribed cell populations. The multi-omic approach provides detailed transcriptomic profiles, regulatory factors and spatial localizations spanning the entire kidney. We also define 28 cellular states across nephron segments and interstitium that were altered in kidney injury, encompassing cycling, adaptive (successful or maladaptive repair), transitioning and degenerative states. Molecular signatures permitted the localization of these states within injury neighbourhoods using spatial transcriptomics, while large-scale 3D imaging analysis (around 1.2 million neighbourhoods) provided corresponding linkages to active immune responses. These analyses defined biological pathways that are relevant to injury time-course and niches, including signatures underlying epithelial repair that predicted maladaptive states associated with a decline in kidney function. This integrated multimodal spatial cell atlas of healthy and diseased human kidneys represents a comprehensive benchmark of cellular states, neighbourhoods, outcome-associated signatures and publicly available interactive visualizations.
Nature Biotechnology, Journal Year: 2022, Volume and Issue: 40(10), pp. 1458-1466
Published: May 2, 2022
Despite the emergence of experimental methods for simultaneous measurement of multiple omics modalities in single cells, most single-cell datasets include only one modality. A major obstacle in integrating omics data from multiple modalities is that different omics layers typically have distinct feature spaces. Here, we propose a computational framework called GLUE (graph-linked unified embedding), which bridges the gap by modeling regulatory interactions across omics layers explicitly. Systematic benchmarking demonstrated that GLUE is more accurate, robust and scalable than state-of-the-art tools for heterogeneous single-cell multi-omics data. We applied GLUE to various challenging tasks, including triple-omics integration, integrative regulatory inference and multi-omics human cell atlas construction over millions of cells, where GLUE was able to correct previous annotations. GLUE features a modular design that can be flexibly extended and enhanced for new analysis tasks. The full package is available online at https://github.com/gao-lab/GLUE.