bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2023, Issue: unknown
Published: Nov. 29, 2023
Developing a universal representation of cells which encompasses the tremendous molecular diversity of cell types within the human body and, more generally, across species, would be transformative for cell biology. Recent work using single-cell transcriptomic approaches to create molecular definitions of cell types in the form of cell atlases has provided the necessary data for such an endeavor. Here, we present the Universal Cell Embedding (UCE) foundation model. UCE was trained on a corpus of cell atlas data from human and other species in a completely self-supervised way without any data annotations. UCE offers a unified biological latent space that can represent any cell, regardless of tissue or species. This universal cell embedding captures important biological variation despite the presence of experimental noise across diverse datasets. An important aspect of UCE's universality is that any new cell from any organism can be mapped to this embedding space with no additional data labeling, model training or fine-tuning. We applied UCE to create the Integrated Mega-scale Atlas, embedding 36 million cells, with more than 1,000 uniquely named cell types, from hundreds of experiments, dozens of tissues and eight species. We uncovered new insights about the organization of cell types and tissues within this universal cell embedding space, and leveraged it to infer the function of newly discovered cell types. UCE's embedding space exhibits emergent behavior, uncovering new biology that it was never explicitly trained for, such as identifying developmental lineages and embedding data from novel species not included in the training set. Overall, by enabling a universal representation for every cell state and type, UCE provides a valuable tool for analysis, annotation and hypothesis generation as the scale and diversity of single-cell datasets continues to grow.
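The abstract notes that a shared embedding space supports annotation of new cells without retraining. One common way to use such a space, shown here only as a minimal sketch and not as UCE's own procedure, is nearest-centroid label transfer: average the reference embeddings for each cell type, then assign a new cell the label of the most similar centroid. The function names, the 3-dimensional vectors and the labels below are all hypothetical toy values; real embeddings would come from the model itself.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def centroids(embeddings, labels):
    """Mean embedding per label (hypothetical helper, not part of any library)."""
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + x for s, x in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def annotate(query, reference_centroids):
    """Return the label whose centroid is most cosine-similar to the query."""
    return max(reference_centroids,
               key=lambda lab: cosine(query, reference_centroids[lab]))


# Toy, made-up reference embeddings with known labels (assumption for illustration).
ref_embeddings = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1],
                  [0.0, 1.0, 0.2], [0.1, 0.9, 0.1]]
ref_labels = ["T cell", "T cell", "neuron", "neuron"]

cents = centroids(ref_embeddings, ref_labels)

# A new, unlabeled cell embedded in the same space.
new_cell = [0.8, 0.3, 0.0]
predicted = annotate(new_cell, cents)
print(predicted)
```

Because the reference is summarized by one centroid per label, the lookup cost grows with the number of cell types rather than the number of reference cells; a k-nearest-neighbor vote would be a reasonable alternative when cell types are not well described by a single mean.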
Nature Biotechnology, Journal year: 2022, Issue: 40(10), pp. 1458-1466
Published: May 2, 2022
Despite the emergence of experimental methods for simultaneous measurement of multiple omics modalities in single cells, most single-cell datasets include only one modality. A major obstacle in integrating omics data from multiple modalities is that different omics layers typically have distinct feature spaces. Here, we propose a computational framework called GLUE (graph-linked unified embedding), which bridges the gap by modeling regulatory interactions across omics layers explicitly. Systematic benchmarking demonstrated that GLUE is more accurate, robust and scalable than state-of-the-art tools for heterogeneous single-cell multi-omics data. We applied GLUE to various challenging tasks, including triple-omics integration, integrative regulatory inference and multi-omics human cell atlas construction over millions of cells, where GLUE was able to correct previous annotations. GLUE features a modular design that can be flexibly extended and enhanced for new analysis tasks. The full package is available online at https://github.com/gao-lab/GLUE.
Nature Communications, Journal year: 2021, Issue: 12(1)
Published: Sep. 29, 2021
Single-cell RNA sequencing data can unveil the molecular diversity of cell types. Cell type atlases of the mouse spinal cord have been published in recent years but have not been integrated together. Here, we generate an atlas of spinal cell types based on single-cell transcriptomic data, unifying the available datasets into a common reference framework. We report a hierarchical structure of postnatal cell type relationships, with location providing the highest level of organization, then neurotransmitter status, family, and finally, dozens of refined populations. We validate a combinatorial marker code for each neuronal cell type and map their spatial distributions in the adult spinal cord. We also show complex lineage relationships among postnatal cell types. Additionally, we develop an open-source cell type classifier, SeqSeek, to facilitate the standardization of cell type identification. This work provides an integrated view of spinal cell types, their gene expression signatures, and their molecular organization.
Nature Methods, Journal year: 2023, Issue: 20(8), pp. 1222-1231
Published: Jun. 29, 2023
Jointly profiling the transcriptome, chromatin accessibility and other molecular properties of single cells offers a powerful way to study cellular diversity. Here we present MultiVI, a probabilistic model to analyze such multiomic data and leverage it to enhance single-modality datasets. MultiVI creates a joint representation that allows an analysis of all modalities included in the multiomic input data, even for cells for which one or more modalities are missing. It is available at scvi-tools.org.