bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2023, Issue: unknown
Published: Nov. 29, 2023
Developing a universal representation of cells which encompasses the tremendous molecular diversity of cell types within the human body and more generally, across species, would be transformative for cell biology. Recent work using single-cell transcriptomic approaches to create molecular definitions of cell types in the form of cell atlases has provided the necessary data for such an endeavor. Here, we present the Universal Cell Embedding (UCE) foundation model. UCE was trained on a corpus of cell atlas data from human and other species in a completely self-supervised way without any data annotations. UCE offers a unified biological latent space that can represent any cell, regardless of tissue or species. This universal cell embedding captures important biological variation despite the presence of experimental noise across diverse datasets. An important aspect of UCE's universality is that any new cell from any organism can be mapped to this embedding space with no additional data labeling, model training or fine-tuning. We applied UCE to create the Integrated Mega-scale Atlas, embedding 36 million cells, with more than 1,000 uniquely named cell types, from hundreds of experiments, dozens of tissues and eight species. We uncovered new insights about the organization of cell types and tissues within this universal cell embedding space, and leveraged it to infer function of newly discovered cell types. UCE's embedding space exhibits emergent behavior, uncovering new biology that it was never explicitly trained for, such as identifying developmental lineages and embedding data from novel species not included in the training set. Overall, by enabling a universal representation for every cell state and type, UCE provides a valuable tool for analysis, annotation and hypothesis generation as the scale and diversity of single cell datasets continues to grow.
Advances in multi-omics have led to an explosion of multimodal datasets to address questions from basic biology to translation. While these data provide novel opportunities for discovery, they also pose management and analysis challenges, thus motivating the development of tailored computational solutions. Here, we present a data standard and an analysis framework for multi-omics, MUON, designed to organise, analyse, visualise, and exchange multimodal data. MUON stores multimodal data in an efficient yet flexible and interoperable data structure. MUON enables a versatile range of analyses, from data preprocessing to flexible multi-omics alignment.
Hepatology, Journal year: 2022, Issue: 76(4), pp. 1219-1230
Published: Feb. 17, 2022
Abstract
The concept of hepatocyte functional zonation is well established, with differences in metabolism and xenobiotic processing determined by multiple factors including oxygen and nutrient levels across the hepatic lobule. However, recent advances in single-cell genomics technologies, including single-cell and nuclei RNA sequencing, and the rapidly evolving fields of spatial transcriptomic and proteomic profiling have greatly increased our understanding of liver zonation. Here we discuss how these transformative experimental strategies are being leveraged to dissect liver zonation at unprecedented resolution and how this new information should facilitate the emergence of novel precision medicine-based therapies for patients with liver disease.
Nature Communications, Journal year: 2023, Issue: 14(1)
Published: Feb. 21, 2023
Abstract
Single-cell multi-omics (scMulti-omics) allows the quantification of multiple modalities simultaneously to capture the intricacy of complex molecular mechanisms and cellular heterogeneity. Existing tools cannot effectively infer the active biological networks in diverse cell types and the response of these networks to external stimuli. Here we present DeepMAPS for biological network inference from scMulti-omics. It models scMulti-omics in a heterogeneous graph and learns relations among cells and genes within both local and global contexts in a robust manner using a multi-head graph transformer. Benchmarking results indicate DeepMAPS performs better than existing tools in cell clustering and biological network construction. It also showcases competitive capability in deriving cell-type-specific biological networks in lung tumor leukocyte CITE-seq data and matched diffuse small lymphocytic lymphoma scRNA-seq and scATAC-seq data. In addition, we deploy a DeepMAPS webserver equipped with multiple functionalities and visualizations to improve the usability and reproducibility of scMulti-omics analysis.
bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2022, Issue: unknown
Published: Aug. 27, 2022
SUMMARY
Glioblastoma, isocitrate dehydrogenase (IDH)-wildtype (hereafter, GB), is an aggressive brain malignancy associated with a dismal prognosis and poor quality of life. Single-cell RNA sequencing has helped to grasp the complexity of the cell states and dynamic changes in GB. Large-scale data integration can help to uncover unexplored tumor pathobiology. Here, we resolved the composition of the tumor milieu and created a cellular map of GB (‘GBmap’), a curated resource that harmonizes 26 datasets gathering 240 patients and spanning over 1.1 million cells. We showcase the applications of our resource for reference mapping, transfer learning, and biological discoveries. Our results uncover the sources of pro-angiogenic signaling and the multifaceted role of mesenchymal-like cancer cells. Reconstructing the tumor architecture using spatially resolved transcriptomics unveiled a high level of well-structured neoplastic niches. The GBmap represents a framework that allows the streamlined integration and interpretation of new data and provides a platform for exploratory analysis, hypothesis generation and testing.
bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2022, Issue: unknown
Published: March 11, 2022
ABSTRACT
Organ- and body-scale cell atlases have the potential to transform our understanding of human biology. To capture the variability present in the population, these atlases must include diverse demographics such as age and ethnicity from both healthy and diseased individuals. The growth in the size and number of single-cell datasets, combined with recent advances in computational techniques, for the first time makes it possible to generate such comprehensive large-scale atlases through integration of multiple datasets. Here, we present the integrated Human Lung Cell Atlas (HLCA) combining 46 datasets of the human respiratory system into a single atlas spanning over 2.2 million cells from 444 individuals across health and disease. The HLCA contains a consensus re-annotation of published and newly generated datasets, resolving under- or misannotation of 59% of the original datasets. The HLCA enables recovery of rare cell types, provides consensus marker genes for each cell type, and uncovers gene modules associated with demographic covariates and anatomical location within the respiratory system. To facilitate the use of the HLCA as a reference for single-cell lung research and allow rapid analysis of new data, we provide an interactive web portal to project datasets onto the HLCA. Finally, we demonstrate the value of the HLCA reference for interpreting disease-associated changes. Thus, the HLCA outlines a roadmap for the development and use of organ-scale cell atlases within the Human Cell Atlas.
Genomics Proteomics & Bioinformatics, Journal year: 2022, Issue: 21(1), pp. 24-47
Published: Oct. 14, 2022
The development of spatial transcriptomics (ST) technologies has transformed genetic research from a single-cell data level to a two-dimensional spatial coordinate system and facilitated the study of the composition and function of various cell subsets in different environments and organs. The large-scale data generated by these ST technologies, which contain spatial gene expression information, have elicited the need for spatially resolved approaches to meet the requirements of computational and biological data interpretation. These requirements include dealing with the explosive growth of data to determine the cell-level and gene-level expression, correcting the inner batch effect and loss of expression to improve the data quality, conducting efficient interpretation and in-depth knowledge mining both at the single-cell and tissue-wide levels, and conducting multi-omics integration analysis to provide an extensible framework toward the in-depth understanding of biological processes. However, algorithms designed specifically for ST technologies to meet these requirements are still in their infancy. Here, we review computational approaches to these problems in light of corresponding issues and challenges, and present forward-looking insights into algorithm development.
bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2023, Issue: unknown
Published: Nov. 2, 2023
Abstract
Hundreds of millions of single cells have been analyzed to date using high throughput transcriptomic methods, thanks to technological advances driving the increasingly rapid generation of single-cell data. This provides an exciting opportunity for unlocking new insights into health and disease, made possible by meta-analysis that spans diverse datasets, building on recent advances in large language models and other machine learning approaches. Despite the promise of these and emerging analytical tools for analyzing large amounts of data, a major challenge remains in the sheer number of datasets, inconsistent format, and data accessibility. Many datasets are available via unique portals and platforms that often lack interoperability. Here, we present CZ CellxGene Discover (cellxgene.cziscience.com), a data platform that provides curated and interoperable single-cell data. This resource, a free-to-use online data portal, hosts a growing corpus of community contributed data that spans more than 50 million cells. Curated, standardized, and associated with consistent cell-level metadata, this collection of interoperable single-cell transcriptomic data is the largest of its kind. A suite of features enables accessibility and reusability of the data via both computational and visual interfaces to allow researchers to rapidly explore individual datasets and perform cross-corpus analysis. This functionality is enabling meta-analyses of tens of millions of cells across studies and tissues, providing global views of human cells at the resolution of single cells.