Abstract
Background
Droplet-based single-cell RNA sequence analyses assume that all acquired RNAs are endogenous to cells. However, any cell-free RNAs contained within the input solution are also captured by these assays. This sequencing of cell-free RNA constitutes a background contamination that confounds the biological interpretation of single-cell transcriptomic data.
Results
We demonstrate that contamination from this "soup" of cell-free RNAs is ubiquitous, with experiment-specific variations in composition and magnitude. We present a method, SoupX, for quantifying the extent of the contamination and estimating "background-corrected" cell expression profiles that seamlessly integrate with existing downstream analysis tools. Applying this method to several datasets using multiple droplet sequencing technologies, we demonstrate that its application improves biological interpretation of otherwise misleading data, as well as improving quality control metrics.
Conclusions
We present SoupX, a tool for removing ambient RNA contamination from droplet-based single-cell RNA sequencing experiments. This tool has broad applicability, and its application can improve the biological utility of existing and future datasets.
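The idea of estimating "background-corrected" expression profiles can be illustrated with a deliberately simplified sketch. It is not the SoupX algorithm (SoupX is an R package, and its estimation procedure is not described in this abstract). The sketch assumes that the ambient profile can be read from pooled empty-droplet counts and that the contamination fraction `rho` is already known. The function names and gene counts are hypothetical.

```python
from typing import Dict


def estimate_soup_profile(empty_droplet_counts: Dict[str, int]) -> Dict[str, float]:
    """Convert pooled counts from empty droplets into per-gene fractions."""
    total = sum(empty_droplet_counts.values())
    if total == 0:
        raise ValueError("empty droplets contain no counts")
    return {gene: count / total for gene, count in empty_droplet_counts.items()}


def correct_cell(
    cell_counts: Dict[str, float],
    soup_profile: Dict[str, float],
    rho: float,
) -> Dict[str, float]:
    """Subtract the expected ambient counts from one cell, clipping at zero.

    rho is the assumed fraction of the cell's counts that come from the soup.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie between 0 and 1")
    n_counts = sum(cell_counts.values())
    corrected = {}
    for gene, observed in cell_counts.items():
        expected_soup = rho * n_counts * soup_profile.get(gene, 0.0)
        corrected[gene] = max(observed - expected_soup, 0.0)
    return corrected


# Illustrative numbers only: a soup dominated by one highly abundant transcript.
soup = estimate_soup_profile({"HBB": 80, "CD3E": 10, "MS4A1": 10})
cell = {"HBB": 10, "CD3E": 80, "MS4A1": 10}
corrected = correct_cell(cell, soup, rho=0.1)
print(corrected)
```

In this toy case most of the cell's `HBB` signal is attributed to the background, while the cell's dominant gene is barely changed. That is the kind of effect the abstract describes as improving the interpretation of otherwise misleading data.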
Scanpy is a scalable toolkit for analyzing single-cell gene expression data. It includes methods for preprocessing, visualization, clustering, pseudotime and trajectory inference, differential expression testing, and simulation of gene regulatory networks. Its Python-based implementation efficiently deals with data sets of more than one million cells (https://github.com/theislab/Scanpy). Along with Scanpy, we present AnnData, a generic class for handling annotated data matrices (https://github.com/theislab/anndata).
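To make the notion of an annotated data matrix concrete, here is a minimal pure-Python sketch. It is not the AnnData implementation or its API. The class name `AnnotatedMatrix` and the method `subset_obs` are hypothetical. Only the attribute names `X`, `obs` and `var` are borrowed from AnnData's naming.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class AnnotatedMatrix:
    """Hypothetical stand-in for an annotated data matrix (not the AnnData API)."""

    X: List[List[float]]       # rows are cells, columns are genes
    obs: Dict[str, List[Any]]  # one annotation value per cell (row)
    var: Dict[str, List[Any]]  # one annotation value per gene (column)

    def __post_init__(self) -> None:
        n_obs, n_var = self.shape
        if any(len(row) != n_var for row in self.X):
            raise ValueError("all rows of X must have the same length")
        for name, values in self.obs.items():
            if len(values) != n_obs:
                raise ValueError(f"obs[{name!r}] needs one value per row")
        for name, values in self.var.items():
            # Column count is unknown when X has no rows, so skip the check then.
            if self.X and len(values) != n_var:
                raise ValueError(f"var[{name!r}] needs one value per column")

    @property
    def shape(self) -> Tuple[int, int]:
        return (len(self.X), len(self.X[0]) if self.X else 0)

    def subset_obs(self, mask: List[bool]) -> "AnnotatedMatrix":
        """Keep the rows where mask is True, carrying annotations along."""
        if len(mask) != len(self.X):
            raise ValueError("mask needs one entry per row")
        keep = [i for i, flag in enumerate(mask) if flag]
        return AnnotatedMatrix(
            X=[self.X[i] for i in keep],
            obs={k: [v[i] for i in keep] for k, v in self.obs.items()},
            var={k: list(v) for k, v in self.var.items()},
        )


adata = AnnotatedMatrix(
    X=[[1, 0, 3], [0, 2, 0], [4, 0, 1]],
    obs={"cluster": ["A", "B", "A"]},
    var={"gene": ["g1", "g2", "g3"]},
)
cluster_a = adata.subset_obs([c == "A" for c in adata.obs["cluster"]])
print(cluster_a.shape, cluster_a.obs)
```

Keeping the matrix and its row and column annotations in one object means that a subset can never leave labels out of step with the data. This is the motivation for a generic annotated-matrix class.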
Nucleic Acids Research, Year: 2018, Vol. 47(D1), pp. D766-D773
Published: Oct. 8, 2018
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature, to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.
Cell, Year: 2018, Vol. 174(4), pp. 999-1014.e22
Published: Aug. 1, 2018
The mammalian nervous system executes complex behaviors controlled by specialized, precisely positioned, and interacting cell types. Here, we used RNA sequencing of half a million single cells to create a detailed census of cell types in the mouse nervous system. We mapped cell types spatially and derived a hierarchical, data-driven taxonomy. Neurons were the most diverse and were grouped by developmental anatomical units and by the expression of neurotransmitters and neuropeptides. Neuronal diversity was driven by genes encoding cell identity, synaptic connectivity, neurotransmission, and membrane conductance. We discovered seven distinct, regionally restricted astrocyte types that obeyed developmental boundaries and correlated with the spatial distribution of key glutamate and glycine neurotransmitters. In contrast, oligodendrocytes showed a loss of regional identity followed by a secondary diversification. The resource presented here lays a solid foundation for understanding the molecular architecture of the mammalian nervous system and enables genetic manipulation of specific cell types.
Cell, Year: 2020, Vol. 181(5), pp. 1016-1035.e19
Published: April 27, 2020
There is pressing urgency to understand the pathogenesis of the severe acute respiratory syndrome coronavirus clade 2 (SARS-CoV-2), which causes the disease COVID-19. SARS-CoV-2 spike (S) protein binds angiotensin-converting enzyme 2 (ACE2), and in concert with host proteases, principally transmembrane serine protease 2 (TMPRSS2), promotes cellular entry. The cell subsets targeted by SARS-CoV-2 in host tissues and the factors that regulate ACE2 expression remain unknown. Here, we leverage human, non-human primate, and mouse single-cell RNA-sequencing (scRNA-seq) datasets across health and disease to uncover putative targets of SARS-CoV-2 among tissue-resident cell subsets. We identify ACE2 and TMPRSS2 co-expressing cells within lung type II pneumocytes, ileal absorptive enterocytes, and nasal goblet secretory cells. Strikingly, we discovered that ACE2 is a human interferon-stimulated gene (ISG) in vitro using airway epithelial cells and extend our findings to in vivo viral infections. Our data suggest that SARS-CoV-2 could exploit species-specific interferon-driven upregulation of ACE2, a tissue-protective mediator during lung injury, to enhance infection.
Journal of The Royal Society Interface, Year: 2018, Vol. 15(141), p. 20170387
Published: April 1, 2018
Deep learning describes a class of machine learning algorithms that are capable of combining raw inputs into layers of intermediate features. These algorithms have recently shown impressive results across a variety of domains. Biology and medicine are data-rich disciplines, but the data are complex and often ill-understood. Hence, deep learning techniques may be particularly well suited to solve problems of these fields. We examine applications of deep learning to a variety of biomedical problems (patient classification, fundamental biological processes and treatment of patients) and discuss whether deep learning will be able to transform these tasks or if the biomedical sphere poses unique challenges. Following from an extensive literature review, we find that deep learning has yet to revolutionize biomedicine or definitively resolve any of the most pressing challenges in the field, but promising advances have been made on the prior state of the art. Even though improvements over previous baselines have been modest in general, the recent progress indicates that deep learning methods will provide valuable means for speeding up or aiding human investigation. Though progress has been made linking a specific neural network's prediction to input features, understanding how users should interpret these models to make testable hypotheses about the system under study remains an open challenge. Furthermore, the limited amount of labelled data for training presents problems in some domains, as do legal and privacy constraints on work with sensitive health records. Nonetheless, we foresee deep learning enabling changes at both bench and bedside with the potential to transform several areas of biology and medicine.
Molecular Systems Biology, Year: 2019, Vol. 15(6)
Published: June 1, 2019
Single-cell RNA-seq has enabled gene expression to be studied at an unprecedented resolution. The promise of this technology is attracting a growing user base for single-cell analysis methods. As more analysis tools are becoming available, it is becoming increasingly difficult to navigate this landscape and produce an up-to-date workflow to analyse one's data. Here, we detail the steps of a typical single-cell RNA-seq analysis, including pre-processing (quality control, normalization, data correction, feature selection, and dimensionality reduction) and cell- and gene-level downstream analysis. We formulate current best-practice recommendations for these steps based on independent comparison studies. We have integrated these best-practice recommendations into a workflow, which we apply to a public dataset to further illustrate how these steps work in practice. Our documented case study can be found at https://www.github.com/theislab/single-cell-tutorial. This review will serve as a workflow tutorial for new entrants into the field, and help established users update their analysis pipelines.
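The first pre-processing steps named above (quality control and normalization, followed by a log transform) can be sketched in plain Python. This is a toy illustration, not the workflow from the tutorial repository. The function names, the thresholds and the tiny count matrix are assumptions chosen for readability, not recommended values.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows are cells, columns are genes


def quality_control(cells: Matrix, min_counts: float, min_genes: int) -> Matrix:
    """Keep cells with enough total counts and enough detected genes."""
    kept = []
    for counts in cells:
        n_counts = sum(counts)
        n_genes = sum(1 for c in counts if c > 0)
        if n_counts >= min_counts and n_genes >= min_genes:
            kept.append(list(counts))
    return kept


def normalize_total(cells: Matrix, target_sum: float = 10_000.0) -> Matrix:
    """Scale each cell so that its counts sum to target_sum."""
    normalized = []
    for counts in cells:
        total = sum(counts)
        if total == 0:
            normalized.append([0.0 for _ in counts])  # nothing to scale
        else:
            normalized.append([c * target_sum / total for c in counts])
    return normalized


def log1p_transform(cells: Matrix) -> Matrix:
    """Apply log(1 + x) to every entry."""
    return [[math.log1p(value) for value in row] for row in cells]


raw = [[5, 0, 5], [1, 0, 0], [2, 2, 6]]  # illustrative counts only
filtered = quality_control(raw, min_counts=5, min_genes=2)
normalized = normalize_total(filtered, target_sum=10.0)
logged = log1p_transform(normalized)
print(len(filtered), logged)
```

The second cell is dropped because it has only one count in one gene. The remaining cells are scaled to a common depth before the log transform, so differences in sequencing depth do not dominate later steps such as feature selection or dimensionality reduction.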