COSIME: Cooperative multi-view integration and Scalable and Interpretable Model Explainer
Jerome J. Choi, Noah Cohen Kalafut, Tim Gruenloh, et al.
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 14, 2025
Single-omics approaches often provide a limited view of complex biological systems, whereas multiomics integration offers a more comprehensive understanding by combining diverse data views. However, integrating heterogeneous data types and interpreting the intricate relationships between features, both within and across different views, remains a bottleneck. To address these challenges, we introduce COSIME (Cooperative Multi-view Integration and Scalable Interpretable Model Explainer). COSIME uses backpropagation of Learnable Optimal Transport (LOT) to deep neural networks, enabling the learning of latent features from multiple views to predict disease phenotypes. In addition, COSIME incorporates Monte Carlo sampling to efficiently estimate Shapley values and Shapley-Taylor indices, enabling the assessment of both feature importance and their pairwise interactions (synergistic or antagonistic) in predicting phenotypes. We applied COSIME to both simulated and real-world datasets, including single-cell transcriptomics, spatial transcriptomics, epigenomics, and metabolomics, specifically for Alzheimer's disease-related phenotypes. Our results demonstrate that COSIME significantly improves prediction performance while offering enhanced interpretability of feature relationships. For example, we identified that synergistic interactions between microglia and astrocyte genes associated with AD are likely to be active at the edges of the middle temporal gyrus, as indicated by spatial locations. Finally, COSIME is open-source and available for general use.
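The abstract mentions Monte Carlo sampling as a way to estimate Shapley values efficiently. The sketch below illustrates the general permutation-sampling idea only; it is not COSIME's implementation. The names `mc_shapley` and `toy_value` and the toy value function (which includes a synergy term between features 1 and 2) are hypothetical and chosen purely for illustration.

```python
import random


def mc_shapley(value_fn, n_features, n_samples=2000, seed=0):
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled feature orderings (permutation sampling)."""
    rng = random.Random(seed)
    phi = [0.0] * n_features
    for _ in range(n_samples):
        perm = list(range(n_features))
        rng.shuffle(perm)
        coalition = set()
        prev = value_fn(frozenset(coalition))
        for i in perm:
            # Marginal contribution of feature i given the features before it.
            coalition.add(i)
            cur = value_fn(frozenset(coalition))
            phi[i] += cur - prev
            prev = cur
    return [p / n_samples for p in phi]


def toy_value(coalition):
    """Hypothetical model output for a subset of present features:
    additive effects for features 0 and 1, plus a synergy that only
    appears when features 1 and 2 are present together."""
    v = 0.0
    if 0 in coalition:
        v += 1.0
    if 1 in coalition:
        v += 2.0
    if 1 in coalition and 2 in coalition:
        v += 4.0
    return v


shapley = mc_shapley(toy_value, n_features=3)
```

In this toy setting the synergy term is split between features 1 and 2, so the estimates for those two features converge toward 4 and 2 as the number of sampled permutations grows, while feature 0 keeps its purely additive contribution of 1. Pairwise interaction indices such as Shapley-Taylor require a separate estimator and are not covered by this sketch.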
Language: English
scCompass: An integrated cross-species scRNA-seq database for AI-ready
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Nov. 15, 2024
Abstract
Emerging single-cell sequencing technology has generated large amounts of data, allowing analysis of cellular dynamics and gene regulation at single-cell resolution. Advances in artificial intelligence enhance life sciences research by delivering critical insights and optimizing data processes. However, inconsistent processing quality and standards remain a major challenge. Here we propose scCompass, which provides a solution to build a large-scale, cross-species, model-friendly data collection. By applying standardized pre-processing, scCompass integrates and curates transcriptomic data from 13 species and nearly 105 million single cells. Using this extensive dataset, we are able to obtain stable expression genes (SEGs) and organ-specific expression genes (OSGs) in human and mouse. We provide different scalable datasets that can be easily adapted for AI model training, and pretrained checkpoints with state-of-the-art (SOTA) foundation models. In summary, the AI-readiness of scCompass, combined with user-friendly data sharing, visualization and online analysis, greatly simplifies data access and exploitation for researchers in single-cell biology (http://www.bdbe.cn/kun).
Language: English