Psychiatry Research,
Journal Year:
2024,
Volume and Issue:
334, P. 115804 - 115804
Published: Feb. 18, 2024
• MDD has substantial changes in the structure and function of gut microbiota.
• MDD exhibited decreased amino acids and bile acids and increased lipids in the blood.
• The blood immune cell subtypes in MDD tend to promote inflammation.
• MDD could be divided into two subtypes, one is correlated with relapse.
• We revealed integrative discriminative signatures for distinguishing MDD from HC.
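Major depressive disorder (MDD) involves systemic changes in peripheral blood and gut microbiota, but the current understanding is incomplete. Herein, we conducted a multi-omics analysis of fecal and blood samples obtained from an observational cohort including MDD patients (n = 99) and healthy control (HC, n = 50). 16S rRNA sequencing of gut microbiota showed structural alterations in MDD, as characterized by changes in Enterococcus. Metagenomics sequencing of gut microbiota showed functional alterations, including upregulation of the superpathway of the glyoxylate cycle and fatty acid degradation and downregulation of various metabolic pathways in MDD. Plasma metabolomics revealed decreased amino acids and bile acids, together with increased sphingolipids and cholesterol esters in MDD. Notably, metabolites involved in arginine and proline metabolism were decreased while the sphingolipid metabolic pathway increased. Mass cytometry of blood immune cell subtypes showed rises in proinflammatory subsets and declines in anti-inflammatory subsets in MDD. Furthermore, our findings revealed disease severity-related factors of MDD. Interestingly, MDD could be classified into two subtypes that were highly correlated with relapse. Moreover, we established discriminative signatures that differentiate MDD from HC. These findings contribute to a comprehensive understanding of MDD pathogenesis and provide valuable resources for the discovery of biomarkers.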
Molecular Systems Biology,
Journal Year:
2018,
Volume and Issue:
14(6)
Published: June 1, 2018
Method | 20 June 2018 | Open Access | Transparent process
Multi-Omics Factor Analysis—a framework for unsupervised integration of multi-omics data sets
Ricard Argelaguet‡ (orcid.org/0000-0003-3199-3722), European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
Britta Velten‡ (orcid.org/0000-0002-8397-3515), European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
Damien Arnol (orcid.org/0000-0003-2462-534X), European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK
Sascha Dietrich (orcid.org/0000-0002-0648-1832), Heidelberg University Hospital, Heidelberg, Germany
Thorsten Zenz (orcid.org/0000-0001-7890-9845), Heidelberg University Hospital, Heidelberg, Germany; German Cancer Research Center (dkfz) and National Center for Tumor Diseases (NCT), Heidelberg, Germany; Hematology, University Hospital Zurich and University of Zurich, Switzerland
John C Marioni (orcid.org/0000-0001-9092-0852), European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK; Cancer Research UK Cambridge Institute, University of Cambridge, Cambridge, UK; Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
Florian Buettner* (orcid.org/0000-0001-5587-6761), European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK; Helmholtz Zentrum München–German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
Wolfgang Huber* (orcid.org/0000-0002-0474-2218), European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
Oliver Stegle* (orcid.org/0000-0002-8818-7193), European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK; European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
‡These authors contributed equally to this work
*Corresponding author
Molecular Systems Biology (2018) 14: e8124, https://doi.org/10.15252/msb.20178124
Abstract
Multi-omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi-Omics Factor Analysis (MOFA), a computational method for discovering the principal sources of variation in multi-omics data sets. MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities. The learnt factors enable a variety of downstream analyses, including identification of sample subgroups, data imputation and the detection of outlier samples. We applied MOFA to a cohort of 200 patient samples of chronic lymphocytic leukaemia, profiled for somatic mutations, RNA expression, DNA methylation and ex vivo drug responses. MOFA identified major dimensions of disease heterogeneity, including immunoglobulin heavy-chain variable region status, trisomy of chromosome 12 and previously underappreciated drivers, such as response to oxidative stress. In a second application, we used MOFA to analyse single-cell multi-omics data, identifying coordinated transcriptional and epigenetic changes along cell differentiation.
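Synopsis
Multi-Omics Factor Analysis (MOFA) is a computational framework for unsupervised discovery of the principal axes of biological and technical variation when multiple omics assays are applied to the same samples.
• MOFA is a broadly applicable approach for unsupervised multi-omics data integration.
• The inferred latent factors represent the underlying principal axes of heterogeneity across the samples. Factors can be shared by multiple data modalities or can be data-type specific.
• The model flexibly handles missing values and different data types.
• In an application to Chronic Lymphocytic Leukaemia, MOFA discovers a low dimensional space spanned by known clinical markers and underappreciated axes of variation such as oxidative stress.
• In an application to multi-omics profiles from single-cells, MOFA recovers differentiation trajectories and identifies coordinated variation between the transcriptome and the epigenome.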
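Introduction
Technological advances increasingly enable multiple biological layers to be probed in parallel, ranging from genome, epigenome, transcriptome, proteome and metabolome to phenome profiling (Hasin et al, 2017). Integrative analyses that use information across these data modalities promise to deliver more comprehensive insights into the biological systems under study. Motivated by this, multi-omics profiling is increasingly applied across biological domains, including cancer biology (Gerstung et al, 2015; Iorio et al, 2016; Mertins et al, 2016; Cancer Genome Atlas Research Network, 2017), regulatory genomics (Chen et al, 2016), microbiology (Kim et al, 2016) or host-pathogen biology (Soderholm et al, 2016). Most recent technological advances have also enabled performing multi-omics analyses at the single-cell level (Macaulay et al, 2015; Angermueller et al, 2016; Guo et al, 2017; Clark et al, 2018; Colomé-Tatché & Theis, 2018). A common aim of such applications is to characterize heterogeneity between samples, as manifested in one or several of the data modalities (Ritchie et al, 2015). Multi-omics profiling is particularly appealing if the relevant axes of variation are not known a priori, and hence may be missed by studies that consider a single data modality or targeted approaches.

A basic strategy for the integration of omics data is testing for marginal associations between different data modalities. A prominent example is molecular quantitative trait locus mapping, where large numbers of association tests are performed between individual genetic variants and gene expression levels (GTEx Consortium, 2015) or epigenetic marks (Chen et al, 2016). While eminently useful for variant annotation, such association studies are inherently local and do not provide a coherent global map of the molecular differences between samples. A second strategy is the use of kernel- or graph-based methods to combine different data types into a common similarity network between samples (Lanckriet et al, 2004; Wang et al, 2014); however, it is difficult to pinpoint the molecular determinants of the resulting graph structure. Related to this, there exist generalizations of other clustering methods to reconstruct discrete groups of samples based on multiple data modalities (Shen et al, 2009; Mo et al, 2013).

A key challenge that is not sufficiently addressed by these approaches is interpretability. In particular, it would be desirable to reconstruct the underlying factors that drive the observed variation across samples. These could be continuous gradients, discrete clusters or combinations thereof. Such factors would help in establishing or explaining associations with external data such as phenotypes or clinical covariates. Although factor models that aim to address this have previously been proposed (e.g. Meng et al, 2014, 2016; Tenenhaus et al, 2014; preprint: Singh et al, 2018), these methods either lack sparsity, which can reduce interpretability, or require a substantial number of parameters to be determined using computationally demanding cross-validation or post hoc. Further challenges faced by existing methods are computational scalability to larger data sets, handling of missing values and non-Gaussian data modalities, such as binary readouts or count-based traits.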
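Results
MOFA is a statistical model for integrating multiple modalities of omics data in an unsupervised fashion. Intuitively, MOFA can be viewed as a versatile and statistically rigorous generalization of principal component analysis (PCA) to multi-omics data. Given several data matrices with measurements of multiple omics data types on the same or on partially overlapping sets of samples, MOFA infers an interpretable low-dimensional data representation in terms of (hidden) factors (Fig 1A). These learnt factors capture major sources of variation across data modalities, thus facilitating the identification of continuous molecular gradients or discrete subgroups of samples. The inferred factor loadings can be sparse, thereby facilitating the linkage between the factors and the most relevant molecular features. Importantly, MOFA disentangles to what extent each factor is unique to a single data modality or is manifested in multiple modalities (Fig 1B), thereby revealing shared axes of variation between the different omics layers. Once trained, the model output can be used for a range of downstream analyses, including visualization, clustering and classification of samples in the low-dimensional space(s) spanned by the factors, as well as the automated annotation of factors using (gene set) enrichment analysis, the identification of outlier samples and the imputation of missing values (Fig 1B).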
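Figure 1. Multi-Omics Factor Analysis: model overview and downstream analyses
Model overview: MOFA takes M data matrices as input (Y1,…, YM), one or more from each data modality, with co-occurrent samples but features that are not necessarily related and that can differ in numbers. MOFA decomposes these matrices into a matrix of factors (Z) for each sample and M weight matrices, one for each data modality (W1,.., WM). White cells in the weight matrices correspond to zeros, i.e. inactive features, whereas the cross symbol in the data matrices denotes missing values.
The fitted MOFA model can be queried for different downstream analyses, including (i) variance decomposition, assessing the proportion of variance explained by each factor in each data modality, (ii) semi-automated factor annotation based on the inspection of loadings and gene set enrichment analysis, (iii) visualization of the samples in the factor space and (iv) imputation of missing values, including missing assays.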
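The decomposition described in the Figure 1 caption (each data matrix approximated by a shared factor matrix Z times a view-specific weight matrix) and the per-view variance decomposition can be sketched on toy data. This is only an illustration under loud assumptions: it uses a plain truncated SVD on concatenated views as a stand-in, not MOFA's variational inference, sparsity priors or likelihood models, and the view names, sizes and noise level are invented.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy multi-view data (all names and sizes are hypothetical).
N, K = 100, 2
dims = {"mRNA": 40, "methylation": 60}
Z_true = rng.normal(size=(N, K))
W_true = {m: rng.normal(size=(d, K)) for m, d in dims.items()}
W_true["methylation"][:, 1] = 0.0   # second factor is inactive in this view
Y = {m: Z_true @ W_true[m].T + 0.5 * rng.normal(size=(N, dims[m])) for m in dims}

# Stand-in for model fitting: centre each view, concatenate features,
# and take a rank-K truncated SVD (NOT the MOFA inference scheme).
Yc = {m: y - y.mean(axis=0) for m, y in Y.items()}
U, s, Vt = np.linalg.svd(np.hstack(list(Yc.values())), full_matrices=False)
Z = U[:, :K] * s[:K]                                   # N x K factor matrix
splits = np.cumsum(list(dims.values()))[:-1]
W = dict(zip(dims, np.split(Vt[:K].T, splits, axis=0)))  # one D_m x K weight matrix per view

def r2_per_factor(y, z, w):
    """Fraction of a centred view's variance explained by each factor on its own."""
    total = np.sum(y ** 2)
    return np.array([
        1.0 - np.sum((y - np.outer(z[:, k], w[:, k])) ** 2) / total
        for k in range(z.shape[1])
    ])

r2 = {m: r2_per_factor(Yc[m], Z, W[m]) for m in dims}
for m, v in r2.items():
    print(m, np.round(v, 3))
```

In this toy setting one factor should explain most of the variance in the "methylation" view while the other explains very little, which mirrors the idea of factors that are shared across views versus factors that are specific to one view.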
Technically, builds upon group (Virtanen 2012; Khan Klami Bunte Zhao Leppäaho Kaski, adapted requirements (Materials Methods): fast inference variational approximation, sparse solutions interpretation, efficient flexible combination likelihood enables diverse binary-, count- continuous-valued relationship previous Virtanen 2013; Remes Hore Leppäaho 2017) discussed Materials Methods Appendix Table S3. implemented well-documented open-source software comes tutorials workflows domains Methods). Taken together, functionalities powerful tool disentangling studies.
validation comparison simulated
First, validate MOFA, its generative model, varying views, models, Methods, S1). found was able accurately dimension, except settings high proportions (Appendix Fig account observations fit simulating count Figs S2 S3). compared two reported integration: GFA (Leppäaho iCluster (Mo Over simulations, tended infer redundant S4) were less accurate recovering patterns activity views S5). than EV1). For example, training CLL next, required 25 min versus 34 h 5–6 days iCluster.

Figure EV1. Scalability iCluster
Time (red), (blue) (green) function K, D, N M. Baseline = 3, K 10, D 1,000 100 5% Shown average time 10 trials, error bars denote standard deviation. only shown lowest all training.
Application leukaemia study (CLL), combined mutation (Dietrich 2A). Notably, nearly 40% some types; value scenario uncommon studies, designed cope Methods; configured order accommodate

Figure 2.
A. Study Data rows (D features) (N) columns, grey bars.
B, C. (B) Proportion total (R2) assay (C) cumulative explained.
D. Absolute top 1 2 Mutations
E. Visualization colours IGHV status tumours; shape colour tone indicate status.
F. Number enriched Reactome per (FDR < 1%). categories pathways defined S2.
(minimum 2% least type; robust algorithm initialization subsampling S6 S7). largely orthogonal, capturing independent S6). Among these, active assays, indicating broad roles 2B). contrast, 3 5 4 only. Cumulatively, 41% 38% mRNA 24% 2C). trained excluding probe their redundancy, finding still recovered, while others dependent type S8). 2013), consistent instances S9).

important reveals axis attributed stress
As part pipeline, provides strategies identify aetiology weights aligned (IGHV), 2D E). Thus, correctly them (Zenz 2010; Fabbri Dalla-Favera, marker associated 1, surrogate state tumour's origin activation B-cell receptor. practice generally considered (Fabbri our results complex substructure 3A, S10). At current resolution, three subgroup Oakes al (2016) Queiros (2015) S11), although suggestive evidence continuum. connected S12 S13), genes linked (Vasconcelos 2005; Maloum Trojani Morabito Plesingerova 3B C) drugs target kinases receptor pathway 3D

Figure 3. Characterization
Beeswarm plot corresponding 3-means (LZ), intermediate (IZ) (HZ). largest absolute Plus minus symbols right sign loading. Genes highlighted orange described prognostic Heatmap (B). weights, annotated category. Drug curves stratified (A).
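The Figure 3 caption mentions a 3-means clustering of factor values into groups labelled LZ, IZ and HZ. A minimal sketch of one-dimensional k-means (Lloyd's algorithm) is shown here on simulated values. The data, the quantile-based initialisation and the assignment of the lowest cluster to "LZ" and the highest to "HZ" are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def kmeans_1d(x, k=3, n_iter=100):
    """Lloyd's algorithm on a 1-D array.

    Returns integer labels (0 = cluster with the lowest centre) and the sorted centres.
    """
    x = np.asarray(x, dtype=float)
    # Deterministic start: evenly spaced quantiles of the data.
    centres = np.quantile(x, (np.arange(k) + 0.5) / k)
    for _ in range(n_iter):
        labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
        new = np.array([
            x[labels == j].mean() if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new, centres):
            break
        centres = new
    # Final assignment against the final centres, relabelled by increasing centre.
    labels = np.argmin(np.abs(x[:, None] - centres[None, :]), axis=1)
    order = np.argsort(centres)
    rank = np.empty(k, dtype=int)
    rank[order] = np.arange(k)
    return rank[labels], centres[order]

# Simulated factor values with three well-separated groups (hypothetical data).
rng = np.random.default_rng(2)
factor1 = np.concatenate([
    rng.normal(-2.0, 0.2, 40),
    rng.normal(0.0, 0.2, 20),
    rng.normal(2.0, 0.2, 40),
])
labels, centres = kmeans_1d(factor1, k=3)
groups = np.array(["LZ", "IZ", "HZ"])[labels]   # assumed ordering of group names
print(np.bincount(labels), np.round(centres, 2))
```

Clustering a single factor in this way is only meaningful when the factor values show visible group structure; for a continuum, the cluster boundaries are arbitrary.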
Despite importance, accounted 20% suggesting existence heterogeneity. One 5, revealed tagged senescence (Figs 2F EV2A), heat-shock proteins (HSPs; EV2B C), essential protein folding up-regulated conditions (Srivastava, 2002; Åkerfelt 2010). HSP cancers tumour survival (Trachootham 2009), far family has received little attention context CLL. Consistent strongest stress, reactive oxygen species (ROS), damage apoptosis EV2D

Figure EV2. (oxidative factor)
5. Colours TNF, inflammatory marker. Gene (t-test, six Samples ordered Scaled loading,

captured 9% suggested aetiologies immune T-cell signalling 2F), likely due composition samples: comprised mainly B cells, possible contamination T monocytes S14). 11% samples' general sensitivity (Geeleher S15).
imputes
Next, explored annotations, missing, mis-annotated inaccurate, since they frequently imperfect surrogates (Westra 2011). Since biomarker impacting care, assessed consistency 176 out patients, agreement further allowed classifying patients lacked clinically measured EV3A B). Interestingly, assigned label. Upon nine cases showed signatures, borderline classification; remaining clearly discordant EV3C D). Additional whole exome sequencing confirmed outliers within EV3E F).

Figure EV3. Prediction
denoting predicted labels Pie chart showing imputed Sample-to-sample correlation ONO-4509 (not included data): Boxplots viability ONO-4509. middle; left right, viabilities M-CLL U-CLL shown, respectively. panels show concentrations tested. Boxes first third quartiles value. Whole mutations y-axis, separately labelled.

incomplete problem high-throughput ability fill entire both tasks, yielded predictions established strategies, feature-wise mean, SoftImpute (Mazumder 2010) k-nearest neighbour (Troyanskaya 2001; EV4, S16), GFA, especially case S17).

Figure EV4. Imputation
A, B. Considered SoftImpute, mean (Mean) (kNN). averages squared (MSE) 15 experiments increasing fractions considering (A) random random. Error plus error.
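The imputation comparison in the Figure EV4 caption (mean squared error on held-out values, with feature-wise mean imputation as one baseline) can be illustrated with a small simulation. The sketch below is an assumption-laden stand-in: the data are synthetic, the 20% masking rate is arbitrary, and the low-rank method is a simple iterative truncated-SVD fill, not MOFA, SoftImpute, GFA or k-nearest neighbour imputation.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic low-rank data plus noise (sizes and noise level are hypothetical).
N, D, K = 80, 40, 3
X = rng.normal(size=(N, K)) @ rng.normal(size=(K, D)) + 0.3 * rng.normal(size=(N, D))

# Hold out 20% of entries at random; True marks a masked (missing) value.
mask = rng.random(X.shape) < 0.2
X_obs = np.where(mask, np.nan, X)

def impute_mean(x):
    """Replace each missing entry with the mean of its feature (column)."""
    col_means = np.nanmean(x, axis=0)
    return np.where(np.isnan(x), col_means, x)

def impute_low_rank(x, rank, n_iter=50):
    """Iteratively refill missing entries from a rank-`rank` SVD reconstruction."""
    missing = np.isnan(x)
    filled = impute_mean(x)
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        U, s, Vt = np.linalg.svd(filled - mu, full_matrices=False)
        approx = (U[:, :rank] * s[:rank]) @ Vt[:rank] + mu
        filled = np.where(missing, approx, x)   # observed entries are kept unchanged
    return filled

def mse(x_hat):
    """Mean squared error on the held-out entries only."""
    return float(np.mean((x_hat[mask] - X[mask]) ** 2))

mse_mean = mse(impute_mean(X_obs))
mse_low_rank = mse(impute_low_rank(X_obs, rank=K))
print(f"mean imputation MSE: {mse_mean:.3f}")
print(f"low-rank imputation MSE: {mse_low_rank:.3f}")
```

Because the synthetic data are generated from a low-rank model, the low-rank fill is expected to beat the feature-wise mean here; on real data the ranking depends on how much shared structure the features actually have.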
Latent predictive outcomes
Finally, utility predictors outcomes. Three significantly next treatment (Cox regression, FDR 1%, 4A B): origin, Factors, 7 8, chemo-immunotherapy prior collection (P 0.01, t-test). captures del17p TP53 oncogenes (Garg Fluhr S18), 8 WNT S19).

Figure 4. Relationship
Association univariate Cox regression 174 (96
European Journal of Immunology,
Journal Year:
2019,
Volume and Issue:
49(10), P. 1457 - 1973
Published: Oct. 1, 2019
These guidelines are a consensus work of a considerable number of members of the immunology and flow cytometry community. They provide the theory and key practical aspects of flow cytometry enabling immunologists to avoid the common errors that often undermine immunological data. Notably, there are comprehensive sections of all major immune cell types with helpful Tables detailing phenotypes in murine and human cells. The latest flow cytometry techniques and applications are also described, featuring examples of the data that can be generated and, importantly, how the data can be analysed. Furthermore, there are tips, tricks and pitfalls to avoid, all written and peer-reviewed by leading experts in the field, making this an essential research companion.
Nature Communications,
Journal Year:
2021,
Volume and Issue:
12(1)
Published: June 8, 2021
Abstract
To fully utilize the advances in omics technologies and achieve a more comprehensive understanding of human diseases, novel computational methods are required for integrative analysis of multiple types of omics data. Here, we present a novel multi-omics integrative method named Multi-Omics Graph cOnvolutional NETworks (MOGONET) for biomedical classification. MOGONET jointly explores omics-specific learning and cross-omics correlation learning for effective multi-omics data classification. We demonstrate that MOGONET outperforms other state-of-the-art supervised multi-omics integrative analysis approaches from different biomedical classification applications using mRNA expression data, DNA methylation data, and microRNA expression data. Furthermore, MOGONET can identify important biomarkers from different omics data types related to the investigated biomedical problems.
Genes,
Journal Year:
2019,
Volume and Issue:
10(2), P. 87 - 87
Published: Jan. 28, 2019
Recent developments in high-throughput technologies have accelerated the accumulation of massive amounts of omics data from multiple sources: genome, epigenome, transcriptome, proteome, metabolome, etc. Traditionally, data from each source (e.g., genome) is analyzed in isolation using statistical and machine learning (ML) methods. Integrative analysis of multi-omics and clinical data is key to new biomedical discoveries and advancements in precision medicine. However, data integration poses new computational challenges as well as exacerbates the ones associated with single-omics studies. Specialized computational approaches are required to effectively and efficiently perform integrative analysis of biomedical data acquired from diverse modalities. In this review, we discuss state-of-the-art ML-based approaches for tackling five specific computational challenges associated with integrative analysis: curse of dimensionality, data heterogeneity, missing data, class imbalance and scalability issues.
Bioinformatics,
Journal Year:
2019,
Volume and Issue:
35(14), P. i501 - i509
Published: June 6, 2019
Historically, gene expression has been shown to be the most informative data for drug response prediction. Recent evidence suggests that integrating additional omics can improve the prediction accuracy which raises the question of how to integrate the additional omics. Regardless of the integration strategy, clinical utility and translatability are crucial. Thus, we reasoned a multi-omics approach combined with clinical datasets would improve drug response prediction and clinical relevance.
Frontiers in Genetics,
Journal Year:
2020,
Volume and Issue:
11
Published: Dec. 10, 2020
Multi-omics, variously called integrated omics, pan-omics, and trans-omics, aims to combine two or more omics data sets to aid in data analysis, visualization and interpretation to determine the mechanism of a biological process. Multi-omics efforts have taken center stage in biomedical research leading to the development of new insights into biological events and processes. However, the mushrooming of a myriad of tools, datasets, and approaches tends to inundate the literature and overwhelm researchers new to the field. The aims of this review are to provide an overview of the current state of the field, inform on available reliable resources, discuss the application of statistics and machine/deep learning in multi-omics analyses, discuss findable, accessible, interoperable, reusable (FAIR) research, and point to best practices in benchmarking. Thus, we provide guidance to interested users of the domain by addressing challenges of the underlying biology, giving an overview of the available toolset, addressing common pitfalls, and acknowledging current methods’ limitations. We conclude with practical advice and recommendations on software engineering and reproducibility practices to share a comprehensive awareness with new researchers in multi-omics for end-to-end workflow.
Nucleic Acids Research,
Journal Year:
2023,
Volume and Issue:
51(W1), P. W310 - W318
Published: May 11, 2023
Abstract
Microbiome studies have become routine in biomedical, agricultural and environmental sciences with diverse aims, including diversity profiling, functional characterization, and translational applications. The resulting complex, often multi-omics datasets demand powerful, yet user-friendly bioinformatics tools to reveal key patterns, important biomarkers, and potential activities. Here we introduce MicrobiomeAnalyst 2.0 to support comprehensive statistics, visualization, functional interpretation, and integrative analysis of data outputs commonly generated from microbiome studies. Compared to the previous version, MicrobiomeAnalyst 2.0 features three new modules: (i) a Raw Data Processing module for amplicon data processing and taxonomy annotation that connects directly with the Marker Data Profiling module for downstream statistical analysis; (ii) a Microbiome Metabolomics Profiling module to help dissect associations between community compositions and metabolic activities through joint analysis of paired microbiome and metabolomics datasets; and (iii) a Statistical Meta-Analysis module to help identify consistent signatures by integrating datasets across multiple studies. Other improvements include added support for multi-factor differential analysis and interactive visualizations for popular graphical outputs, updated methods for functional prediction and correlation analysis, and expanded taxon set libraries based on the latest literature. These new features are demonstrated using a multi-omics dataset from a recent type 1 diabetes study. MicrobiomeAnalyst 2.0 is freely available at microbiomeanalyst.ca.
Medicinal Research Reviews,
Journal Year:
2020,
Volume and Issue:
41(3), P. 1427 - 1473
Published: Dec. 9, 2020
Abstract
Neurological disorders significantly outnumber diseases in other therapeutic areas. However, developing drugs for central nervous system (CNS) disorders remains the most challenging area in drug discovery, accompanied with long timelines and high attrition rates. With the rapid growth of biomedical data enabled by advanced experimental technologies, artificial intelligence (AI) and machine learning (ML) have emerged as an indispensable tool to draw meaningful insights and improve decision making in drug discovery. Thanks to the advancements in AI and ML algorithms, now AI/ML-driven solutions have an unprecedented potential to accelerate the process of CNS drug discovery with better success rate. In this review, we comprehensively summarize AI/ML-powered pharmaceutical discovery efforts and their implementations in the CNS area. After introducing the AI/ML models as well as the conceptualization and data preparation, we outline the applications of AI/ML technologies to several key procedures in drug discovery, including target identification, compound screening, hit/lead generation and optimization, drug response and synergy prediction, de novo drug design, and drug repurposing. We review the current state-of-the-art of AI/ML-guided CNS drug discovery, focusing on blood–brain barrier permeability prediction and implementation into therapeutic discovery for neurological diseases. Finally, we discuss the major challenges and limitations of current approaches and possible future directions that may provide resolutions to these difficulties.