Spatiotemporal Multi-Omics Mapping Generates a Molecular Atlas of the Aortic Valve and Reveals Networks Driving Disease DOI Open Access
Florian Schlotter, Arda Halu, Shinji Goto

et al.

Circulation, Journal Year: 2018, Volume and Issue: 138(4), P. 377 - 393

Published: March 27, 2018

Background: No pharmacological therapy exists for calcific aortic valve disease (CAVD), which confers a dismal prognosis without invasive valve replacement. The search for therapeutics and early diagnostics is challenging because CAVD presents in multiple pathological stages. Moreover, it occurs in the context of a complex, multi-layered tissue architecture; a rich and abundant extracellular matrix phenotype; and a unique, highly plastic, and multipotent resident cell population. Methods: A total of 25 human stenotic aortic valves obtained from valve replacement surgeries were analyzed by multiple modalities, including transcriptomics and global unlabeled and label-based tandem-mass-tagged proteomics. Segmentation of valves into disease stage–specific samples was guided by near-infrared molecular imaging, and anatomic layer-specificity was facilitated by laser capture microdissection. Side-specific cell cultures were subjected to multiple calcifying stimuli, and their calcification potential and basal/stimulated proteomes were evaluated. Molecular (protein–protein) interaction networks were built, and their central proteins and disease associations were identified. Results: Global transcriptional and protein expression signatures differed between the nondiseased, fibrotic, and calcific stages of CAVD. Anatomic aortic valve microlayers exhibited unique proteome profiles that were maintained throughout disease progression and identified glial fibrillary acidic protein as a specific marker of valvular interstitial cells from the spongiosa layer. CAVD disease progression was marked by an emergence of smooth muscle cell activation, inflammation, and calcification-related pathways. Proteins overrepresented in the disease-prone fibrosa are functionally annotated to fibrosis and calcification pathways, and we found that in vitro, fibrosa-derived valvular interstitial cells demonstrated greater calcification potential than those from the ventricularis. These studies confirmed that the microlayer-specific proteome was preserved in cultured valvular interstitial cells, and that valvular interstitial cells exposed to alkaline phosphatase–dependent and alkaline phosphatase–independent calcifying stimuli had distinct proteome profiles, both of which overlapped with that of the whole tissue. Analysis of protein–protein interaction networks found a significant closeness to multiple inflammatory and fibrotic diseases.
Conclusions: A spatially and temporally resolved multi-omics, network, and systems biology strategy identifies the first molecular regulatory networks in CAVD, a cardiac condition without a pharmacological cure, and describes a novel means of systematic disease ontology that is broadly applicable to comprehensive omics studies of cardiovascular diseases.

Language: English

Multivariable association discovery in population-scale meta-omics studies DOI Creative Commons
Himel Mallick, Ali Rahnavard, Lauren J. McIver

et al.

PLoS Computational Biology, Journal Year: 2021, Volume and Issue: 17(11), P. e1009442 - e1009442

Published: Nov. 16, 2021

It is challenging to associate features such as human health outcomes, diet, environmental conditions, or other metadata to microbial community measurements, due in part to their quantitative properties. Microbiome multi-omics are typically noisy, sparse (zero-inflated), high-dimensional, extremely non-normal, and often in the form of count or compositional measurements. Here we introduce an optimized combination of novel and established methodology to assess multivariable association of microbial community features with complex metadata in population-scale observational studies. Our approach, MaAsLin 2 (Microbiome Multivariable Associations with Linear Models), uses generalized linear and mixed models to accommodate a wide variety of modern epidemiological studies, including cross-sectional and longitudinal designs, as well as a variety of data types (e.g., counts and relative abundances) with or without covariates and repeated measurements. To construct this method, we conducted a large-scale evaluation of a broad range of scenarios under which straightforward identification of meta-omics associations can be challenging. These simulation studies reveal that MaAsLin 2's linear model preserves statistical power in the presence of repeated measures and multiple covariates, while accounting for the nuances of meta-omics features and controlling false discovery. We also applied MaAsLin 2 to a microbial multi-omics dataset from the Integrative Human Microbiome (HMP2) project which, in addition to reproducing established results, revealed a unique, integrated landscape of inflammatory bowel diseases (IBD) across multiple time points and omics profiles.
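The abstract above mentions controlling false discovery across many feature-by-metadata tests. A minimal sketch of one common way to do that, the Benjamini-Hochberg adjustment, is given below. It is an illustration only: MaAsLin 2 is an R package and this is not its code or API, the choice of Benjamini-Hochberg is an assumption rather than something the abstract states, and the p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the tests, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        q_values[idx] = running_min
    return q_values


# Hypothetical p-values, one per tested feature (not from the paper).
p = [0.001, 0.008, 0.039, 0.041, 0.6]
q = benjamini_hochberg(p)
significant = [i for i, value in enumerate(q) if value <= 0.05]
print(q, significant)
```

In this toy input the third and fourth tests have raw p-values below 0.05 but are no longer called significant after adjustment, which is the behaviour the correction is meant to provide.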

Language: English

Citations

1408

Multi-omics Data Integration, Interpretation, and Its Application DOI Creative Commons

Indhupriya Subramanian, Srikant Verma, Shiva Kumar

et al.

Bioinformatics and Biology Insights, Journal Year: 2020, Volume and Issue: 14, P. 117793221989905 - 117793221989905

Published: Jan. 1, 2020

To study complex biological processes holistically, it is imperative to take an integrative approach that combines multi-omics data to highlight the interrelationships of the involved biomolecules and their functions. With the advent of high-throughput techniques and the availability of multi-omics data generated from a large set of samples, several promising tools and methods have been developed for data integration and interpretation. In this review, we collected the tools and methods that adopt an integrative approach to analyze multiple omics data and summarized their ability to address applications such as disease subtyping, biomarker prediction, and deriving insights into the data. We provide the methodology, use-cases, and limitations of these tools; a brief account of multi-omics data repositories and visualization portals; and the challenges associated with multi-omics data integration.

Language: English

Citations

1019

Multi‐Omics Factor Analysis—a framework for unsupervised integration of multi‐omics data sets DOI Creative Commons
Ricard Argelaguet, Britta Velten, Damien Arnol

et al.

Molecular Systems Biology, Journal Year: 2018, Volume and Issue: 14(6)

Published: June 1, 2018

Article type: Method. Published online 20 June 2018. Open Access. Molecular Systems Biology (2018) 14: e8124, https://doi.org/10.15252/msb.20178124

Authors: Ricard Argelaguet (orcid.org/0000-0003-3199-3722), Britta Velten (orcid.org/0000-0002-8397-3515), Damien Arnol (orcid.org/0000-0003-2462-534X), Sascha Dietrich (orcid.org/0000-0002-0648-1832), Thorsten Zenz (orcid.org/0000-0001-7890-9845), John C Marioni (orcid.org/0000-0001-9092-0852), Florian Buettner (orcid.org/0000-0001-5587-6761), Wolfgang Huber (orcid.org/0000-0002-0474-2218), Oliver Stegle (orcid.org/0000-0002-8818-7193). Ricard Argelaguet and Britta Velten contributed equally to this work; Florian Buettner, Wolfgang Huber and Oliver Stegle are corresponding authors.

Affiliations named in the source: European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, Cambridge, UK; European Molecular Biology Laboratory (EMBL), Heidelberg, Germany; Heidelberg University Hospital, Heidelberg, Germany; German Cancer Research Center (dkfz) and National Center for Tumor Diseases (NCT), Heidelberg, Germany; Department of Hematology, University Hospital Zurich, Zurich, Switzerland; Cancer Research UK Cambridge Institute; Wellcome Trust Sanger Institute; Helmholtz Zentrum München–German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany.

Abstract: Multi-omics studies promise the improved characterization of biological processes across molecular layers. However, methods for the unsupervised integration of the resulting heterogeneous data sets are lacking. We present Multi-Omics Factor Analysis (MOFA), a computational method for discovering the principal sources of variation in multi-omics data sets. MOFA infers a set of (hidden) factors that capture biological and technical sources of variability. It disentangles axes of heterogeneity that are shared across multiple modalities and those specific to individual data modalities. The learnt factors enable a variety of downstream analyses, including identification of sample subgroups, data imputation and the detection of outlier samples. We applied MOFA to a cohort of 200 patient samples of chronic lymphocytic leukaemia, profiled for somatic mutations, RNA expression, DNA methylation and ex vivo drug responses. MOFA identified major dimensions of disease heterogeneity, including immunoglobulin heavy-chain variable region status, trisomy of chromosome 12 and previously underappreciated drivers, such as response to oxidative stress. In a second application, we used MOFA to analyse single-cell multi-omics data, identifying coordinated transcriptional and epigenetic changes along cell differentiation.

Synopsis: Multi-Omics Factor Analysis (MOFA) is a computational method for the discovery of the principal sources of variation when multiple omics assays are applied to the same samples. MOFA is a broadly applicable approach for unsupervised multi-omics data integration. The inferred latent factors represent the underlying principal sources of variation across the samples. Factors can be shared by multiple data types or be data-type specific. The model flexibly handles missing values and different data types. In an application to Chronic Lymphocytic Leukaemia, MOFA discovers a low dimensional space spanned by known clinical markers and underappreciated axes of variation such as oxidative stress. In an application to multi-omics profiles from single cells, MOFA recovers differentiation trajectories and identifies coordinated variation between the transcriptome and the epigenome.

Introduction: Technological advances increasingly enable multiple biological layers to be probed in parallel, ranging from genome, epigenome, transcriptome, proteome and metabolome to phenome profiling (Hasin et al, 2017). Integrative analyses that use information across these data modalities promise to deliver more comprehensive insights into the biological systems under study. Motivated by this, multi-omics profiling is increasingly applied across biological domains, including cancer biology (Gerstung et al, 2015; Iorio et al, 2016; Mertins et al; Cancer Genome Atlas Research Network, 2017), regulatory genomics (Chen et al, 2016), microbiology (Kim et al, 2016) and host-pathogen biology (Soderholm et al, 2016). Most recent technological advances have also enabled performing multi-omics analyses at the single-cell level (Macaulay et al; Angermueller et al; Guo et al, 2017; Clark et al, 2018; Colomé-Tatché & Theis, 2018). A common aim of such applications is to characterize heterogeneity between samples, as manifested in one or several of the data modalities (Ritchie et al, 2015).
particularly appealing if relevant not priori, hence may missed consider single modality targeted approaches. basic strategy testing marginal associations prominent example quantitative trait locus mapping, where large numbers association tests performed genetic variants gene expression levels (GTEx Consortium, 2015) marks While em-inently useful variant annotation, inherently local do provide coherent global map differences kernel- graph-based combine types similarity network (Lanckriet 2004; Wang 2014); however, it difficult pinpoint determinants graph structure. Related there exist generalizations other clustering reconstruct discrete groups based on (Shen 2009; Mo 2013). key challenge sufficiently addressed approaches interpretability. particular, would desirable drive observed These could continuous gradients, clusters combinations thereof. Such help establishing explaining with external phenotypes covariates. Although factor models address been proposed (e.g. Meng 2014, Tenenhaus 2014; preprint: Singh 2018), either lack sparsity, which reduce interpretability, require substantial number parameters determined using computationally demanding cross-validation post hoc. Further challenges faced existing scalability larger sets, handling non-Gaussian modalities, binary readouts count-based traits. Results statistical integrating fashion. Intuitively, viewed versatile statistically rigorous generalization component analysis (PCA) data. Given matrices measurements partially overlapping interpretable low-dimensional representation terms (Fig 1A). thus facilitating gradients subgroups loadings sparse, thereby linkage most features. Importantly, what extent each unique 1B), revealing Once trained, output range visualization, classification space(s) factors, well automated annotation (gene set) enrichment analysis, 1B). Figure 1. Analysis: overview Model overview: takes M input (Y1,…, YM), modality, co-occurrent but features necessarily related differ numbers. 
decomposes matrix (Z) weight matrices, (W1,.., WM). White cells correspond zeros, i.e. inactive features, whereas cross symbol denotes values. fitted queried (i) variance decomposition, assessing proportion explained (ii) semi-automated inspection (iii) visualization (iv) values, assays. Download figure PowerPoint Technically, builds upon group (Virtanen 2012; Khan Klami Bunte Zhao Leppäaho Kaski, adapted requirements (Materials Methods): fast inference variational approximation, sparse solutions interpretation, efficient flexible combination likelihood enables diverse binary-, count- continuous-valued relationship previous Virtanen 2013; Remes Hore Leppáaho 2017) discussed Materials Methods Appendix Table S3. implemented well-documented open-source software comes tutorials workflows domains Methods). Taken together, functionalities powerful tool disentangling studies. validation comparison simulated First, validate MOFA, its generative model, varying views, models, Methods, S1). found was able accurately dimension, except settings high proportions (Appendix Fig account observations fit simulating count Figs S2 S3). compared two reported integration: GFA (Leppäaho iCluster (Mo Over simulations, tended infer redundant S4) were less accurate recovering patterns activity views S5). than EV1). For example, training CLL next, required 25 min versus 34 h 5–6 days iCluster. Click here expand figure. EV1. Scalability iClusterTime (red), (blue) (green) function K, D, N M. Baseline = 3, K 10, D 1,000 100 5% Shown average time 10 trials, error bars denote standard deviation. only shown lowest all training. Application leukaemia study (CLL), combined mutation (Dietrich 2A). Notably, nearly 40% some types; value scenario uncommon studies, designed cope Methods; configured order accommodate 2. A. Study Data rows (D features) (N) columns, grey bars. B, C. (B) Proportion total (R2) assay (C) cumulative explained. D. Absolute top 1 2 Mutations E. 
Visualization colours IGHV status tumours; shape colour tone indicate status. F. Number enriched Reactome per (FDR < 1%). categories pathways defined S2. (minimum 2% least type; robust algorithm initialization subsampling S6 S7). largely orthogonal, capturing independent S6). Among these, active assays, indicating broad roles 2B). contrast, 3 5 4 only. Cumulatively, 41% 38% mRNA 24% 2C). trained excluding probe their redundancy, finding still recovered, while others dependent type S8). 2013), consistent instances S9). important reveals axis attributed stress As part pipeline, provides strategies identify aetiology weights aligned (IGHV), 2D E). Thus, correctly them (Zenz 2010; Fabbri Dalla-Favera, marker associated 1, surrogate state tumour's origin activation B-cell receptor. practice generally considered (Fabbri our results complex substructure 3A, S10). At current resolution, three subgroup Oakes al (2016) Queiros (2015) S11), although suggestive evidence continuum. connected S12 S13), genes linked (Vasconcelos 2005; Maloum Trojani Morabito Plesingerova 3B C) drugs target kinases receptor pathway 3D 3. Characterization Beeswarm plot corresponding 3-means (LZ), intermediate (IZ) (HZ). largest absolute Plus minus symbols right sign loading. Genes highlighted orange described prognostic Heatmap (B). weights, annotated category. Drug curves stratified (A). Despite importance, accounted 20% suggesting existence heterogeneity. One 5, revealed tagged senescence (Figs 2F EV2A), heat-shock proteins (HSPs; EV2B C), essential protein folding up-regulated conditions (Srivastava, 2002; Åkerfelt 2010). HSP cancers tumour survival (Trachootham 2009), far family has received little attention context CLL. Consistent strongest stress, reactive oxygen species (ROS), damage apoptosis EV2D EV2. (oxidative factor) 5. Colours TNF, inflammatory marker. 
Gene (t-test, six Samples ordered Scaled loading, captured 9% suggested aetiologies immune T-cell signalling 2F), likely due composition samples: comprised mainly B cells, possible contamination T monocytes S14). 11% samples' general sensitivity (Geeleher S15). imputes Next, explored annotations, missing, mis-annotated inaccurate, since they frequently imperfect surrogates (Westra 2011). Since biomarker impacting care, assessed consistency 176 out patients, agreement further allowed classifying patients lacked clinically measured EV3A B). Interestingly, assigned label. Upon nine cases showed signatures, borderline classification; remaining clearly discordant EV3C D). Additional whole exome sequencing confirmed outliers within EV3E F). EV3. Prediction denoting predicted labels Pie chart showing imputed Sample-to-sample correlation ONO-4509 (not included data): Boxplots viability ONO-4509. middle; left right, viabilities M-CLL U-CLL shown, respectively. panels show concentrations tested. Boxes first third quartiles value. Whole mutations y-axis, separately labelled. incomplete problem high-throughput ability fill entire both tasks, yielded predictions established strategies, feature-wise mean, SoftImpute (Mazumder 2010) k-nearest neighbour (Troyanskaya 2001; EV4, S16), GFA, especially case S17). EV4. Imputation A, B. Considered SoftImpute, mean (Mean) (kNN). averages squared (MSE) 15 experiments increasing fractions considering (A) random random. Error plus error. Latent predictive outcomes Finally, utility predictors outcomes. Three significantly next treatment (Cox regression, FDR 1%, 4A B): origin, Factors, 7 8, chemo-immunotherapy prior collection (P 0.01, t-test). captures del17p TP53 oncogenes (Garg Fluhr S18), 8 WNT S19). 4. Relationship Association univariate Cox regression 174 (96
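The text above describes MOFA as decomposing each data matrix into a shared factor matrix Z and one weight matrix per modality (W1, ..., WM), and then asking what proportion of variance each factor explains in each assay. A minimal sketch of that variance-decomposition idea follows. It is not the MOFA software or its API: the function names, the toy matrices and the assumption that every view is already centred are all hypothetical, and the real model is fitted by variational inference rather than given Z and W up front.

```python
def reconstruct(Z, W):
    """Reconstruct one view as Z W^T, where Z is N x K and W is D x K."""
    return [[sum(z[k] * w[k] for k in range(len(z))) for w in W] for z in Z]


def r_squared(Y, Y_hat):
    """Fraction of variance explained, assuming Y is centred (illustrative assumption)."""
    ss_res = sum(
        (y - y_hat) ** 2
        for row, row_hat in zip(Y, Y_hat)
        for y, y_hat in zip(row, row_hat)
    )
    ss_tot = sum(y ** 2 for row in Y for y in row)
    return 1.0 - ss_res / ss_tot


def factor_r2(Y, Z, W, k):
    """Variance in view Y explained by factor k alone."""
    Z_k = [[z[k]] for z in Z]
    W_k = [[w[k]] for w in W]
    return r_squared(Y, reconstruct(Z_k, W_k))


# Toy data: 4 samples, 2 factors, 2 views with 2 features each (all hypothetical).
Z = [[1, 0], [-1, 0], [0, 1], [0, -1]]
W_rna = [[2, 0], [1, 0]]    # factor 0 is active in this view, factor 1 is not
W_meth = [[0, 3], [1, 1]]   # mostly factor 1, with a small factor 0 contribution
Y_rna = reconstruct(Z, W_rna)
Y_meth = reconstruct(Z, W_meth)

for name, Y, W in [("rna", Y_rna, W_rna), ("meth", Y_meth, W_meth)]:
    print(name, [round(factor_r2(Y, Z, W, k), 3) for k in range(2)])
```

The zero entries in the weight matrices play the role of the inactive features mentioned in the Figure 1 description: a factor with all-zero weights in a view explains none of that view's variance.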

Language: English

Citations

922

mRNAs, proteins and the emerging principles of gene expression control DOI
Christopher Buccitelli, Matthias Selbach

Nature Reviews Genetics, Journal Year: 2020, Volume and Issue: 21(10), P. 630 - 644

Published: July 24, 2020

Language: English

Citations

921

Using machine learning approaches for multi-omics data analysis: A review DOI
Parminder Singh Reel, Smarti Reel, Ewan R. Pearson

et al.

Biotechnology Advances, Journal Year: 2021, Volume and Issue: 49, P. 107739 - 107739

Published: March 29, 2021

Language: English

Citations

525

Quantification of the pace of biological aging in humans through a blood test, the DunedinPoAm DNA methylation algorithm DOI Creative Commons
Daniel W. Belsky, Avshalom Caspi, Louise Arseneault

et al.

eLife, Journal Year: 2020, Volume and Issue: 9

Published: May 5, 2020

Biological aging is the gradual, progressive decline in system integrity that occurs with advancing chronological age, causing morbidity and disability. Measurements of the pace of aging are needed as surrogate endpoints in trials of therapies designed to prevent disease by slowing biological aging. We report a blood-DNA-methylation measure that is sensitive to variation in the pace of biological aging among individuals born the same year. We first modeled change-over-time in 18 biomarkers tracking organ-system integrity across 12 years of follow-up in n = 954 members of the Dunedin Study born in 1972–1973. Rates of change in each biomarker over ages 26–38 years were composited to form a measure of aging-related decline, termed Pace-of-Aging. Elastic-net regression was used to develop a DNA-methylation predictor of Pace-of-Aging, called DunedinPoAm for Dunedin(P)ace(o)f(A)ging(m)ethylation. Validation analysis in cohort studies and the CALERIE trial provide proof-of-principle for DunedinPoAm as a single-time-point measure of a person's pace of biological aging.
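The compositing step described above (a rate of change per biomarker, combined across biomarkers into one score per person) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: the least-squares slope, the z-scoring across people, the equal-weight average, the assumption that higher values mean worse health for every biomarker, and all names and numbers are hypothetical. The elastic-net step that maps DNA methylation to the composite is not shown.

```python
from statistics import mean, pstdev


def slope(ages, values):
    """Ordinary least-squares slope of a biomarker against age."""
    a_bar, v_bar = mean(ages), mean(values)
    numerator = sum((a - a_bar) * (v - v_bar) for a, v in zip(ages, values))
    denominator = sum((a - a_bar) ** 2 for a in ages)
    return numerator / denominator


def pace_composite(ages, cohort):
    """cohort maps person -> biomarker -> values measured at each age.

    Slopes are z-scored across people per biomarker, then averaged per person.
    Assumes each biomarker's slopes vary across people (non-zero spread).
    """
    people = sorted(cohort)
    biomarkers = sorted(next(iter(cohort.values())))
    totals = {person: 0.0 for person in people}
    for biomarker in biomarkers:
        slopes = {person: slope(ages, cohort[person][biomarker]) for person in people}
        mu = mean(list(slopes.values()))
        sd = pstdev(list(slopes.values()))
        for person in people:
            totals[person] += (slopes[person] - mu) / sd
    return {person: totals[person] / len(biomarkers) for person in people}


# Hypothetical measurements at three assessment ages for three people.
ages = [26, 32, 38]
cohort = {
    "a": {"hba1c": [5.0, 5.1, 5.2], "bmi": [24.0, 24.0, 24.0]},
    "b": {"hba1c": [5.0, 5.3, 5.6], "bmi": [24.0, 25.0, 26.0]},
    "c": {"hba1c": [5.0, 5.2, 5.4], "bmi": [24.0, 24.5, 25.0]},
}
scores = pace_composite(ages, cohort)
print(scores)
```

Person "b" changes fastest on both biomarkers and therefore receives the highest composite score, while "a" receives the lowest.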

Language: English

Citations

417

SeSAMe: reducing artifactual detection of DNA methylation by Infinium BeadChips in genomic deletions DOI Creative Commons
Wanding Zhou, Timothy J. Triche, Peter W. Laird

et al.

Nucleic Acids Research, Journal Year: 2018, Volume and Issue: unknown

Published: July 20, 2018

We report a new class of artifacts in DNA methylation measurements from Illumina HumanMethylation450 and MethylationEPIC arrays. These artifacts reflect failed hybridization to target DNA, often due to germline or somatic deletions, and manifest as incorrectly reported intermediate methylation. The artifacts often survive existing preprocessing pipelines, masquerade as epigenetic alterations and can confound discoveries in epigenome-wide association studies and studies of methylation-quantitative trait loci. We implement a solution, P-value with out-of-band (OOB) array hybridization (pOOBAH), in the R package SeSAMe. Our method effectively masks deleted and hyperpolymorphic regions, reducing or eliminating spurious reports of epigenetic silencing at oft-deleted tumor suppressor genes such as CDKN2A and RB1 in cases with somatic deletions. Furthermore, our method substantially decreases technical variation whilst retaining biological variation, both within and across HM450 and EPIC platform measurements. SeSAMe provides a light-weight, modular DNA methylation data analysis suite, with a performant implementation suitable for efficient analysis of thousands of samples.
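The idea of a detection p-value computed against out-of-band (OOB) background signal, followed by masking of probes that do not stand out from that background, can be sketched roughly as below. This is a simplified illustration and not the SeSAMe implementation or its R API: the empirical-distribution form of the p-value, the 0.05 threshold, the function names and the intensities are all assumptions made for the example.

```python
from bisect import bisect_left


def oob_detection_p_values(signals, oob_intensities):
    """p-value per probe: fraction of OOB background intensities >= the probe signal."""
    background = sorted(oob_intensities)
    n = len(background)
    return [(n - bisect_left(background, s)) / n for s in signals]


def mask_probes(betas, p_values, threshold=0.05):
    """Replace beta values of probes that fail detection with None."""
    return [b if p <= threshold else None for b, p in zip(betas, p_values)]


# Hypothetical background of 100 OOB intensities and three probes.
oob = list(range(100, 200))
signals = [5000, 150, 120]   # total signal per probe
betas = [0.9, 0.5, 0.45]     # reported methylation fractions
p_values = oob_detection_p_values(signals, oob)
masked = mask_probes(betas, p_values)
print(p_values, masked)
```

The two low-signal probes report intermediate beta values, the kind of artifact the abstract describes for deleted regions, and both are masked because their signal is indistinguishable from background.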

Language: English

Citations

412

The Need for Multi-Omics Biomarker Signatures in Precision Medicine DOI Open Access
Michael Olivier, Reto Asmis, Gregory A. Hawkins

et al.

International Journal of Molecular Sciences, Journal Year: 2019, Volume and Issue: 20(19), P. 4781 - 4781

Published: Sept. 26, 2019

Recent advances in omics technologies have led to unprecedented efforts characterizing the molecular changes that underlie the development and progression of a wide array of complex human diseases, including cancer. As a result, multi-omics analyses, which take advantage of these technologies in genomics, transcriptomics, epigenomics, proteomics, metabolomics, and other omics areas, have been proposed and heralded as the key to advancing precision medicine in the clinic. In the field of precision oncology, genomics approaches and, more recently, other omics analyses have helped reveal several key mechanisms in cancer development, treatment resistance, and recurrence risk, and several of these findings have been implemented in clinical oncology to help guide treatment decisions. However, truly integrated multi-omics analyses have not been applied widely, preventing further advances in precision medicine. Additional efforts are needed to develop the analytical infrastructure necessary to generate, analyze, and annotate multi-omics data effectively to inform precision medicine-based decision-making.

Language: English

Citations

398

Integration strategies of multi-omics data for machine learning analysis DOI Creative Commons
Milan Picard, Marie‐Pier Scott‐Boyer, Antoine Bodein

et al.

Computational and Structural Biotechnology Journal, Journal Year: 2021, Volume and Issue: 19, P. 3735 - 3746

Published: Jan. 1, 2021

Increased availability of high-throughput technologies has generated an ever-growing number of omics data that seek to portray many different but complementary biological layers including genomics, epigenomics, transcriptomics, proteomics, and metabolomics. New insights from these data have been obtained by machine learning algorithms that have produced diagnostic and classification biomarkers. Most biomarkers obtained to date, however, only include one omic measurement at a time and thus do not take full advantage of recent multi-omics experiments that now capture the entire complexity of biological systems. Multi-omics data integration strategies are needed to combine the complementary knowledge brought by each omics layer. We have summarized the most recent data integration methods/frameworks into five different integration strategies: early, mixed, intermediate, late and hierarchical. In this mini-review, we focus on challenges and existing multi-omics integration strategies, paying special attention to machine learning applications.
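Two of the five strategies named above are easy to contrast in a few lines. The sketch below assumes the usual reading of the terms, which the abstract itself does not spell out: early integration concatenates the features of all omics layers before any model is fitted, and late integration combines the outputs of one model per layer. The function names, data and majority-vote rule are hypothetical; the mixed, intermediate and hierarchical strategies are not shown.

```python
from collections import Counter


def early_integration(omics_blocks):
    """Concatenate each sample's feature vectors from every omics block."""
    n_samples = len(omics_blocks[0])
    return [
        [value for block in omics_blocks for value in block[i]]
        for i in range(n_samples)
    ]


def late_integration(per_omics_predictions):
    """Majority vote over the labels predicted by one model per omics layer."""
    n_samples = len(per_omics_predictions[0])
    return [
        Counter(pred[i] for pred in per_omics_predictions).most_common(1)[0][0]
        for i in range(n_samples)
    ]


# Hypothetical data: two samples measured in two omics layers.
transcriptomics = [[1.0, 2.0], [3.0, 4.0]]
proteomics = [[0.5], [0.7]]
combined = early_integration([transcriptomics, proteomics])

# Hypothetical labels predicted by three separate per-layer models.
votes = [["case", "control"], ["case", "case"], ["control", "control"]]
consensus = late_integration(votes)
print(combined, consensus)
```

An odd number of per-layer models is used here so that the vote cannot tie; a real pipeline would need an explicit tie-breaking or weighting rule.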

Language: English

Citations

368

Harnessing multimodal data integration to advance precision oncology DOI
Kevin M. Boehm, Pegah Khosravi, R. Vanguri

et al.

Nature reviews. Cancer, Journal Year: 2021, Volume and Issue: 22(2), P. 114 - 126

Published: Oct. 18, 2021

Language: English

Citations

332