DANCE: a deep learning library and benchmark platform for single-cell analysis DOI Creative Commons
Jiayuan Ding, Renming Liu, Hongzhi Wen

et al.

Genome Biology, Journal Year: 2024, Volume and Issue: 25(1)

Published: March 19, 2024

DANCE is the first standard, generic, and extensible benchmark platform for accessing and evaluating computational methods across the spectrum of benchmark datasets for numerous single-cell analysis tasks. Currently, DANCE supports 3 modules and 8 popular tasks with 32 state-of-the-art methods on 21 benchmark datasets. People can easily reproduce the results of supported algorithms across major benchmark datasets via minimal efforts, such as using only one command line. In addition, DANCE provides an ecosystem of deep learning architectures and tools for researchers to facilitate their own model development. DANCE is an open-source Python package that welcomes all kinds of contributions.

Language: English

On the Opportunities and Risks of Foundation Models DOI Creative Commons
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli

et al.

arXiv (Cornell University), Journal Year: 2021, Volume and Issue: unknown

Published: Jan. 1, 2021

AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

Language: English

Citations

1553

Methods and applications for single-cell and spatial multi-omics DOI Open Access
Katy Vandereyken, Alejandro Sifrim, Bernard Thienpont

et al.

Nature Reviews Genetics, Journal Year: 2023, Volume and Issue: 24(8), P. 494 - 515

Published: March 2, 2023

Language: English

Citations

606

Current progress and open challenges for applying deep learning across the biosciences DOI Creative Commons
Nicolae Sapoval, Amirali Aghazadeh, Michael Nute

et al.

Nature Communications, Journal Year: 2022, Volume and Issue: 13(1)

Published: April 1, 2022

Deep Learning (DL) has recently enabled unprecedented advances in one of the grand challenges in computational biology: the half-century-old problem of protein structure prediction. In this paper we discuss recent advances, limitations, and future perspectives of DL on five broad areas: protein structure prediction, protein function prediction, genome engineering, systems biology and data integration, and phylogenetic inference. We discuss each application area and cover the main bottlenecks of DL approaches, such as training data, problem scope, and the ability to leverage existing DL architectures in new contexts. To conclude, we provide a summary of the subject-specific and general challenges for DL across the biosciences.

Language: English

Citations

218

scJoint integrates atlas-scale single-cell RNA-seq and ATAC-seq data with transfer learning DOI
Yingxin Lin, Tung-Yu Wu, Sheng Wan

et al.

Nature Biotechnology, Journal Year: 2022, Volume and Issue: 40(5), P. 703 - 710

Published: Jan. 20, 2022

Language: English

Citations

110

Cobolt: integrative analysis of multimodal single-cell sequencing data DOI Creative Commons
Boying Gong, Yun Zhou, Elizabeth Purdom

et al.

Genome Biology, Journal Year: 2021, Volume and Issue: 22(1)

Published: Dec. 28, 2021

A growing number of single-cell sequencing platforms enable joint profiling of multiple omics from the same cells. We present Cobolt, a novel method that not only allows for analyzing the data from joint-modality platforms, but provides a coherent framework for the integration of multiple datasets measured on different modalities. We demonstrate its performance on multi-modality data of gene expression and chromatin accessibility and illustrate the integration abilities of Cobolt by jointly analyzing this multi-modality data with single-cell RNA-seq and ATAC-seq datasets.

Language: English

Citations

104

scBasset: sequence-based modeling of single-cell ATAC-seq using convolutional neural networks DOI
Han Yuan, David R. Kelley

Nature Methods, Journal Year: 2022, Volume and Issue: 19(9), P. 1088 - 1096

Published: Aug. 8, 2022

Language: English

Citations

79

Single cell cancer epigenetics DOI Creative Commons
Marta Casado-Peláez, Alberto Bueno-Costa, Manel Esteller

et al.

Trends in Cancer, Journal Year: 2022, Volume and Issue: 8(10), P. 820 - 838

Published: July 9, 2022

The epigenome encompasses several mechanisms controlling gene expression that can be aberrantly regulated during cancer development and progression. Tumors are highly complex and heterogeneous biological systems that require the study of epigenetic alterations at a single cell resolution. Several single cell technologies developed to study the different layers of the epigenome, such as chromatin accessibility or histone modifications, have been applied in cancer research over the past few years, improving our understanding of the epigenetic mechanisms driving tumorigenesis. Although these single cell techniques are promising, most are still nascent and present limitations, such as low throughput and limited coverage. In addition, the analysis and integration of the various epigenomic data modalities present challenges that require new computational tools.

Bulk sequencing methodologies have allowed us to make great progress in cancer research. Unfortunately, these techniques lack the resolution to fully unravel the epigenetic mechanisms that govern tumor heterogeneity. Consequently, many novel single cell-sequencing methodologies have been developed over the past decade, allowing us to explore the epigenetic components that regulate different aspects of cancer heterogeneity, namely: clonal heterogeneity, tumor microenvironment (TME), spatial organization, intratumoral differentiation programs, metastasis, and resistance mechanisms. In this review, we explore the different sequencing techniques that enable researchers to study different aspects of epigenetics (DNA methylation, chromatin accessibility, histone modifications, DNA–protein interactions, and 3D chromatin architecture) at the single cell level, their potential applications in cancer, and their current technical limitations.

importance both basic clinical is indisputable. field important implications for disease. Indeed, non-mutational reprogramming was recently designated mechanistic determinant enables acquisition hallmark capabilities [1.Hanahan D. Hallmarks cancer: dimensions.Cancer Discov. 2022; 12: 31-46Crossref PubMed Scopus (326) Google Scholar]. Although it well established cells may arise from genetic mutations drive carcinogenesis, types strong drivers could explain malignant processes, progression [2.Turajilic S. et al.Resolving heterogeneity cancer.Nat. Rev. Genet. 2019; 20: 404-416Crossref (228) Scholar], therapy [3.Shaffer S.M. al.Rare variability drug-induced mode drug resistance.Nature. 
2017; 546: 431-435Crossref (508) metastasis [4.Chen J.F. Yan Q. roles metastasis.Biochem. J. 2021; 478: 3373-3393Crossref (2) suggesting non-genetic determinants crucial role [5.Nam A.S. al.Integrating evolution by single-cell multi-omics.Nat. 22: 3-18Crossref (83) Thus, affecting non-malignant act critical evolution. These mechanisms, which genes without altering DNA sequence, fall into five main categories: (i) methylation; (ii) accessibility; (iii) modifications; (iv) interactions; (v) tridimensional architecture [6.Allis C.D. Jenuwein T. molecular hallmarks control.Nat. 2016; 17: 487-500Crossref Scholar,7.Darwiche N. Epigenetic an intimate affair.Am. Cancer Res. 2020; 10: 1954-1978PubMed Each type mechanism experimentally studied using bulk (Box 1). due cellular tumor, valuable information lost when techniques, since all possible retrieved point view masked averaging. Nonetheless, with emergence Scholar,8.Yalan L. al.Applications research: perspectives.J. Hematol. Oncol. 14: 91Crossref (15) tumoral were otherwise impossible assess now open exploration.Box 1Bulk analyze mechanismsVarious used understand mechanisms: taking advantage bisulfite chemistry, analyzed whole-genome (WGBS), reduced representation (RRBS), 450k/850k Illumina methylation arrays [128.Ortiz-Barahona V. al.Use profiling translational oncology.Semin. Biol. 83: 523-535Crossref (7) Scholar]; mainly assay transposase-accessible (ATAC-seq) [129.Marinov G.K. Shipony Z. Interrogating accessible landscape eukaryote genomes ATAC-seq.Methods Mol. 2243: 183-226Crossref (0) modifications interactions immunoprecipitation (ChIP-seq) [130.Nakato R. Sakata Methods ChIP-seq analysis: practical workflow advanced applications.Methods. 187: 44-53Crossref (6) explored multiple chromosome conformation capture technology, 3C, 4C, 5C, Hi-C, promoter-capture ChIA-PET [131.Sati Cavalli G. Chromosome impact genome function.Chromosoma. 126: 33-44Crossref (97) Scholar,132.Javierre B.M. 
al.Lineage-specific links enhancers non-coding disease variants target promoters.Cell. 167: 1369-1384Abstract Full Text PDF (486) One common drawback among need considerable sample size, demanding thousands millions minimal input. considered 'bulk methodologies', obtain average value whole-cell [133.Carter B. Zhao K. basis heterogeneity.Nat. 235-250Crossref (3) Various deconvolution strategies data, but substantial risk retrieving artifacts losing difficult-to-detect minor subclones [134.Chakravarthy A. al.Pan-cancer tumour composition methylation.Nat. Commun. 2018; 9: 3220Crossref (114) Nevertheless, indispensable tools its relationship cancer. For example, they methylation-based classification diffuse gliomas (LGm1-LGm6) [135.Ceccarelli M. al.Molecular reveals biologically discrete subsets pathways glioma.Cell. 164: 550-563Abstract cancers unknown primary [123.Moran al.Epigenetic classify primary: multicentre, retrospective analysis.Lancet 1386-1395Abstract (251) modification-based tracking states [136.Völker-Albert al.Histone stem implications.Stem Cell Rep. 15: 1196-1205Abstract A entity comprising cells, each has [9.Dagogo-Jack I. Shaw A.T. Tumour therapies.Nat. Clin. 81-94Crossref (1187) help properly dissect dependencies complexity. There six biology related key (Figure 1): heterogeneity; TME; organization intercellular crosstalk; developmental programs (phenotypic plasticity); metastasis; (vi) appearance therapy. necessary develop allow cues undetectable methodologies. catalog facilitate characteristics level. We technology based on under architecture), focusing first mono-omic (techniques only one cell) then multi-omic (which simultaneously cell). summarize currently available also highlight recent discoveries insights gained technologies, how contribute solve (mostly derived heterogeneity), translational/clinical scenarios. comprise subclones, unique properties [10.McGranahan Swanton C. Clonal evolution: past, present, future.Cell. 
168: 613-628Abstract (1220) Scholar,11.Oakes C.C. al.Evolution linked aberrations chronic lymphocytic leukemia.Cancer 2014; 4: 348-361Crossref (113) As population evolves, accumulate clones harbor novel, selective advantages (e.g., enhanced proliferation, therapy, invasiveness, etc.) detection outcome. Single able detect (especially minor, difficult-to-detect, subclones), thus revealing prognostic information. does not solely harbors myriad distinct T content directly associated types, cytotoxic (Tc) helper (Th1, Th2, Th17) correlating good prognosis [12.Tay R.E. al.Revisiting CD4 + immunotherapy-new old paradigms.Cancer Gene Ther. 28: 5-17Crossref Tumor-associated macrophages progression, depending M1/M2 state [13.Baghban al.Tumor complexity therapeutic glance.Cell Signal. 18: 59Crossref (321) Additionally, natural killer (NK) B endothelial fibroblasts, other participate interactome [14.Anderson N.M. Simon M.C. microenvironment.Curr. 30: R921-R925Abstract (186) microenvironmental profoundly modulate nontumoral generating crosstalk determines [15.Marks D.L. control microenvironment.Epigenomics. 8: 1671-1687Crossref (32) studying signals level mandatory decipher interactome. Malignant randomly distributed inside instead occupy specific positions space, cell–cell [16.Noble al.Spatial structure governs evolution.Nat. Ecol. Evol. 6: 207-217Crossref Knowing distribution assessing 'heat' certain melanoma), relative quantity position [17.Trujillo J.A. al.T cell-inflamed versus non-T tumors: conceptual framework immunotherapy combination selection.Cancer Immunol. 990-1000Crossref (202) dependent colorectal (CRC) patients locoregional relapse-free overall survival [18.Martínez-Cardús homogeneity within tumors predicts shorter times cancer.Gastroenterology. 151: 961-972Abstract Microscopy immunohistochemistry. enabled advances aspect. specificity unveil cell. (including cutting-edge epigenomics) will infer correlate changes value. 
hypothesis growth depends, least part, asymmetrical divisions differentiate committed [19.Lim J.R. al.Cancer targets.Med. 38: 76Crossref background, follow program glioblastoma, there four program, some higher stemness (neural progenitor-like oligodendrocyte cells), others more differentiated (astrocyte-like mesenchymal-like cells) [20.Neftel al.An integrative model states, plasticity, genetics glioblastoma.Cell. 178: 835-849Abstract (598) identity maintained memory methylation) ensure full commitment transcriptional [21.Lee H.J. Reprogramming methylome: erasing creating diversity.Cell Stem Cell. 710-719Abstract (223) detecting machinery provide trajectories, predicting deciding treatment should applied. Some acquire ability leave site colonize distant tissues, cause cancer-related deaths [22.Fares principles metastasis: revisited.Signal Transduct. Target. 5: 28Crossref (390) From transformation until settlement tissue, metastatic experiences drastic changes, acquiring motility (epithelial–to-mesenchymal transition), avoiding immune surveillance, adapting secondary No driver yet identified, dynamic involved steps Scholar,23.Patel S.A. Vanharanta metastasis.Mol. 11: 79-96Crossref (37) useful confidently sites those metastatic-prone background. Certain abundance epimutations render them resistant treatments affect abundant [24.Wang X. al.Drug combating cancer.Cancer Drug Resist. 2: 141-160PubMed likely become predominant ones after line treatment, representing relapse. Alterations strongly antitumoral [25.Hayashi Konishi Correlation anti-tumour regulation.Br. Cancer. 124: 681-682Crossref (4) bortezomib myeloma, enter slow-cycling, drug-tolerant reversible state, consequence plasticity rather than determinants. 
Another case non-genetically determined H3 lysine 4 demethylases, KDM5, transcriptomic breast leading decreased sensitivity antiestrogens [26.Hinohara al.KDM5 demethylase activity resistance.Cancer 34: 939-953Abstract (96) taxane-resistant triple-negative (TNBC), global hypomethylation relocation H3K27 trimethylation paclitaxel, vulnerability inhibitors [27.Deblois switch-induced viral mimicry evasion chemotherapy-resistant 1312-1329Crossref well-documented cases [28.Marine J.-C. al.Non-genetic 743-756Crossref early diagnosis, would significantly clinicians select best combinations. residual fundamental, because constitutes biomarker predict relapse [29.Schuurhuis G.J. al.Minimal/measurable AML: consensus document European Leukemia Net MRD Working Party.Blood. 131: 1275-1291Crossref (537) encompass breakthrough methodology revolutionized way characterized looking time. approaches essential examine underlying levels. With advent RNA-sequencing (scRNA-seq), transcriptome exploited technologies. It accelerated biology, enabling characterization unprecedented Scholar,30.van Galen P. al.Single-cell RNA-seq AML hierarchies relevant immunity.Cell. 176: 1265-1281Abstract (288) Scholar, 31.Costa al.Fibroblast immunosuppressive environment human 33: 463-479Abstract (602) 32.Aoki disease-defining t-cell classic hodgkin lymphoma.Cancer 406-421Crossref (82) 33.Campillo-Marcos analyses hematopoiesis hematological malignancies.Exp. 98: 1-13Abstract emerging DNA-sequencing profile, amplicon-based targeted manner, recurrently mutated genes, providing genotype every nucleotide (SNPs) copy number (CNVs) [34.Miles L.A. mutation myeloid malignancies.Nature. 587: 477-482Crossref (117) diversity often independent highlighting developing [35.Chaligne encoding, heritability glioma states.Nat. 53: 1469-1479Crossref (20) aimed evolved rapidly compared transcriptome, being regulation. 5-Methylcytosine (5mC) well-known modification. 
mammals, mostly occurs cytosines followed guanine, forming 5′-to-3′ CpG pair. Approximately 70% promoters enriched clustered pairs, 'CpG islands' prone 5mC [36.Deaton A.M. Bird islands regulation transcription.Genes Dev. 2011; 25: 1010-1022Crossref (1966) regions, acts repressive switch, restricting expression. found genomic bodies regulatory regions (enhancers CTCF sites), regulating function cis. deregulated. Promoter hyper/hypomethylation suppressors/oncogenes well-established [37.Berdasco Esteller Clinical epigenetics: seizing opportunities translation.Nat. 109-127Crossref (227) deregulation enhancer fostering [38.Bell al.Enhancer dynamics patient mortality.Genome 26: 601-611Crossref (73) helped revolutionize area. Most conversion unmethylated uracil DNA. This allows methylated array-based methods al.Epi

Language: English

Citations

77

Artificial intelligence and machine learning approaches using gene expression and variant data for personalized medicine DOI

Sreya Vadapalli, Habiba Abdelhalim, Saman Zeeshan

et al.

Briefings in Bioinformatics, Journal Year: 2022, Volume and Issue: 23(5)

Published: April 27, 2022

Precision medicine uses genetic, environmental and lifestyle factors to more accurately diagnose and treat disease in specific groups of patients, and it is considered one of the most promising medical efforts of our time. The use of genetics is arguably one of the most data-rich and complex components of precision medicine. The grand challenge today is the successful assimilation of genetics into precision medicine that translates across different ancestries, diverse diseases and other distinct populations, which will require clever use of artificial intelligence (AI) and machine learning (ML) methods. Our goal here was to review and compare scientific objectives, methodologies, datasets, data sources, ethics and gaps of AI/ML approaches used in genomics and precision medicine. We selected high-quality literature published within the last 5 years that was indexed and available through PubMed Central. Our scope was narrowed to articles that reported application of AI/ML algorithms for statistical and predictive analyses using whole genome and/or whole exome sequencing for gene variants, and RNA-seq and microarrays for gene expression. We did not limit our search to specific diseases or data sources. Based on the scope of our review and comparative analysis criteria, we identified 32 different AI/ML approaches applied in variable genomics studies and report widely adapted AI/ML algorithms for predictive diagnostics across several diseases.

Language: English

Citations

73

Deep learning applications in single-cell genomics and transcriptomics data analysis DOI Creative Commons
Nafiseh Erfanian, A. Ali Heydari, Adib Miraki Feriz

et al.

Biomedicine & Pharmacotherapy, Journal Year: 2023, Volume and Issue: 165, P. 115077 - 115077

Published: July 1, 2023

Traditional bulk sequencing methods are limited to measuring the average signal in a group of cells, potentially masking heterogeneity and rare populations. The single-cell resolution, however, enhances our understanding of complex biological systems and diseases, such as cancer, the immune system, and chronic diseases. However, single-cell technologies generate massive amounts of data that are often high-dimensional, sparse, and complex, thus making analysis with traditional computational approaches difficult and unfeasible. To tackle these challenges, many are turning to deep learning (DL) methods as potential alternatives to the conventional machine learning (ML) algorithms for single-cell studies. DL is a branch of ML capable of extracting high-level features from raw inputs in multiple stages. Compared to traditional ML, DL models have provided significant improvements across many domains and applications. In this work, we examine DL applications in genomics, transcriptomics, spatial transcriptomics, and multi-omics integration, and address whether DL techniques will prove to be advantageous or if the single-cell omics domain poses unique challenges. Through a systematic literature review, we have found that DL has not yet revolutionized the most pressing challenges of the single-cell omics field. However, using DL models for single-cell omics has shown promising results (in many cases outperforming the previous state-of-the-art models) in data preprocessing and downstream analysis. Although developments of DL algorithms for single-cell omics have generally been gradual, recent advances reveal that DL can offer valuable resources in fast-tracking and advancing research in single-cell.

Language: English

Citations

58

The future of rapid and automated single-cell data analysis using reference mapping DOI Creative Commons
Mohammad Lotfollahi, Yuhan Hao, Fabian J. Theis

et al.

Cell, Journal Year: 2024, Volume and Issue: 187(10), P. 2343 - 2358

Published: May 1, 2024

As the number of single-cell datasets continues to grow rapidly, workflows that map new data to well-curated reference atlases offer enormous promise for the biological community. In this perspective, we discuss key computational challenges and opportunities for single-cell reference-mapping algorithms. We discuss how mapping algorithms will enable the integration of diverse datasets across disease states, molecular modalities, genetic perturbations, and diverse species and will eventually replace manual and laborious unsupervised clustering pipelines.

Language: English

Citations

25