Evaluating the Utilities of Foundation Models in Single-cell Data Analysis DOI Open Access
Tianyu Liu, Kexing Li, Yuge Wang et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume(Issue): unknown

Published: Sep. 8, 2023

Foundation Models (FMs) have made significant strides in both industrial and scientific domains. In this paper, we evaluate the performance of FMs for single-cell sequencing data analysis through comprehensive experiments across eight downstream tasks pertinent to single-cell data. Overall, the top FMs include scGPT, Geneformer, and CellPLM, considering model performance and user accessibility among ten single-cell FMs. However, by comparing these FMs with task-specific methods, we found that single-cell FMs may not consistently outperform task-specific methods in all tasks, which challenges the necessity of developing foundation models for single-cell analysis. In addition, we evaluated the effects of hyper-parameters, initial settings, and training stability based on a proposed ...

Language: English

Multimodal deep learning approaches for single-cell multi-omics data integration DOI Creative Commons
Tasbiraha Athaya, Rony Chowdhury Ripan, Xiaoman Li et al.

Briefings in Bioinformatics, Year: 2023, Volume(Issue): 24(5)

Published: Aug. 14, 2023

Integrating single-cell multi-omics data is a challenging task that has led to new insights into complex cellular systems. Various computational methods have been proposed to effectively integrate these rapidly accumulating datasets, including deep learning. However, despite the proven success of deep learning in integrating multi-omics data and its better performance over classical computational methods, there has been no systematic study of its application to single-cell multi-omics data integration. To fill this gap, we conducted a literature review to explore the use of multimodal deep learning techniques in single-cell multi-omics data integration, taking into account recent studies from multiple perspectives. Specifically, we first summarized the different modalities found in single-cell multi-omics data. We then reviewed current deep learning techniques for processing multimodal data and categorized deep learning-based integration methods according to data modality, deep learning architecture, fusion strategy, key tasks, and downstream analysis. Finally, we provided insights into using these deep learning models to understand biological mechanisms.

Language: English

Cited by: 36

CIForm as a Transformer-based model for cell-type annotation of large-scale single-cell RNA-seq data DOI

Jing Xu, Aidi Zhang, Fang Liu et al.

Briefings in Bioinformatics, Year: 2023, Volume(Issue): 24(4)

Published: May 17, 2023

Single-cell omics technologies have made it possible to analyze the individual cells within a biological sample, providing a more detailed understanding of biological systems. Accurately determining the cell type of each cell is a crucial goal in single-cell RNA-seq (scRNA-seq) analysis. Apart from overcoming the batch effects arising from various factors, cell-type annotation methods also face the challenge of effectively processing large-scale datasets. With the availability of an increasing number of scRNA-seq datasets, integrating multiple datasets and addressing the batch effects originating from diverse sources are also challenges in cell-type annotation. In this work, to overcome these challenges, we developed a supervised method called CIForm, based on the Transformer, for cell-type annotation of large-scale scRNA-seq data. To assess the effectiveness and robustness of CIForm, we compared it with some leading tools on benchmark datasets. Through systematic comparisons under various cell-type annotation scenarios, we show that the effectiveness of CIForm is particularly pronounced. The source code and data are available at https://github.com/zhanglab-wbgcas/CIForm.

Language: English

Cited by: 33

Cellcano: supervised cell type identification for single cell ATAC-seq data DOI Creative Commons
Wenjing Ma, Jiaying Lu, Hao Wu et al.

Nature Communications, Year: 2023, Volume(Issue): 14(1)

Published: April 3, 2023

Computational cell type identification is a fundamental step in single-cell omics data analysis. Supervised celltyping methods have gained increasing popularity in single-cell RNA-seq data because of their superior performance and the availability of high-quality reference datasets. Recent technological advances in profiling chromatin accessibility at single-cell resolution (scATAC-seq) have brought new insights to the understanding of epigenetic heterogeneity. With the continuous accumulation of scATAC-seq datasets, a supervised celltyping method specifically designed for scATAC-seq is in urgent need. Here we develop Cellcano, a computational method based on a two-round supervised learning algorithm to identify cell types from scATAC-seq data. The method alleviates the distributional shift between reference and target data and improves the prediction performance. After systematically benchmarking Cellcano on 50 well-designed celltyping tasks from various datasets, we show that Cellcano is accurate, robust, and computationally efficient. Cellcano is well-documented and freely available at https://marvinquiet.github.io/Cellcano/.

Language: English

Cited by: 27

Decoding Aging Hallmarks at the Single-Cell Level DOI Creative Commons
Shuai Ma, Chi Xu, Yusheng Cai et al.

Annual Review of Biomedical Data Science, Year: 2023, Volume(Issue): 6(1), pp. 129-152

Published: April 26, 2023

Organismal aging exhibits wide-ranging hallmarks in divergent cell types across tissues, organs, and systems. The advancement of single-cell technologies and the generation of rich datasets have afforded the scientific community the opportunity to decode these hallmarks of aging at an unprecedented scope and resolution. In this review, we describe the technological advancements and bioinformatic methodologies enabling data interpretation at the cellular level. Then, we outline the application of such technologies for decoding aging hallmarks and potential intervention targets, and summarize common themes and context-specific molecular features in representative organ systems across the body. Finally, we provide a brief summary of available databases relevant for aging research and present an outlook on the opportunities in this emerging field.

Language: English

Cited by: 25

Semi-supervised integration of single-cell transcriptomics data DOI Creative Commons
Massimo Andreatta, Léonard Hérault, Paul Gueguen et al.

Nature Communications, Year: 2024, Volume(Issue): 15(1)

Published: Jan. 29, 2024

Batch effects in single-cell RNA-seq data pose a significant challenge for comparative analyses across samples, individuals, and conditions. Although batch effect correction methods are routinely applied, data integration often leads to overcorrection and can result in the loss of biological variability. In this work we present STACAS, a batch correction method for scRNA-seq that leverages prior knowledge on cell types to preserve biological variability upon integration. Through an open-source benchmark, we show that semi-supervised STACAS outperforms state-of-the-art unsupervised methods, as well as supervised methods such as scANVI and scGen. STACAS scales well to large datasets and is robust to incomplete and imprecise input cell type labels, which are commonly encountered in real-life integration tasks. We argue that the incorporation of prior cell type information should be a common practice in single-cell data integration, and we provide a flexible framework for semi-supervised batch effect correction.

Language: English

Cited by: 17

scCross: a deep generative model for unifying single-cell multi-omics with seamless integration, cross-modal generation, and in silico exploration DOI Creative Commons

Xiuhui Yang, Koren K. Mann, Hao Wu et al.

Genome Biology, Year: 2024, Volume(Issue): 25(1)

Published: July 29, 2024

Single-cell multi-omics data reveal complex cellular states, providing significant insights into cellular dynamics and disease. Yet, integration of multi-omics data presents challenges. Some modalities have not reached the robustness or clarity of established transcriptomics. Coupled with data scarcity for less established modalities and integration intricacies, these challenges limit our ability to maximize single-cell omics benefits. We introduce scCross, a tool leveraging variational autoencoders, generative adversarial networks, and the mutual nearest neighbors (MNN) technique for modality alignment. By enabling cross-modal data generation, multi-omics data simulation, and in silico cellular perturbations, scCross enhances the utility of single-cell multi-omics studies.

Language: English

Cited by: 16

Single-cell omics: experimental workflow, data analyses and applications DOI
Fengying Sun, Haoyan Li, Dongqing Sun et al.

Science China Life Sciences, Year: 2024, Volume(Issue): unknown

Published: July 23, 2024

Language: English

Cited by: 13

Benchmarking multi-omics integration algorithms across single-cell RNA and ATAC data DOI Creative Commons

Chuxi Xiao, Yixin Chen, Qiuchen Meng et al.

Briefings in Bioinformatics, Year: 2024, Volume(Issue): 25(2)

Published: Jan. 22, 2024

Recent advancements in single-cell sequencing technologies have generated extensive omics data in various modalities and revolutionized cell research, especially for single-cell RNA and ATAC data. The joint analysis across scRNA-seq data and scATAC-seq data has paved the way to comprehending cellular heterogeneity and complex cellular regulatory networks. Multi-omics integration is gaining attention as an important step in joint analysis, and the number of computational tools in this field is growing rapidly. In this paper, we benchmarked 12 multi-omics integration methods on three integration tasks via qualitative visualization and quantitative metrics, considering six main aspects that matter in multi-omics data analysis. Overall, we found that different methods have their own advantages in different aspects, while some methods outperformed the others in most aspects. We therefore provided guidelines for selecting appropriate methods for specific scenarios and tasks to help obtain meaningful insights from multi-omics data integration.

Language: English

Cited by: 11

scmFormer Integrates Large‐Scale Single‐Cell Proteomics and Transcriptomics Data by Multi‐Task Transformer DOI Creative Commons
Jing Xu, De‐Shuang Huang, Xiujun Zhang et al.

Advanced Science, Year: 2024, Volume(Issue): 11(19)

Published: March 14, 2024

Transformer‐based models have revolutionized single cell RNA‐seq (scRNA‐seq) data analysis. However, their applicability is challenged by the complexity and scale of single‐cell multi‐omics data. Here a novel single‐cell multi‐modal/multi‐task transformer (scmFormer) is proposed to fill the existing gap in integrating single‐cell proteomics with other omics data. Through systematic benchmarking, it is demonstrated that scmFormer excels in integrating large‐scale single‐cell multimodal data and heterogeneous multi‐batch paired multi‐omics data, while preserving shared information across batches and distinct biological information. scmFormer achieves a 54.5% higher average F1 score compared to the second‐best method in transferring cell‐type labels from single‐cell transcriptomics to proteomics data. Using COVID‐19 datasets, it is shown that scmFormer successfully integrates over 1.48 million cells on a personal computer. Moreover, it is also shown that scmFormer performs better than existing methods in generating the unmeasured modality and is well‐suited for spatial multi‐omic data. Thus, scmFormer is a powerful and comprehensive tool for analyzing single‐cell multi‐omics data.

Language: English

Cited by: 10

Understanding glioblastoma at the single-cell level: Recent advances and future challenges DOI Creative Commons
Yahaya A Yabo, Dieter Henrik Heiland

PLoS Biology, Year: 2024, Volume(Issue): 22(5), p. e3002640

Published: May 30, 2024

Glioblastoma, the most aggressive and prevalent form of primary brain tumor, is characterized by rapid growth, diffuse infiltration, and resistance to therapies. Intrinsic heterogeneity and cellular plasticity contribute to its progression under therapy; therefore, there is a need to fully understand these tumors at a single-cell level. Over the past decade, single-cell transcriptomics has enabled the molecular characterization of individual cells within glioblastomas, providing previously unattainable insights into the genetic and molecular features that drive tumorigenesis, disease progression, and therapy resistance. However, despite advances in single-cell technologies, challenges such as high costs, complex data analysis and interpretation, and difficulties in translating findings into clinical practice persist. As single-cell technologies are developed further, more insights into glioblastomas are expected, which will help guide the development of personalized and effective therapies, thereby improving prognosis and quality of life for patients.

Language: English

Cited by: 10