Universal Cell Embeddings: A Foundation Model for Cell Biology DOI Creative Commons
Yanay Rosen, Yusuf Roohani, Ayush Agrawal

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Issue: unknown

Published: Nov. 29, 2023

Developing a universal representation of cells which encompasses the tremendous molecular diversity of cell types within the human body and, more generally, across species would be transformative for cell biology. Recent work using single-cell transcriptomic approaches to create molecular definitions of cell types in the form of cell atlases has provided the necessary data for such an endeavor. Here, we present the Universal Cell Embedding (UCE) foundation model. UCE was trained on a corpus of cell atlas data from human and other species in a completely self-supervised way, without any data annotations. UCE offers a unified biological latent space that can represent any cell, regardless of tissue or species. This universal cell embedding captures important biological variation despite the presence of experimental noise across diverse datasets. An important aspect of UCE's universality is that any new cell from any organism can be mapped to this embedding space with no additional data labeling, model training or fine-tuning. We applied UCE to create the Integrated Mega-scale Atlas, embedding 36 million cells, with more than 1,000 uniquely named cell types, from hundreds of experiments, dozens of tissues and eight species. We uncovered new insights about the organization of cell types and tissues within this universal cell embedding space, and leveraged it to infer the function of newly discovered cell types. UCE's embedding space exhibits emergent behavior, uncovering new biology that it was never explicitly trained for, such as identifying developmental lineages and embedding data from novel species not included in the training set. Overall, by enabling a universal representation for every cell state and type, UCE provides a valuable tool for analysis, annotation and hypothesis generation as the scale and diversity of single-cell datasets continues to grow.

Language: English

MUON: multimodal omics analysis framework DOI Creative Commons
Danila Bredikhin, Ilia Kats, Oliver Stegle

et al.

Genome Biology, Year: 2022, Issue: 23(1)

Published: Feb. 1, 2022

Advances in multi-omics have led to an explosion of multimodal datasets to address questions from basic biology to translation. While these data provide novel opportunities for discovery, they also pose management and analysis challenges, thus motivating the development of tailored computational solutions. Here, we present a data standard and an analysis framework for multi-omics, MUON, designed to organise, analyse, visualise, and exchange multimodal data. MUON stores multimodal data in an efficient yet flexible and interoperable data structure. MUON enables a versatile range of analyses, from data preprocessing to flexible multi-omics alignment.

Language: English

Cited by

103

Liver zonation, revisited DOI Creative Commons
J Paris, Neil C. Henderson

Hepatology, Year: 2022, Issue: 76(4), pp. 1219-1230

Published: Feb. 17, 2022

The concept of hepatocyte functional zonation is well established, with differences in metabolism and xenobiotic processing determined by multiple factors, including oxygen and nutrient levels across the hepatic lobule. However, recent advances in single-cell genomics technologies, including single-cell and single-nuclei RNA sequencing, and the rapidly evolving fields of spatial transcriptomic and proteomic profiling have greatly increased our understanding of liver zonation. Here we discuss how these transformative experimental strategies are being leveraged to dissect liver zonation at unprecedented resolution, and how this new information should facilitate the emergence of novel precision medicine-based therapies for patients with liver disease.

Language: English

Cited by

91

Single-cell biological network inference using a heterogeneous graph transformer DOI Creative Commons
Anjun Ma, Xiaoying Wang, Jingxian Li

et al.

Nature Communications, Year: 2023, Issue: 14(1)

Published: Feb. 21, 2023

Single-cell multi-omics (scMulti-omics) allows the quantification of multiple modalities simultaneously to capture the intricacy of complex molecular mechanisms and cellular heterogeneity. Existing tools cannot effectively infer the active biological networks in diverse cell types and the response of these networks to external stimuli. Here we present DeepMAPS for biological network inference from scMulti-omics. It models scMulti-omics in a heterogeneous graph and learns relations among cells and genes within both local and global contexts in a robust manner using a multi-head graph transformer. Benchmarking results indicate that DeepMAPS performs better than existing tools in cell clustering and biological network construction. It also showcases competitive capability in deriving cell-type-specific biological networks in lung tumor leukocyte CITE-seq data and in matched diffuse small lymphocytic lymphoma scRNA-seq and scATAC-seq data. In addition, we deploy a DeepMAPS webserver equipped with multiple functionalities and visualizations to improve the usability and reproducibility of scMulti-omics data analysis.

Language: English

Cited by

88

Harmonized single-cell landscape, intercellular crosstalk and tumor architecture of glioblastoma DOI Creative Commons
Cristian Ruiz-Moreno, Sergio Marco Salas, Erik Samuelsson

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2022, Issue: unknown

Published: Aug. 27, 2022

Glioblastoma, isocitrate dehydrogenase (IDH)-wildtype (hereafter, GB), is an aggressive brain malignancy associated with a dismal prognosis and poor quality of life. Single-cell RNA sequencing has helped to grasp the complexity of the cell states and dynamic changes in GB. Large-scale data integration can help to uncover unexplored tumor pathobiology. Here, we resolved the composition of the tumor milieu and created a cellular map of GB ('GBmap'), a curated resource that harmonizes 26 datasets gathering 240 patients and spanning over 1.1 million cells. We showcase the applications of our resource for reference mapping, transfer learning, and biological discoveries. Our results uncover the sources of pro-angiogenic signaling and the multifaceted role of mesenchymal-like cancer cells. Reconstructing the tumor architecture using spatially resolved transcriptomics unveiled a high level of well-structured neoplastic niches. The GBmap represents a framework that allows the streamlined integration and interpretation of new data and provides a platform for exploratory analysis, hypothesis generation and testing.

Language: English

Cited by

83

An integrated cell atlas of the human lung in health and disease DOI Creative Commons
Lisa Sikkema, Daniel Strobl, Luke Zappia

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2022, Issue: unknown

Published: March 11, 2022

Organ- and body-scale cell atlases have the potential to transform our understanding of human biology. To capture the variability present in the population, these atlases must include diverse demographics such as age and ethnicity from both healthy and diseased individuals. The growth in both size and number of single-cell datasets, combined with recent advances in computational techniques, for the first time makes it possible to generate such comprehensive large-scale atlases through integration of multiple datasets. Here, we present the integrated Human Lung Cell Atlas (HLCA), combining 46 datasets of the human respiratory system into a single atlas spanning over 2.2 million cells from 444 individuals across health and disease. The HLCA contains a consensus re-annotation of published and newly generated datasets, resolving under- or misannotation of 59% of cells in the original datasets. The HLCA enables recovery of rare cell types, provides consensus marker genes for each cell type, and uncovers gene modules associated with demographic covariates and anatomical location within the respiratory system. To facilitate the use of the HLCA as a reference for single-cell lung research and allow rapid analysis of new data, we provide an interactive web portal to project datasets onto the HLCA. Finally, we demonstrate the value of the HLCA reference for interpreting disease-associated changes. Thus, the HLCA outlines a roadmap for the development and use of organ-scale cell atlases within the Human Cell Atlas.

Language: English

Cited by

80

ISSAAC-seq enables sensitive and flexible multimodal profiling of chromatin accessibility and gene expression in single cells DOI
Wei Xu, Weilong Yang, Yunlong Zhang

et al.

Nature Methods, Year: 2022, Issue: 19(10), pp. 1243-1249

Published: Sep. 15, 2022

Language: English

Cited by

74

scDesign3 generates realistic in silico data for multimodal single-cell and spatial omics DOI
Dongyuan Song, Qingyang Wang, Guanao Yan

et al.

Nature Biotechnology, Year: 2023, Issue: 42(2), pp. 247-252

Published: May 11, 2023

Language: English

Cited by

73

Single-cell genomics meets human genetics DOI
Anna Cuomo, Aparna Nathan, Soumya Raychaudhuri

et al.

Nature Reviews Genetics, Year: 2023, Issue: 24(8), pp. 535-549

Published: April 21, 2023

Language: English

Cited by

72

Computational Approaches and Challenges in Spatial Transcriptomics DOI Creative Commons
Shuangsang Fang, Bichao Chen, Yong Zhang

et al.

Genomics Proteomics & Bioinformatics, Year: 2022, Issue: 21(1), pp. 24-47

Published: Oct. 14, 2022

The development of spatial transcriptomics (ST) technologies has transformed genetic research from a single-cell data level to a two-dimensional spatial coordinate system and facilitated the study of the composition and function of various cell subsets in different environments and organs. The large-scale data generated by these ST technologies, which contain spatial gene expression information, have elicited the need for spatially resolved approaches to meet the requirements of computational and biological data interpretation. These requirements include dealing with the explosive growth of data to determine cell-level and gene-level expression, correcting the inner batch effect and loss of expression to improve data quality, conducting efficient interpretation and in-depth knowledge mining at both the single-cell and tissue-wide levels, and conducting multi-omics integration analysis to provide an extensible framework toward the in-depth understanding of biological processes. However, algorithms designed specifically for ST technologies to meet these requirements are still in their infancy. Here, we review computational approaches to these problems in light of corresponding issues and challenges, and present forward-looking insights into algorithm development.

Language: English

Cited by

71

CZ CELL×GENE Discover: A single-cell data platform for scalable exploration, analysis and modeling of aggregated data DOI Creative Commons
Shibla Abdulla, Brian D. Aevermann, Pedro Assis

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Issue: unknown

Published: Nov. 2, 2023

Hundreds of millions of single cells have been analyzed to date using high-throughput transcriptomic methods, thanks to technological advances driving the increasingly rapid generation of single-cell data. This provides an exciting opportunity for unlocking new insights into health and disease, made possible by meta-analyses that span diverse datasets and build on recent advances in large language models and other machine learning approaches. Despite the promise of these and emerging analytical tools for analyzing large amounts of data, a major challenge remains: the sheer number of datasets, inconsistent formats, and data accessibility. Many datasets are available via unique portals and platforms that often lack interoperability. Here, we present CZ CellxGene Discover (cellxgene.cziscience.com), a data platform that provides curated and interoperable single-cell data. This resource, available via a free-to-use online data portal, hosts a growing corpus of community-contributed data that spans more than 50 million cells. Curated, standardized, and associated with consistent cell-level metadata, this collection of single-cell data is the largest of its kind. A suite of features enables accessibility and reusability of the data via both computational and visual interfaces, allowing researchers to rapidly explore individual datasets and perform cross-corpus analysis. This functionality is enabling meta-analyses of tens of millions of cells across studies and tissues, providing global views of human cells at single-cell resolution.

Language: English

Cited by

66