Universal Cell Embeddings: A Foundation Model for Cell Biology DOI Creative Commons
Yanay Rosen, Yusuf Roohani, Ayush Agrawal

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume: unknown

Published: Nov. 29, 2023

Developing a universal representation of cells which encompasses the tremendous molecular diversity of cell types within the human body and, more generally, across species, would be transformative for cell biology. Recent work using single-cell transcriptomic approaches to create molecular definitions of cell types in the form of cell atlases has provided the necessary data for such an endeavor. Here, we present the Universal Cell Embedding (UCE) foundation model. UCE was trained on a corpus of cell atlas data from human and other species in a completely self-supervised way without any data annotations. UCE offers a unified biological latent space that can represent any cell, regardless of tissue or species. This universal cell embedding captures important biological variation despite the presence of experimental noise across diverse datasets. An important aspect of UCE's universality is that any new cell from any organism can be mapped to this embedding space with no additional data labeling, model training or fine-tuning. We applied UCE to create the Integrated Mega-scale Atlas, embedding 36 million cells, with more than 1,000 uniquely named cell types, from hundreds of experiments, dozens of tissues and eight species. We uncovered new insights about the organization of cell types and tissues within this universal cell embedding space, and leveraged it to infer the function of newly discovered cell types. UCE's embedding space exhibits emergent behavior, uncovering new biology that it was never explicitly trained for, such as identifying developmental lineages and embedding data from novel species not included in the training set. Overall, by enabling a universal representation for every cell state and type, UCE provides a valuable tool for analysis, annotation and hypothesis generation as the scale and diversity of single-cell datasets continues to grow.

Language: English

Single-cell profiling to explore pancreatic cancer heterogeneity, plasticity and response to therapy DOI
Stefanie Bärthel, Chiara Falcomatà, Roland Rad

et al.

Nature Cancer, Year: 2023, Volume: 4(4), pp. 454-467

Published: March 23, 2023

Language: English

Cited by

49

Stabilized mosaic single-cell data integration using unshared features DOI Creative Commons
Shila Ghazanfar, Carolina Guibentif, John C. Marioni

et al.

Nature Biotechnology, Year: 2023, Volume: 42(2), pp. 284-292

Published: May 25, 2023

Currently available single-cell omics technologies capture many unique features with different biological information content. Data integration aims to place cells, captured with different technologies, onto a common embedding to facilitate downstream analytical tasks. Current horizontal data integration techniques use a set of common features, thereby ignoring non-overlapping features and losing information. Here we introduce StabMap, a mosaic data integration technique that stabilizes mapping of single-cell data by exploiting the non-overlapping features. StabMap first infers a mosaic data topology based on shared features, then projects all cells onto supervised or unsupervised reference coordinates by traversing shortest paths along the topology. We show that StabMap performs well in various simulation contexts, facilitates 'multi-hop' mosaic data integration where some datasets do not share any features, and enables the use of spatial gene expression features for mapping dissociated single-cell data onto a spatial transcriptomic reference.

Language: English

Cited by

49
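
The StabMap abstract above describes two steps: inferring a topology from the features that datasets share, then reaching a reference by traversing shortest paths along it. A minimal Python sketch of that idea follows. It is an illustration only, not the authors' implementation: the dataset names, feature names, and the helper functions `mosaic_topology` and `shortest_path` are hypothetical, and the graph is unweighted, which is a simplifying assumption.

```python
from collections import deque


def mosaic_topology(features_by_dataset):
    """Link two datasets whenever they share at least one feature."""
    names = list(features_by_dataset)
    graph = {name: set() for name in names}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if features_by_dataset[a] & features_by_dataset[b]:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search; returns the list of datasets on the path, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical toy datasets: "atac" and "rna" share no features directly.
datasets = {
    "rna": {"geneA", "geneB", "geneC"},
    "multiome": {"geneC", "peak1", "peak2"},
    "atac": {"peak2", "peak3"},
}
graph = mosaic_topology(datasets)
path = shortest_path(graph, "atac", "rna")
print(path)
```

In this toy case the query dataset reaches the reference only through an intermediate dataset, which is the 'multi-hop' situation the abstract mentions.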

CellSighter: a neural network to classify cells in highly multiplexed images DOI Creative Commons
Yael Amitay, Yuval Bussi, Ben Feinstein

et al.

Nature Communications, Year: 2023, Volume: 14(1)

Published: July 18, 2023

Multiplexed imaging enables measurement of multiple proteins in situ, offering an unprecedented opportunity to chart various cell types and states in tissues. However, cell classification, the task of identifying the type of individual cells, remains challenging and labor-intensive, limiting throughput. Here, we present CellSighter, a deep-learning based pipeline to accelerate cell classification in multiplexed images. Given a small training set of expert-labeled images, CellSighter outputs the label probabilities for all cells in new images. CellSighter achieves over 80% accuracy for major cell types across imaging platforms, which approaches inter-observer concordance. Ablation studies and simulations show that CellSighter is able to generalize its training data and learn features of protein expression levels, as well as spatial features such as subcellular expression patterns. CellSighter's design reduces overfitting, and it can be trained with only thousands or even hundreds of labeled examples. CellSighter also outputs a prediction confidence, allowing downstream experts control of the results. Altogether, CellSighter drastically reduces hands-on time for cell classification in multiplexed images, while improving accuracy and consistency across datasets.

Language: English

Cited by

48

Integration of Single-Cell RNA-Seq Datasets: A Review of Computational Methods DOI Open Access
Daehee Hwang, Geun Hee Han, Eun‐Soo Jung

et al.

Molecules and Cells, Year: 2023, Volume: 46(2), pp. 106-119

Published: Feb. 1, 2023

With the increased number of single-cell RNA sequencing (scRNA-seq) datasets in public repositories, integrative analysis of multiple scRNA-seq datasets has become commonplace. Batch effects among different datasets are inevitable because of differences in cell isolation and handling protocols, library preparation technology, and sequencing platforms. To remove these batch effects for effective integration of multiple scRNA-seq datasets, a number of methodologies have been developed based on diverse concepts and approaches. These methods have proven useful for examining whether cellular features, such as cell subpopulations and marker genes, identified from a certain dataset, are consistently present, or whether their condition-dependent variations, such as increases of cell subpopulations under particular disease-related conditions, are consistently observed in different datasets generated under similar or distinct conditions. In this review, we summarize the concepts and approaches of the integration methods and their pros and cons as reported in previous literature.

Language: English

Cited by

45

Universal Cell Embeddings: A Foundation Model for Cell Biology DOI Creative Commons
Yanay Rosen, Yusuf Roohani, Ayush Agrawal

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume: unknown

Published: Nov. 29, 2023

Language: English

Cited by

43