Exploring the latent space of transcriptomic data with topic modeling
Filippo Valle, Michele Caselle, Matteo Osella

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, Issue: unknown

Published: Nov. 3, 2024

Abstract: The availability of high-dimensional transcriptomic datasets is increasing at a tremendous pace, together with the need for suitable computational tools. Clustering and dimensionality reduction methods are popular go-to methods to identify basic structures in these datasets. At the same time, different topic modeling techniques have been developed to organize the deluge of available natural language data using their latent topical structure. This paper leverages the statistical analogies between text and transcriptomic datasets to compare different topic modeling methods when applied to gene expression data. Specifically, we test their accuracy in the specific task of discovering and reconstructing the tissue structure of the human transcriptome and of distinguishing healthy from cancerous tissues. We examine the properties of the latent space recovered by the different methods, and highlight their differences, pros and cons across tasks. Finally, we show that the latent topic space can be a useful embedding space, where a neural network classifier can annotate transcriptomic profiles with high accuracy.

Language: English

Exploring the latent space of transcriptomic data with topic modeling
Filippo Valle, Michele Caselle, Matteo Osella

et al.

NAR Genomics and Bioinformatics, Journal year: 2025, Issue: 7(2)

Published: March 29, 2025

Abstract: The availability of high-dimensional transcriptomic datasets is increasing at a tremendous pace, together with the need for suitable computational tools. Clustering and dimensionality reduction methods are popular go-to methods to identify basic structures in these datasets. At the same time, different topic modeling techniques have been developed to organize the deluge of available natural language data using their latent topical structure. This paper leverages the statistical analogies between text and transcriptomic datasets to compare different topic modeling methods when applied to gene expression data. Specifically, we test their accuracy in the specific task of discovering and reconstructing the tissue structure of the human transcriptome and of distinguishing healthy from cancerous tissues. We examine the properties of the latent space recovered by the different methods, and highlight their differences, pros and cons across tasks. We focus in particular on how priors can affect the results and their interpretability. Finally, we show that the latent topic space can be a useful low-dimensional embedding space, where a neural network classifier can annotate transcriptomic profiles with high accuracy.

Language: English

Cited by: 0

Exploring the latent space of transcriptomic data with topic modeling
Filippo Valle, Michele Caselle, Matteo Osella

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2024, Issue: unknown

Published: Nov. 3, 2024

Abstract: The availability of high-dimensional transcriptomic datasets is increasing at a tremendous pace, together with the need for suitable computational tools. Clustering and dimensionality reduction methods are popular go-to methods to identify basic structures in these datasets. At the same time, different topic modeling techniques have been developed to organize the deluge of available natural language data using their latent topical structure. This paper leverages the statistical analogies between text and transcriptomic datasets to compare different topic modeling methods when applied to gene expression data. Specifically, we test their accuracy in the specific task of discovering and reconstructing the tissue structure of the human transcriptome and of distinguishing healthy from cancerous tissues. We examine the properties of the latent space recovered by the different methods, and highlight their differences, pros and cons across tasks. Finally, we show that the latent topic space can be a useful embedding space, where a neural network classifier can annotate transcriptomic profiles with high accuracy.

Language: English

Cited by: 2