High order expression dependencies finely resolve cryptic states and subtypes in single cell data DOI Creative Commons
Abel Jansma, Yuelin Yao, Jareth C. Wolfe

et al.

Molecular Systems Biology, Year: 2025, Volume: unknown

Published: Jan. 2, 2025

Language: English

Best practices for single-cell analysis across modalities DOI Open Access
Lukas Heumos, Anna C. Schaar, Christopher Lance

et al.

Nature Reviews Genetics, Year: 2023, Volume: 24(8), pp. 550-572

Published: March 31, 2023

Language: English

Cited by: 513

Dissection of artifactual and confounding glial signatures by single-cell sequencing of mouse and human brain DOI Creative Commons
Samuel E. Marsh, Alec J. Walker, Tushar Kamath

et al.

Nature Neuroscience, Year: 2022, Volume: 25(3), pp. 306-316

Published: March 1, 2022

Language: English

Cited by: 268

The specious art of single-cell genomics DOI Creative Commons
Tara Chari, Lior Pachter

PLoS Computational Biology, Year: 2023, Volume: 19(8), e1011288

Published: Aug. 17, 2023

Dimensionality reduction is standard practice for filtering noise and identifying relevant features in large-scale data analyses. In biology, single-cell genomics studies typically begin with dimensionality reduction to 2 or 3 dimensions to produce "all-in-one" visuals of the data that are amenable to the human eye, and these are subsequently used for qualitative and quantitative exploratory analysis. However, there is little theoretical support for this practice, and we show that extreme dimension reduction, from hundreds or thousands of dimensions to 2, inevitably induces significant distortion of high-dimensional datasets. We therefore examine the practical implications of low-dimensional embedding of single-cell data and find that extensive distortions and inconsistent practices make such embeddings counter-productive for exploratory, biological analyses. In lieu of this, we discuss alternative approaches for conducting targeted embedding and feature exploration to enable hypothesis-driven biological discovery.

Language: English

Cited by: 180

Principal Component Analyses (PCA)-based findings in population genetic studies are highly biased and must be reevaluated DOI Creative Commons
Eran Elhaik

Scientific Reports, Year: 2022, Volume: 12(1)

Published: Aug. 29, 2022

Principal Component Analysis (PCA) is a multivariate analysis that reduces the complexity of datasets while preserving data covariance. The outcome can be visualized on colorful scatterplots, ideally with only a minimal loss of information. PCA applications, implemented in well-cited packages like EIGENSOFT and PLINK, are extensively used as the foremost analyses in population genetics and related fields (e.g., animal and plant or medical genetics). PCA outcomes are used to shape study design, identify and characterize individuals and populations, and draw historical and ethnobiological conclusions on origins, evolution, dispersion, and relatedness. The replicability crisis in science has prompted us to evaluate whether PCA results are reliable, robust, and replicable. We analyzed twelve common test cases using an intuitive color-based model alongside human population data. We demonstrate that PCA results can be artifacts of the data and can be easily manipulated to generate desired outcomes. PCA adjustment also yielded unfavorable outcomes in association studies. PCA results may not be as reliable, robust, or replicable as the field assumes. Our findings raise concerns about the validity of results reported in the population genetics literature and related fields that place a disproportionate reliance upon PCA outcomes and the insights derived from them. We conclude that PCA may have a biasing role in genetic investigations and that 32,000-216,000 genetic studies should be reevaluated. An alternative mixed-admixture population genetic model is discussed.

Language: English

Cited by: 139

RNA velocity unraveled DOI Creative Commons
Gennady Gorin, Meichen Fang, Tara Chari

et al.

PLoS Computational Biology, Year: 2022, Volume: 18(9), e1010492

Published: Sept. 12, 2022

We perform a thorough analysis of RNA velocity methods, with a view towards understanding the suitability of the various assumptions underlying popular implementations. In addition to providing a self-contained exposition of the underlying mathematics, we undertake simulations and perform controlled experiments on biological datasets to assess workflow sensitivity to parameter choices and underlying biology. Finally, we argue for a more rigorous approach to RNA velocity, and present a framework for Markovian analysis that points to directions for improvement and mitigation of current problems.

Language: English

Cited by: 117

Initial recommendations for performing, benchmarking and reporting single-cell proteomics experiments DOI Open Access
Laurent Gatto, Ruedi Aebersold, Jüergen Cox

et al.

Nature Methods, Year: 2023, Volume: 20(3), pp. 375-386

Published: March 1, 2023

Language: English

Cited by: 104

A single-cell time-lapse of mouse prenatal development from gastrula to birth DOI Creative Commons
Chengxiang Qiu, Beth Martin, Ian Welsh

et al.

Nature, Year: 2024, Volume: 626(8001), pp. 1084-1093

Published: Feb. 14, 2024

The house mouse (Mus musculus) is an exceptional model system, combining genetic tractability with close evolutionary affinity to humans. Mouse gestation lasts only 3 weeks, during which the genome orchestrates the astonishing transformation of a single-cell zygote into a free-living pup composed of more than 500 million cells. Here, to establish a global framework for exploring mammalian development, we applied optimized single-cell combinatorial indexing to profile the transcriptional states of 12.4 million nuclei from 83 embryos, precisely staged at 2- to 6-hour intervals spanning late gastrulation (embryonic day 8) to birth (postnatal day 0). From these data, we annotate hundreds of cell types and explore the ontogenesis of the posterior embryo during somitogenesis and of kidney, mesenchyme, retina and early neurons. We leverage the temporal resolution and sampling depth of these whole-embryo snapshots, together with published data from earlier timepoints, to construct a rooted tree of cell-type relationships that spans the entirety of prenatal development, from zygote to birth. Throughout this tree, we systematically nominate genes encoding transcription factors and other proteins as candidate drivers of the in vivo differentiation of hundreds of cell types. Remarkably, the most marked temporal shifts in cell states are observed within one hour of birth and presumably underlie the massive physiological adaptations that must accompany the successful transition of a mammalian fetus to life outside the womb.

Language: English

Cited by: 58

The benefits and pitfalls of machine learning for biomarker discovery DOI Creative Commons
Sandra Ng, Sara Masarone, David Watson

et al.

Cell and Tissue Research, Year: 2023, Volume: 394(1), pp. 17-31

Published: July 27, 2023

Prospects for the discovery of robust and reproducible biomarkers have improved considerably with the development of sensitive omics platforms that can enable measurement of biological molecules at an unprecedented scale. With technical barriers to success lowering, the challenge is now moving into the analytical domain. Genome-wide discovery presents a problem of scale and multiple testing, as standard statistical methods struggle to distinguish signal from noise in increasingly complex biological systems. Machine learning and AI methods are good at finding answers in large datasets, but they have a tendency to overfit solutions. It may be possible to find a local answer or mechanism in a specific patient sample or small group of samples, but this may not generalise to wider patient populations due to the high likelihood of false discovery. The rise of explainable AI offers to improve the opportunity for true discovery by providing explanations for predictions that can be explored mechanistically before proceeding to costly and time-consuming validation studies. This review aims to introduce some of the basic concepts of machine learning and AI for biomarker discovery, with a focus on post hoc explanation of predictions. To illustrate this, we consider how explainable AI has already been used successfully, and we explore a case study that applies AI to biomarker discovery in rheumatoid arthritis, demonstrating the accessibility of tools for AI and machine learning. We use this to illustrate and discuss some of the potential challenges and solutions that may enable AI to critically interrogate disease and response mechanisms.

Language: English

Cited by: 49

AnnoPRO: a strategy for protein function annotation based on multi-scale protein representation and a hybrid deep learning of dual-path encoding DOI Creative Commons
Lingyan Zheng, Shuiyang Shi, Mingkun Lu

et al.

Genome Biology, Year: 2024, Volume: 25(1)

Published: Feb. 1, 2024

Protein function annotation has been one of the longstanding issues in biological sciences, and various computational methods have been developed. However, the existing methods suffer from a serious long-tail problem, with a large number of GO families containing few annotated proteins. Herein, an innovative strategy named AnnoPRO was therefore constructed by enabling sequence-based multi-scale protein representation, dual-path protein encoding using pre-training, and function annotation by long short-term memory-based decoding. A variety of case studies based on different benchmarks were conducted, which confirmed the superior performance of AnnoPRO among available methods. Source code and models have been made freely available at: https://github.com/idrblab/AnnoPRO and https://zenodo.org/records/10012272

Language: English

Cited by: 36

Best practices for the execution, analysis, and data storage of plant single-cell/nucleus transcriptomics DOI Creative Commons
Carolin Grones, Thomas Eekhout, Dongbo Shi

et al.

The Plant Cell, Year: 2024, Volume: 36(4), pp. 812-828

Published: Jan. 17, 2024

Single-cell and single-nucleus RNA-sequencing technologies capture the expression of plant genes at an unprecedented resolution. Therefore, these technologies are gaining traction in plant molecular and developmental biology for elucidating the transcriptional changes across cell types in a specific tissue or organ, upon treatments, in response to biotic and abiotic stresses, or between genotypes. Despite the rapidly accelerating use of these technologies, collective and standardized experimental and analytical procedures to support the acquisition of high-quality data sets are still missing. In this commentary, we discuss common challenges associated with the use of single-cell transcriptomics in plants and propose general guidelines to improve reproducibility, quality, comparability, and interpretation and to make the data readily available to the community in this fast-developing field of research.

Language: English

Cited by: 25