
Molecular Systems Biology, Year: 2025, Vol. unknown
Published: Jan. 2, 2025
Language: English
Nature Reviews Genetics, Year: 2023, Vol. 24(8), pp. 550-572
Published: March 31, 2023
Language: English
Cited by: 513
Nature Neuroscience, Year: 2022, Vol. 25(3), pp. 306-316
Published: March 1, 2022
Language: English
Cited by: 268
PLoS Computational Biology, Year: 2023, Vol. 19(8), e1011288
Published: Aug. 17, 2023
Dimensionality reduction is standard practice for filtering noise and identifying relevant features in large-scale data analyses. In biology, single-cell genomics studies typically begin with reduction to 2 or 3 dimensions to produce "all-in-one" visuals of the data that are amenable to the human eye, and these are subsequently used for qualitative and quantitative exploratory analysis. However, there is little theoretical support for this practice, and we show that extreme dimension reduction, from hundreds or thousands of dimensions to 2, inevitably induces significant distortion of high-dimensional datasets. We therefore examine the practical implications of low-dimensional embedding and find that extensive distortions and inconsistent practices make such embeddings counter-productive for exploratory, biological analyses. In lieu of this, we discuss alternative approaches for conducting targeted feature exploration to enable hypothesis-driven discovery.
Language: English
Cited by: 180
Scientific Reports, Year: 2022, Vol. 12(1)
Published: Aug. 29, 2022
Principal Component Analysis (PCA) is a multivariate analysis that reduces the complexity of datasets while preserving data covariance. The outcome can be visualized on colorful scatterplots, ideally with only a minimal loss of information. PCA applications, implemented in well-cited packages like EIGENSOFT and PLINK, are extensively used as the foremost analyses in population genetics and related fields (e.g., animal and plant or medical genetics). PCA outcomes are used to shape study design, identify and characterize individuals and populations, and draw historical and ethnobiological conclusions on origins, evolution, dispersion, and relatedness. The replicability crisis in science has prompted us to evaluate whether PCA results are reliable, robust, and replicable. We analyzed twelve common test cases using an intuitive color-based model alongside human population data. We demonstrate that PCA results can be artifacts of the data and can be easily manipulated to generate desired outcomes. PCA adjustment also yielded unfavorable outcomes in association studies. PCA results may not be as reliable, robust, or replicable as the field assumes. Our findings raise concerns about the validity of results reported in the population genetics literature and in related fields that place a disproportionate reliance upon PCA outcomes and the insights derived from them. We conclude that PCA may have a biasing role in genetic investigations and that 32,000-216,000 genetic studies should be reevaluated. An alternative mixed-admixture population genetic model is discussed.
Language: English
Cited by: 139
PLoS Computational Biology, Year: 2022, Vol. 18(9), e1010492
Published: Sept. 12, 2022
We perform a thorough analysis of RNA velocity methods, with a view towards understanding the suitability of the various assumptions underlying popular implementations. In addition to providing a self-contained exposition of the underlying mathematics, we undertake simulations and controlled experiments on biological datasets to assess workflow sensitivity to parameter choices and underlying biology. Finally, we argue for a more rigorous approach to RNA velocity, and present a framework for Markovian analysis that points to directions for improvement and mitigation of current problems.
Language: English
Cited by: 117
Nature Methods, Year: 2023, Vol. 20(3), pp. 375-386
Published: March 1, 2023
Language: English
Cited by: 104
Nature, Year: 2024, Vol. 626(8001), pp. 1084-1093
Published: Feb. 14, 2024
The house mouse (Mus musculus) is an exceptional model system, combining genetic tractability with close evolutionary affinity to humans [1,2]. Mouse gestation lasts only 3 weeks, during which the genome orchestrates the astonishing transformation of a single-cell zygote into a free-living pup composed of more than 500 million cells. Here, to establish a global framework for exploring mammalian development, we applied optimized single-cell combinatorial indexing to profile the transcriptional states of 12.4 million nuclei from 83 embryos, precisely staged at 2- to 6-hour intervals spanning late gastrulation (embryonic day 8) to birth (postnatal day 0). From these data, we annotate hundreds of cell types and explore the ontogenesis of the posterior embryo during somitogenesis and of kidney, mesenchyme, retina and early neurons. We leverage the temporal resolution and sampling depth of these whole-embryo snapshots, together with published data [4–8] from earlier timepoints, to construct a rooted tree of cell-type relationships that spans the entirety of prenatal development, from zygote to birth. Throughout this tree, we systematically nominate genes encoding transcription factors and other proteins as candidate drivers of the in vivo differentiation of hundreds of cell types. Remarkably, the most marked shifts are observed within one hour of birth and presumably underlie the massive physiological adaptations that must accompany the successful transition of a mammalian fetus to life outside the womb.
Language: English
Cited by: 58
Cell and Tissue Research, Year: 2023, Vol. 394(1), pp. 17-31
Published: July 27, 2023
Prospects for the discovery of robust and reproducible biomarkers have improved considerably with the development of sensitive omics platforms that can enable measurement of biological molecules at an unprecedented scale. With technical barriers to success lowering, the challenge is now moving into the analytical domain. Genome-wide discovery presents a problem of scale and multiple testing, as standard statistical methods struggle to distinguish signal from noise in increasingly complex biological systems. Machine learning and AI methods are good at finding answers in large datasets, but they have a tendency to overfit solutions. It may be possible to find a local answer or mechanism specific to a patient sample or a small group of samples, but this may not generalise to wider patient populations due to the high likelihood of false discovery. The rise of explainable AI offers to improve the opportunity for true discovery by providing explanations for predictions that can be explored mechanistically before proceeding to costly and time-consuming validation studies. This review aims to introduce some of the basic concepts of machine learning and AI for biomarker discovery, with a focus on post hoc explanation of predictions. To illustrate this, we consider how explainable AI has already been used successfully, and we explore a case study that applies AI to rheumatoid arthritis, demonstrating the accessibility of tools for AI and machine learning. We use this to discuss the potential challenges and solutions for explainable AI to critically interrogate disease and response mechanisms.
Language: English
Cited by: 49
Genome Biology, Year: 2024, Vol. 25(1)
Published: Feb. 1, 2024
Protein function annotation has been one of the longstanding issues in biological sciences, and various computational methods have been developed. However, the existing methods suffer from a serious long-tail problem, with a large number of GO families containing few annotated proteins. Herein, an innovative strategy named AnnoPRO was therefore constructed by enabling sequence-based multi-scale protein representation, dual-path protein encoding using pre-training, and long short-term memory-based decoding. A variety of case studies based on different benchmarks were conducted, which confirmed the superior performance of AnnoPRO among available methods. Source code and models have been made freely available at: https://github.com/idrblab/AnnoPRO and https://zenodo.org/records/10012272
Language: English
Cited by: 36
The Plant Cell, Year: 2024, Vol. 36(4), pp. 812-828
Published: Jan. 17, 2024
Single-cell and single-nucleus RNA-sequencing technologies capture the expression of plant genes at an unprecedented resolution. Therefore, these technologies are gaining traction in plant molecular and developmental biology for elucidating the transcriptional changes across cell types in a specific tissue or organ, upon treatments, in response to biotic and abiotic stresses, or between genotypes. Despite the rapidly accelerating use of these technologies, collective and standardized experimental and analytical procedures to support the acquisition of high-quality data sets are still missing. In this commentary, we discuss common challenges associated with the use of single-cell transcriptomics in plants and propose general guidelines to improve reproducibility, quality, comparability, and interpretation, and to make the data readily available to the community in this fast-developing field of research.
Language: English
Cited by: 25