On the Mathematics of RNA Velocity II: Algorithmic Aspects
Tiejun Li, Yizhuo Wang, Guoguo Yang

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: June 11, 2023

In a previous paper [CSIAM Trans. Appl. Math. 2 (2021), 1-55], the authors proposed a theoretical framework for the analysis of RNA velocity, which is a promising concept in scRNA-seq data analysis to reveal the cell state-transition dynamical processes underlying snapshot data. The current paper is devoted to the algorithmic study of some key components of the RNA velocity workflow. Four important points are addressed in this paper: (1) We construct a rational time-scale fixation method which can determine the global gene-shared latent time for cells. (2) We present an uncertainty quantification strategy for the inferred parameters obtained through the EM algorithm. (3) We establish the optimal criterion for the choice of kernel bandwidth with respect to the sample size in the downstream analysis and discuss its implications. (4) We propose a temporal distance estimation approach between two clusters along the cellular development path. Some illustrative numerical tests are also carried out to verify our analysis. These results are intended to provide tools and insights for the further development of RNA velocity type methods in the future.

Language: English

RNA velocity unraveled
Gennady Gorin, Meichen Fang, Tara Chari

et al.

PLoS Computational Biology, Journal Year: 2022, Volume and Issue: 18(9), P. e1010492 - e1010492

Published: Sept. 12, 2022

We perform a thorough analysis of RNA velocity methods, with a view towards understanding the suitability of the various assumptions underlying popular implementations. In addition to providing a self-contained exposition of the underlying mathematics, we undertake simulations and perform controlled experiments on biological datasets to assess workflow sensitivity to parameter choices and underlying biology. Finally, we argue for a more rigorous approach to RNA velocity, and present a framework for Markovian analysis that points to directions for improvement and mitigation of current problems.

Language: English

Citations

117

Interpretable and tractable models of transcriptional noise for the rational design of single-molecule quantification experiments
Gennady Gorin, John J. Vastola, Meichen Fang

et al.

Nature Communications, Journal Year: 2022, Volume and Issue: 13(1)

Published: Dec. 9, 2022

The question of how cell-to-cell differences in transcription rate affect RNA count distributions is fundamental for understanding the biological processes underlying transcription. Answering this question requires quantitative models that are both interpretable (describing concrete biophysical phenomena) and tractable (amenable to mathematical analysis). This enables the identification of experiments which best discriminate between competing hypotheses. As a proof of principle, we introduce a simple but flexible class of models involving a continuous stochastic transcription rate driving a discrete RNA transcription and splicing process, and compare and contrast two biologically plausible hypotheses about transcription rate variation. One assumes variation is due to DNA experiencing mechanical strain, while the other assumes it is due to regulator number fluctuations. We introduce a framework for numerically and analytically studying such models, and apply Bayesian model selection to identify candidate genes that show signatures of each model in single-cell transcriptomic data from mouse glutamatergic neurons.

Language: English

Citations

41

Genome-wide inference reveals that feedback regulations constrain promoter-dependent transcriptional burst kinetics
Songhao Luo, Zihao Wang, Zhenquan Zhang

et al.

Nucleic Acids Research, Journal Year: 2022, Volume and Issue: 51(1), P. 68 - 83

Published: Dec. 30, 2022

Gene expression in mammalian cells is highly variable and episodic, resulting in a series of discontinuous bursts of mRNAs. A challenge is to understand how static promoter architecture and dynamic feedback regulations dictate bursting on a genome-wide scale. Although single-cell RNA sequencing (scRNA-seq) provides an opportunity to address this challenge, effective analytical methods are scarce. We developed an interpretable and scalable inference framework, which combined experimental data with a mechanistic model to infer transcriptional burst kinetics (sizes and frequencies) and feedback regulations. Applying this framework to scRNA-seq data generated from embryonic mouse fibroblast cells, we found Simpson's paradoxes, i.e. genome-wide burst kinetics exhibit different characteristics in the two cases without and with distinguishing feedback regulations. We also showed that feedbacks differently modulate burst frequencies and sizes and conceal the effects of transcription start site distributions on burst kinetics. Notably, only in the presence of positive feedback, TATA genes are expressed with high burst frequencies and enhancer-promoter interactions mainly modulate burst frequencies. The developed inference method provided a flexible and efficient way to investigate transcriptional burst kinetics, and the obtained results would be helpful for understanding cell development and fate decision.

Language: English

Citations

32

Single-cell and long-read sequencing to enhance modelling of splicing and cell-fate determination
Siyuan Wu, Ulf Schmitz

Computational and Structural Biotechnology Journal, Journal Year: 2023, Volume and Issue: 21, P. 2373 - 2380

Published: Jan. 1, 2023

Single-cell sequencing technologies have revolutionised the life sciences and biomedical research. Single-cell sequencing provides high-resolution data on cell heterogeneity, allowing high-fidelity cell type identification and lineage tracking. Computational algorithms and mathematical models have been developed to make sense of the data, compensate for errors and simulate the biological processes, which has led to breakthroughs in our understanding of cell differentiation, cell-fate determination and tissue cell composition. The development of long-read (a.k.a. third-generation) sequencing technologies has produced powerful tools for investigating alternative splicing, isoform expression (at the RNA level), genome assembly and the detection of complex structural variants (at the DNA level). In this review, we provide an overview of the recent advancements in single-cell and long-read sequencing technologies, with a particular focus on the computational algorithms that help in correcting, analysing, and interpreting the resulting data. Additionally, we review some mathematical models that use single-cell and long-read sequencing data to study cell-fate determination and alternative splicing, respectively. Moreover, we highlight emerging opportunities in modelling cell-fate determination that result from the combination of single-cell and long-read sequencing technologies.

Language: English

Citations

20

Inferring transcriptional bursting kinetics from single-cell snapshot data using a generalized telegraph model
Songhao Luo, Zhenquan Zhang, Zihao Wang

et al.

Royal Society Open Science, Journal Year: 2023, Volume and Issue: 10(4)

Published: April 1, 2023

Gene expression has inherent stochasticity resulting from transcription's burst manners. Single-cell snapshot data can be exploited to rigorously infer transcriptional burst kinetics, using mathematical models as blueprints. The classical telegraph model (CTM) has been widely used to explain transcriptional bursting with Markovian assumptions. However, growing evidence suggests that the gene-state dwell times are generally non-exponential, as gene-state switching is a multi-step process in organisms. Therefore, interpretable non-Markovian mathematical models and efficient statistical inference methods are urgently required for investigating transcriptional burst kinetics. We develop an interpretable and tractable model, the generalized telegraph model (GTM), to characterize transcriptional bursting that allows arbitrary dwell-time distributions, rather than exponential distributions, to be incorporated into the ON and OFF switching process. Based on the GTM, we propose an inference method for transcriptional bursting kinetics using an approximate Bayesian computation framework. This method demonstrates efficient and scalable estimation of burst frequency and burst size on synthetic data. Further, the application of inference to genome-wide data from mouse embryonic fibroblasts reveals that the GTM would estimate lower burst frequency and higher burst size than those estimated by the CTM. In conclusion, the GTM and the corresponding inference method are effective tools to infer dynamic transcriptional bursting from static single-cell snapshot data.

Language: English

Citations

20

Studying stochastic systems biology of the cell with single-cell genomics data
Gennady Gorin, John J. Vastola, Lior Pachter

et al.

Cell Systems, Journal Year: 2023, Volume and Issue: 14(10), P. 822 - 843.e22

Published: Sept. 25, 2023

Language: English

Citations

19

Length biases in single-cell RNA sequencing of pre-mRNA
Gennady Gorin, Lior Pachter

Biophysical Reports, Journal Year: 2022, Volume and Issue: 3(1), P. 100097 - 100097

Published: Dec. 27, 2022

Single-cell RNA sequencing data can be modeled using Markov chains to yield genome-wide insights into transcriptional physics. However, quantitative inference with such data requires careful assessment of noise sources. We find that long pre-mRNA transcripts are over-represented in sequencing data. To explain this trend, we propose a length-based model of capture bias, which may produce false-positive observations. We solve this model and use it to find concordant parameter trends as well as systematic, mechanistically interpretable technical and biological differences in paired data sets.

Language: English

Citations

23

Quantifying and correcting bias in transcriptional parameter inference from single-cell data
Ramon Grima, Pierre-Marie Esmenjaud

Biophysical Journal, Journal Year: 2023, Volume and Issue: 123(1), P. 4 - 30

Published: Oct. 27, 2023

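The snapshot distribution of mRNA counts per cell can be measured using single-molecule fluorescence in situ hybridization or single-cell RNA sequencing. These distributions are often fit to the steady-state distribution of the two-state telegraph model to estimate the three transcriptional parameters for a gene of interest: the mRNA synthesis rate, the switching on rate (the on state being the active transcriptional state), and the switching off rate. This model assumes no extrinsic noise, i.e., parameters do not vary between cells, and thus estimated parameters are to be understood as approximating the average values in a population. The accuracy of this approximation is currently unclear. Here, we develop a theory that explains the size and sign of estimation bias when inferring parameters from single-cell data using the standard telegraph model. We find specific bias signatures depending on the source of extrinsic noise (which parameter is most variable across cells) and the mode of transcriptional activity. If gene expression is not bursty, then the population averages of all three parameters are overestimated if extrinsic noise is in the synthesis rate; underestimation occurs if extrinsic noise is in the switching on rate; both underestimation and overestimation can occur if extrinsic noise is in the switching off rate. We find that some estimated parameters tend to infinity as the size of extrinsic noise approaches a critical threshold. In contrast, when gene expression is bursty, we find that in all cases the mean burst size (ratio of the synthesis rate to the switching off rate) is overestimated while the mean burst frequency is underestimated. We estimate the size of extrinsic noise from the covariance matrix of sequencing data and use this together with our theory to correct published estimates of mean burst parameters for mammalian genes.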

Language: English

Citations

15
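
As an illustration of the quantities named in the abstract above, the following is a minimal Python sketch of the two-state telegraph model's summary statistics. It is not taken from the cited paper: the class `TelegraphParams`, the function names, and the degradation rate `k_deg` are hypothetical names introduced here. The burst size is taken as the ratio of the synthesis rate to the switching off rate, and the burst frequency is assumed to equal the switching on rate, which is the usual convention in the bursty regime. The steady-state mean and Fano factor are the standard textbook expressions for this model with no extrinsic noise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TelegraphParams:
    """Rates of the two-state telegraph model (hypothetical container).

    All rates share one time unit; k_deg is an assumed mRNA degradation
    rate that is not among the three parameters listed in the abstract.
    """
    k_on: float         # switching on rate (OFF -> ON)
    k_off: float        # switching off rate (ON -> OFF)
    k_syn: float        # mRNA synthesis rate while the gene is ON
    k_deg: float = 1.0  # mRNA degradation rate (assumption)


def burst_size(p: TelegraphParams) -> float:
    """Mean burst size: synthesis rate divided by switching off rate."""
    return p.k_syn / p.k_off


def burst_frequency(p: TelegraphParams) -> float:
    """Burst frequency, assumed equal to the switching on rate."""
    return p.k_on


def steady_state_mean(p: TelegraphParams) -> float:
    """Mean mRNA count: (k_syn / k_deg) times the fraction of time ON."""
    return (p.k_syn / p.k_deg) * p.k_on / (p.k_on + p.k_off)


def fano_factor(p: TelegraphParams) -> float:
    """Variance-to-mean ratio of the steady-state mRNA distribution."""
    s = p.k_on + p.k_off
    return 1.0 + p.k_syn * p.k_off / (s * (s + p.k_deg))


if __name__ == "__main__":
    params = TelegraphParams(k_on=1.0, k_off=3.0, k_syn=8.0)
    print("burst size:", burst_size(params))
    print("burst frequency:", burst_frequency(params))
    print("mean:", steady_state_mean(params))
    print("Fano factor:", fano_factor(params))
```

When `k_off` is much larger than `k_on` and `k_deg`, the Fano factor in this sketch approaches one plus the burst size, which is why burst size is often read off from overdispersion in snapshot data.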

Solving stochastic gene-expression models using queueing theory: A tutorial review
Juraj Szavits-Nossan, Ramon Grima

Biophysical Journal, Journal Year: 2024, Volume and Issue: 123(9), P. 1034 - 1057

Published: April 9, 2024

Stochastic models of gene expression are typically formulated using the chemical master equation, which can be solved exactly or approximately using a repertoire of analytical methods. Here, we provide a tutorial review of an alternative approach based on queueing theory that has rarely been used in the literature of gene expression. We discuss the interpretation of six types of infinite-server queues from the angle of stochastic single-cell biology and provide analytical expressions for the stationary and nonstationary distributions and/or moments of mRNA/protein numbers and bounds on the Fano factor. This approach may enable the solution of complex models that have hitherto evaded analytical solution.

Language: English

Citations

5

Biophysical modeling with variational autoencoders for bimodal, single-cell RNA sequencing data
Maria Carilli, Gennady Gorin, Yongin Choi

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: Jan. 14, 2023

We motivate and present biVI, which combines the variational autoencoder framework of scVI with biophysically motivated, bivariate models for nascent and mature RNA distributions. While previous approaches to integrate bimodal data via the variational autoencoder framework ignore the causal relationship between measurements, biVI models the biophysical processes that give rise to observations. We demonstrate through simulated benchmarking that biVI captures cell type structure in a low-dimensional space and accurately recapitulates parameter values and copy number distributions. On biological data, biVI provides a scalable route for identifying the biophysical mechanisms underlying gene expression. This analytical approach outlines a generalizable strategy for treating multimodal datasets generated by high-throughput, single-cell genomic assays.

Language: English

Citations

12