Deep learning modeling of rare noncoding genetic variants in human motor neurons defines CCDC146 as a therapeutic target for ALS DOI Creative Commons
Sai Zhang, Tobias Moll, Jasper Rubin-Sigler

et al.

medRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: April 1, 2024

Amyotrophic lateral sclerosis (ALS) is a fatal and incurable neurodegenerative disease caused by the selective and progressive death of motor neurons (MNs). Understanding the genetic and molecular factors influencing ALS survival is crucial for disease management and therapeutics. In this study, we introduce a deep learning-powered genetic analysis framework to link rare noncoding genetic variants to ALS survival. Using data from human induced pluripotent stem cell (iPSC)-derived MNs, this method prioritizes functional noncoding variants using deep learning, links cis-regulatory elements (CREs) to target genes using epigenomics data, and integrates these data through gene-level burden tests to identify survival-modifying variants, CREs, and genes. We apply this approach to analyze 6,715 ALS genomes and pinpoint four novel rare noncoding variants associated with survival, including chr7:76,009,472:C>T linked to CCDC146. CRISPR-Cas9 editing of this variant increases CCDC146 expression in iPSC-derived MNs and exacerbates ALS-specific phenotypes, including TDP-43 mislocalization. Suppressing CCDC146 with an antisense oligonucleotide (ASO), showing no toxicity, completely rescues ALS-associated defects in MNs derived from sporadic ALS patients and from carriers of the G4C2-repeat expansion within C9ORF72. ASO targeting of CCDC146 may be a broadly effective therapeutic approach for ALS. Our framework provides a generic and powerful approach for studying the noncoding genetics of complex diseases.

Language: English

JASPAR 2024: 20th anniversary of the open-access database of transcription factor binding profiles DOI Creative Commons
Ieva Rauluševičiūtė, Rafael Riudavets Puig, Romain Blanc‐Mathieu

et al.

Nucleic Acids Research, Journal Year: 2023, Volume and Issue: 52(D1), P. D174 - D182

Published: Nov. 14, 2023

JASPAR (https://jaspar.elixir.no/) is a widely used open-access database presenting manually curated, high-quality and non-redundant DNA-binding profiles for transcription factors (TFs) across taxa. In this 10th release and 20th-anniversary update, the CORE collection has expanded with 329 new profiles. We updated three existing profiles and provided orthogonal support for 72 profiles from the previous release's UNVALIDATED collection. Altogether, the JASPAR 2024 update provides a 20% increase in CORE profiles from the previous release. A trimming algorithm enhanced profiles by removing low information content flanking base pairs, which were likely uninformative (within the capacity of the PFM models) for TFBS predictions and modelling TF-DNA interactions. This release includes enhanced metadata, featuring a refined classification of plant TFs' structural DNA-binding domains. The new JASPAR collections prompt updates to the genomic tracks of predicted TF binding sites (TFBSs) in 8 organisms, with human and mouse tracks available as native tracks in the UCSC Genome browser. All data are available through the JASPAR web interface and programmatically through its API and the Bioconductor and pyJASPAR packages. Finally, a new TFBS extraction tool enables users to retrieve predicted JASPAR TFBSs intersecting their genomic regions of interest.
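
The flank-trimming idea in this abstract can be illustrated with a minimal Python sketch. It is not the JASPAR implementation: the function names, the 0.5-bit cutoff, and the uniform background are assumptions made here for illustration. Each position frequency matrix (PFM) column is scored by its information content in bits, and low-scoring columns are dropped from the two ends only.

```python
import math

def information_content(column):
    """Information content (bits) of one PFM column of A, C, G, T counts,
    assuming a uniform background; 0 * log2(0) is treated as 0."""
    total = sum(column)
    ic = 2.0  # maximum for a 4-letter alphabet
    for count in column:
        if count > 0:
            p = count / total
            ic += p * math.log2(p)
    return ic

def trim_flanks(pfm, min_ic=0.5):
    """Drop leading and trailing columns whose information content is
    below min_ic (a hypothetical cutoff). Interior columns are kept."""
    ics = [information_content(col) for col in pfm]
    start = 0
    while start < len(pfm) and ics[start] < min_ic:
        start += 1
    end = len(pfm)
    while end > start and ics[end - 1] < min_ic:
        end -= 1
    return pfm[start:end]

# Toy PFM: uninformative flanks around an informative core.
pfm = [
    [25, 25, 25, 25],
    [100, 0, 0, 0],
    [25, 25, 25, 25],  # interior low-information column, retained
    [0, 0, 100, 0],
    [30, 20, 25, 25],
]
trimmed = trim_flanks(pfm)
```

Only the flanks are trimmed, so a weakly informative column inside the motif core is preserved and the spacing between informative positions stays intact.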

Language: English

Citations

331

Gene regulatory network inference in the era of single-cell multi-omics DOI
Pau Badia-i-Mompel, Lorna Wessels, Sophia Müller‐Dott

et al.

Nature Reviews Genetics, Journal Year: 2023, Volume and Issue: 24(11), P. 739 - 754

Published: June 26, 2023

Language: English

Citations

187

Cell-type-directed design of synthetic enhancers DOI Creative Commons
Ibrahim Ihsan Taskiran, Katina I. Spanier, Hannah Dickmänken

et al.

Nature, Journal Year: 2023, Volume and Issue: 626(7997), P. 212 - 220

Published: Dec. 12, 2023

Transcriptional enhancers act as docking stations for combinations of transcription factors and thereby regulate spatiotemporal activation of their target genes.

Language: English

Citations

68

Benchmarking of deep neural networks for predicting personal gene expression from DNA sequence highlights shortcomings DOI
Alexander Sasse, Bernard Ng, Anna Spiro

et al.

Nature Genetics, Journal Year: 2023, Volume and Issue: 55(12), P. 2060 - 2064

Published: Nov. 30, 2023

Language: English

Citations

47

A fast, scalable and versatile tool for analysis of single-cell omics data DOI Creative Commons
Kai Zhang, Nathan R. Zemke, Ethan J. Armand

et al.

Nature Methods, Journal Year: 2024, Volume and Issue: 21(2), P. 217 - 227

Published: Jan. 8, 2024

Single-cell omics technologies have revolutionized the study of gene regulation in complex tissues. A major computational challenge in analyzing these datasets is to project the large-scale and high-dimensional data into low-dimensional space while retaining the relative relationships between cells. This low-dimensional embedding is necessary to decompose cellular heterogeneity and reconstruct cell-type-specific gene regulatory programs. Traditional dimensionality reduction techniques, however, face challenges in computational efficiency and in comprehensively addressing cellular diversity across varied molecular modalities. Here we introduce a nonlinear dimensionality reduction algorithm, embodied in the Python package SnapATAC2, which not only achieves a more precise capture of single-cell omics data heterogeneities but also ensures efficient runtime and memory usage, scaling linearly with the number of cells. Our algorithm demonstrates exceptional performance, scalability and versatility across diverse single-cell omics datasets, including single-cell assay for transposase-accessible chromatin using sequencing, single-cell RNA sequencing, single-cell Hi-C and single-cell multi-omics datasets, underscoring its utility in advancing single-cell analysis.

Language: English

Citations

45

Machine Learning and Deep Learning in Synthetic Biology: Key Architectures, Applications, and Challenges DOI Creative Commons
Manoj Kumar Goshisht

ACS Omega, Journal Year: 2024, Volume and Issue: 9(9), P. 9921 - 9945

Published: Feb. 19, 2024

Machine learning (ML), particularly deep learning (DL), has made rapid and substantial progress in synthetic biology in recent years. Biotechnological applications of biosystems, including pathways, enzymes, and whole cells, are being probed frequently with time. The intricacy and interconnectedness of biosystems make it challenging to design them with the desired properties. ML and DL have a synergy with synthetic biology. Synthetic biology can be employed to produce large data sets for training models (for instance, by utilizing DNA synthesis), and ML/DL models can be employed to inform design (for example, by generating new parts or advising unrivaled experiments to perform). This potential has recently been brought to light by research at the intersection of engineering biology and ML/DL through achievements like the design of novel biological components, best experimental design, automated analysis of microscopy data, protein structure prediction, and biomolecular implementations of ANNs (Artificial Neural Networks). I have divided this review into three sections. In the first section, I describe the predictive potential and basics of ML along with myriad applications in synthetic biology, especially in engineering cells, activity of proteins, and metabolic pathways. In the second section, I describe fundamental DL architectures and their applications in synthetic biology. Finally, I describe different challenges causing hurdles in the progress of ML/DL and synthetic biology, along with their solutions.

Language: English

Citations

27

Deciphering the impact of genomic variation on function DOI
J Engreitz, Heather A. Lawson, Harinder Singh

et al.

Nature, Journal Year: 2024, Volume and Issue: 633(8028), P. 47 - 57

Published: Sept. 4, 2024

Language: English

Citations

20

Single-cell multi-ome regression models identify functional and disease-associated enhancers and enable chromatin potential analysis DOI Creative Commons
Sneha Mitra, Rohan Malik, Wilfred Wong

et al.

Nature Genetics, Journal Year: 2024, Volume and Issue: 56(4), P. 627 - 636

Published: March 21, 2024

We present a gene-level regulatory model, single-cell ATAC + RNA linking (SCARlink), which predicts single-cell gene expression and links enhancers to target genes using multi-ome (scRNA-seq and scATAC–seq co-assay) sequencing data. The approach uses regularized Poisson regression on tile-level accessibility data to jointly model all regulatory effects at a gene locus, avoiding the limitations of pairwise gene–peak correlations and dependence on peak calling. SCARlink outperformed existing gene scoring methods for imputing gene expression from chromatin accessibility across high-coverage multi-ome datasets while giving comparable to improved performance on low-coverage datasets. Shapley value analysis on trained models identified cell-type-specific gene enhancers that are validated by promoter capture Hi-C and are 11× to 15× and 5× to 12× enriched in fine-mapped eQTLs and fine-mapped genome-wide association study (GWAS) variants, respectively. We further show that SCARlink-predicted and observed gene expression vectors provide a robust way to compute a chromatin potential vector field to enable developmental trajectory analysis.
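
As a rough illustration of the regression step named in this abstract, the sketch below fits an L2-penalized Poisson regression (log link) of one gene's counts on tile accessibility by plain gradient descent. It is not the SCARlink code: the function name, the choice of an L2 penalty, the optimizer settings, and the toy data are all assumptions made here.

```python
import math

def fit_poisson_l2(X, y, lam=0.01, lr=0.02, n_iter=10000):
    """Fit log(mu) = w0 + sum_j w_j * x_j by gradient descent on the mean
    Poisson negative log-likelihood plus lam * sum_j w_j**2.
    The intercept w0 is not penalized. Returns [w0, w_1, ..., w_p]."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, count in zip(X, y):
            eta = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            resid = math.exp(eta) - count  # predicted mean minus observed
            grad[0] += resid / n
            for j, xj in enumerate(row, start=1):
                grad[j] += resid * xj / n
        for j in range(1, p + 1):
            grad[j] += 2.0 * lam * w[j]  # L2 penalty gradient
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

# Hypothetical toy data: 6 cells, 2 tiles. Expression tracks tile 1 only.
tiles = [[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]]
counts = [1, 3, 6, 1, 3, 6]
weights = fit_poisson_l2(tiles, counts)
```

In this toy fit the coefficient for the first tile is clearly positive while the second stays near zero. Jointly modeling all tiles in this way is what lets a regression separate informative tiles from uninformative ones, rather than correlating each tile with expression in isolation.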

Language: English

Citations

18

Challenges and best practices in omics benchmarking DOI
Thomas G. Brooks, Nicholas F. Lahens, Antonijo Mrčela

et al.

Nature Reviews Genetics, Journal Year: 2024, Volume and Issue: 25(5), P. 326 - 339

Published: Jan. 12, 2024

Language: English

Citations

17

Interpreting non-coding disease-associated human variants using single-cell epigenomics DOI
Kyle J. Gaulton, Sebastian Preißl, Bing Ren

et al.

Nature Reviews Genetics, Journal Year: 2023, Volume and Issue: 24(8), P. 516 - 534

Published: May 9, 2023

Language: English

Citations

39