Deaminase-mediated chromatin accessibility profiling with single-allele resolution DOI Open Access
Tian Yu, Zhijian Li, Ellie Gibbs

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 20, 2024

Binding of transcription factors (TFs) at gene regulatory elements controls cellular epigenetic state and gene expression. Current genome-wide chromatin profiling approaches have inherently limited resolution, complicating assessment of TF occupancy and co-occupancy, especially at individual alleles. In this work, we introduce Accessible Chromatin by Cytosine Editing Site Sequencing with ATAC-seq (ACCESS-ATAC), which harnesses a double-stranded DNA cytosine deaminase (Ddd) enzyme to stencil TF binding locations within accessible chromatin regions. We optimize bulk and single-cell ACCESS-ATAC protocols and develop computational methods to show that the increased resolution compared to ATAC-seq improves the accuracy of TF binding site prediction. We use ACCESS-ATAC to perform genome-wide allelic TF occupancy and co-occupancy imputation for 64 TFs each in HepG2 and K562, revealing that the propensity of a majority of TFs to co-occupy nearby motifs oscillates with a period approximating the helical turn of DNA. Altogether, ACCESS-ATAC expands the capabilities of bulk and single-cell epigenomic profiling.

Language: English

Deciphering the impact of genomic variation on function DOI
J Engreitz, Heather A. Lawson, Harinder Singh

et al.

Nature, Journal Year: 2024, Volume and Issue: 633(8028), P. 47 - 57

Published: Sept. 4, 2024

Language: English

Citations

22

A foundation model of transcription across human cell types DOI Creative Commons
Xi Fu, Shentong Mo, Alejandro Buendia

et al.

Nature, Journal Year: 2025, Volume and Issue: 637(8047), P. 965 - 973

Published: Jan. 8, 2025

Transcriptional regulation, which involves a complex interplay between regulatory sequences and proteins, directs all biological processes. Computational models of transcription lack generalizability to accurately extrapolate to unseen cell types and conditions. Here we introduce GET (general expression transformer), an interpretable foundation model designed to uncover regulatory grammars across 213 human fetal and adult cell types. Relying exclusively on chromatin accessibility data and sequence information, GET achieves experimental-level accuracy in predicting gene expression even in previously unseen cell types. GET also shows remarkable adaptability across new sequencing platforms and assays, enabling regulatory inference across a broad range of cell types and conditions, and uncovers universal and cell-type-specific transcription factor interaction networks. We evaluated its performance in prediction of regulatory activity, inference of regulatory elements and regulators, and identification of physical interactions between transcription factors, and found that it outperforms current models in predicting lentivirus-based massively parallel reporter assay readout. In fetal erythroblasts, we identified distal (greater than 1 Mbp) regulatory regions that were missed by previous models, and, in B cells, we identified a lymphocyte-specific transcription factor-transcription factor interaction that explains the functional significance of a leukaemia risk predisposing germline mutation. In sum, we provide a generalizable and accurate model for transcription together with catalogues of gene regulation and transcription factor interactions, all with cell type specificity.

Language: English

Citations

8

ChromBPNet: bias factorized, base-resolution deep learning models of chromatin accessibility reveal cis-regulatory sequence syntax, transcription factor footprints and regulatory variants DOI Creative Commons
Anusri Pampari, Anna Shcherbina, Evgeny Z. Kvon

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 25, 2024

Despite extensive mapping of cis-regulatory elements (cREs) across cellular contexts with chromatin accessibility assays, the sequence syntax and genetic variants that regulate transcription factor (TF) binding and chromatin accessibility at context-specific cREs remain elusive. We introduce ChromBPNet, a deep learning DNA sequence model of base-resolution accessibility profiles that detects, learns and deconvolves assay-specific enzyme biases from regulatory sequence determinants of accessibility, enabling robust discovery of compact TF motif lexicons, cooperative motif syntax and precision footprints across assays and sequencing depths. Extensive benchmarks show that ChromBPNet, despite its lightweight design, is competitive with much larger contemporary models at predicting variant effects on chromatin accessibility, pioneer TF binding and reporter activity across assays, cell contexts and ancestry, while providing interpretation of disrupted regulatory syntax. ChromBPNet also helps prioritize and interpret regulatory variants that influence complex traits and rare diseases, thereby providing a powerful lens to decode regulatory DNA and genetic variation.

Language: English

Citations

11

Efficient, scalable, and near-nucleotide-resolution profiling of protein occupancy in the genome with deaminases DOI Creative Commons
Lei Chang, Bing Ren

Proceedings of the National Academy of Sciences, Journal Year: 2025, Volume and Issue: 122(5)

Published: Jan. 27, 2025

Language: English

Citations

1

ChromatinHD connects single-cell DNA accessibility and conformation to gene expression through scale-adaptive machine learning DOI Creative Commons
Wouter Saelens, Olga Pushkarev, Bart Deplancke

et al.

Nature Communications, Journal Year: 2025, Volume and Issue: 16(1)

Published: Jan. 2, 2025

Language: English

Citations

0

Single-cell technology for plant systems biology DOI
Sahand Amini, Sandra Thibivilliers, Andrew Farmer

et al.

Elsevier eBooks, Journal Year: 2025, Volume and Issue: unknown, P. 133 - 156

Published: Jan. 1, 2025

Language: English

Citations

0

Ocelli: an open-source tool for the analysis and visualization of developmental multimodal single-cell data DOI Creative Commons
Piotr Rutkowski, Marcin Tabaka

NAR Genomics and Bioinformatics, Journal Year: 2025, Volume and Issue: 7(2)

Published: March 29, 2025

The recent expansion of single-cell technologies has enabled simultaneous genome-wide measurements of multiple modalities in the same single cell. The potential to jointly profile such modalities as gene expression, chromatin accessibility, protein epitopes, or multiple histone modifications at single-cell resolution represents a compelling opportunity to study developmental processes at multiple layers of gene regulation. Here, we present Ocelli, a lightweight Python package implemented in Ray for scalable visualization and exploratory analysis of developmental multimodal single-cell data. The core functionality of Ocelli focuses on diffusion-based modeling of biological processes involving cell state transitions. Ocelli addresses common tasks in single-cell data analysis, such as visualization of cells on a low-dimensional embedding that preserves the continuity of the developmental progression of cells, identification of rare and transient cell states, integration with trajectory inference algorithms, and imputation of undetected feature counts. Extensive benchmarking shows that Ocelli outperforms existing methods regarding computational time and quality of the reconstructed low-dimensional representation of developmental data.

Language: English

Citations

0

Algorithms for a Commons Cell Atlas DOI Creative Commons
A. Sina Booeshaghi, Ángel Gálvez-Merchán, Lior Pachter

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: March 26, 2024

Cell atlas projects curate representative datasets, cell types, and marker genes for tissues across an organism. Despite their ubiquity, atlas projects rely on duplicated and manual effort to curate marker genes and annotate cell types. The size of atlases, coupled with a lack of data-compatible tools, makes reprocessing and analysis of their data near-impossible. To overcome these challenges, we present a collection of data, algorithms, and tools to automate cataloging and analyzing cell types across tissues in an organism, and demonstrate its utility in building a human atlas.

Language: English

Citations

3

Long-read sequencing transcriptome quantification with lr-kallisto DOI Creative Commons
Rebekah K. Loving, Delaney K. Sullivan, Fairlie Reese

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: July 19, 2024

RNA abundance quantification has become routine and affordable thanks to high-throughput “short-read” technologies that provide accurate molecule counts at the gene level. Similarly accurate and affordable quantification of definitive full-length transcript isoforms has remained a stubborn challenge, despite its obvious biological significance across a wide range of problems. “Long-read” sequencing platforms now produce data types that can, in principle, drive routine definitive isoform quantification. However, some particulars of contemporary long-read data types, together with isoform complexity and genetic variation, present bioinformatic challenges. We show here, using ONT data, that fast and accurate quantification of long-read data is possible and that it is improved by exome capture. To perform quantifications we developed lr-kallisto, which adapts the kallisto bulk and single-cell RNA-seq quantification methods for long-read technologies.

Language: English

Citations

3

Uniform quantification of single-nucleus ATAC-seq data with Paired-Insertion Counting (PIC) and a model-based insertion rate estimator DOI Creative Commons
Zhen Miao, Junhyong Kim

Nature Methods, Journal Year: 2023, Volume and Issue: 21(1), P. 32 - 36

Published: Dec. 4, 2023

Existing approaches to scoring single-nucleus assay for transposase-accessible chromatin with sequencing (snATAC-seq) feature matrices from sequencing reads are inconsistent, affecting downstream analyses and displaying artifacts. We show that, even with sparse single-cell data, quantitative counts are informative for estimating the regulatory state of a cell, which calls for a consistent treatment. We propose Paired-Insertion Counting as a uniform method for snATAC-seq feature characterization and provide a probability model for inferring latent insertion dynamics from snATAC-seq count matrices.

Language: English

Citations

8