Relating enhancer genetic variation across mammals to complex phenotypes using machine learning
Irene M. Kaplow, Alyssa J. Lawler, Daniel E. Schäffer, et al.

Science, Journal Year: 2023, Volume and Issue: 380(6643)

Published: April 27, 2023

Protein-coding differences between species often fail to explain phenotypic diversity, suggesting the involvement of genomic elements that regulate gene expression, such as enhancers. Identifying associations between enhancers and phenotypes is challenging because enhancer activity can be tissue-dependent and functionally conserved despite low sequence conservation. We developed the Tissue-Aware Conservation Inference Toolkit (TACIT) to associate candidate enhancers with species' phenotypes using predictions from machine learning models trained on specific tissues. Applying TACIT to motor cortex and parvalbumin-positive interneuron enhancers for neurological phenotypes revealed dozens of enhancer-phenotype associations, including brain size-associated enhancers that interact with genes implicated in microcephaly or macrocephaly. TACIT provides a foundation for identifying enhancers associated with the evolution of any convergently evolved phenotype using a large group of species with aligned genomes.

Language: English

ArchR is a scalable software package for integrative single-cell chromatin accessibility analysis
Jeffrey M. Granja, M. Ryan Corces, Sarah E. Pierce, et al.

Nature Genetics, Journal Year: 2021, Volume and Issue: 53(3), P. 403 - 411

Published: Feb. 25, 2021

Abstract: The advent of single-cell chromatin accessibility profiling has accelerated the ability to map gene regulatory landscapes but has outpaced the development of scalable software to rapidly extract biological meaning from these data. Here we present a software suite for single-cell analysis of regulatory chromatin in R (ArchR; https://www.archrproject.com/) that enables fast and comprehensive analysis of single-cell chromatin accessibility data. ArchR provides an intuitive, user-focused interface for complex single-cell analyses, including doublet removal, single-cell clustering and cell type identification, unified peak set generation, cellular trajectory identification, DNA element-to-gene linkage, transcription factor footprinting, mRNA expression level prediction from chromatin accessibility and multi-omic integration with single-cell RNA sequencing (scRNA-seq). Enabling the analysis of over 1.2 million single cells within 8 h on a standard Unix laptop, ArchR is a comprehensive software suite for end-to-end analysis of single-cell chromatin accessibility that will accelerate the understanding of gene regulation at the resolution of individual cells.

Language: English

Citations: 1001

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome
Yanrong Ji, Zhihan Zhou, Han Liu, et al.

Bioinformatics, Journal Year: 2021, Volume and Issue: 37(15), P. 2112 - 2120

Published: Feb. 3, 2021

Abstract

Motivation: Deciphering the language of non-coding DNA is one of the fundamental problems in genome research. Gene regulatory code is highly complex due to the existence of polysemy and distant semantic relationships, which previous informatics methods often fail to capture, especially in data-scarce scenarios.

Results: To address this challenge, we developed a novel pre-trained bidirectional encoder representation, named DNABERT, to capture a global and transferable understanding of genomic DNA sequences based on upstream and downstream nucleotide contexts. We compared DNABERT to the most widely used programs for genome-wide regulatory element prediction and demonstrate its ease of use, accuracy and efficiency. We show that the single pre-trained transformers model can simultaneously achieve state-of-the-art performance on prediction of promoters, splice sites and transcription factor binding sites, after easy fine-tuning using small task-specific labeled data. Further, DNABERT enables direct visualization of nucleotide-level importance and semantic relationships within input sequences for better interpretability and accurate identification of conserved sequence motifs and functional genetic variant candidates. Finally, we demonstrate that DNABERT pre-trained with the human genome can even be readily applied to other organisms with exceptional performance. We anticipate that the pre-trained DNABERT model can be fine-tuned to many other sequence analysis tasks.

Availability and implementation: The source code and the pretrained and finetuned models for DNABERT are available at GitHub (https://github.com/jerryji1993/DNABERT).

Supplementary information: Supplementary data are available at Bioinformatics online.

Language: English

Citations: 613

Modulation of cellular processes by histone and non-histone protein acetylation
Maria Shvedunova, Asifa Akhtar

Nature Reviews Molecular Cell Biology, Journal Year: 2022, Volume and Issue: 23(5), P. 329 - 349

Published: Jan. 18, 2022

Language: English

Citations: 541

Enhancer RNAs are an important regulatory layer of the epigenome
Vittorio Sartorelli, Shannon Lauberth

Nature Structural & Molecular Biology, Journal Year: 2020, Volume and Issue: 27(6), P. 521 - 528

Published: June 1, 2020

Language: English

Citations: 280

A cis-regulatory atlas in maize at single-cell resolution
Alexandre P. Marand, Zongliang Chen, Andrea Gallavotti, et al.

Cell, Journal Year: 2021, Volume and Issue: 184(11), P. 3041 - 3055.e21

Published: May 1, 2021

Language: English

Citations: 272

Mechanisms of enhancer action: the known and the unknown
Anil K. Panigrahi, Bert W. O’Malley

Genome Biology, Journal Year: 2021, Volume and Issue: 22(1)

Published: April 15, 2021

Abstract: Differential gene expression mechanisms ensure cellular differentiation and plasticity to shape ontogenetic and phylogenetic diversity of cell types. A key regulator of differential gene expression programs are the enhancers, the gene-distal cis-regulatory sequences that govern spatiotemporal and quantitative expression dynamics of target genes. Enhancers are widely believed to physically contact the target promoters to effect transcriptional activation. However, our understanding of the full complement of regulatory proteins and the definitive mechanics of enhancer action is incomplete. Here, we review recent findings to present some emerging concepts on enhancer action and also outline a set of outstanding questions.

Language: English

Citations: 271

The relationship between genome structure and function
A. Marieke Oudelaar, Douglas R. Higgs

Nature Reviews Genetics, Journal Year: 2020, Volume and Issue: 22(3), P. 154 - 168

Published: Nov. 24, 2020

Language: English

Citations: 253

Integrated intra- and intercellular signaling knowledge for multicellular omics analysis
Dénes Türei, Alberto Valdeolivas, Lejla Gul, et al.

Molecular Systems Biology, Journal Year: 2021, Volume and Issue: 17(3)

Published: March 1, 2021

Molecular knowledge of biological processes is a cornerstone in omics data analysis. Applied to single-cell data, such analyses provide mechanistic insights into individual cells and their interactions. However, knowledge of intercellular communication is scarce, scattered across resources, and not linked to intracellular processes. To address this gap, we combined over 100 resources covering interactions and roles of proteins in inter- and intracellular signaling, as well as transcriptional and post-transcriptional regulation. We added protein complex information and annotations on function, localization, and role in diseases for each protein. The resource is available for human, and via homology translation for mouse and rat. The data are accessible via OmniPath's web service (https://omnipathdb.org/), a Cytoscape plug-in, and packages in R/Bioconductor and Python, providing access options for computational and experimental scientists. We created workflows with tutorials to facilitate the analysis of cell-cell interactions and affected downstream intracellular signaling processes. OmniPath provides a single access point to knowledge spanning intra- and intercellular processes for data analysis, as we demonstrate in applications studying SARS-CoV-2 infection and ulcerative colitis.

Language: English

Citations: 248

Cis-regulatory sequences in plants: Their importance, discovery, and future challenges
Robert J. Schmitz, Erich Grotewold, Maike Stam, et al.

The Plant Cell, Journal Year: 2021, Volume and Issue: 34(2), P. 718 - 741

Published: Nov. 18, 2021

The identification and characterization of cis-regulatory DNA sequences and how they function to coordinate responses to developmental and environmental cues is of paramount importance to plant biology. Key to these regulatory processes are cis-regulatory modules (CRMs), which include enhancers and silencers. Despite the extraordinary advances in high-quality sequence assemblies and genome annotations, the identification and understanding of CRMs, and how they regulate gene expression, lag significantly behind. This is especially true for their distinguishing characteristics and activity states. Here, we review the current knowledge on CRMs and breakthrough technologies enabling identification, characterization, and validation of CRMs; we compare the genomic distributions of CRMs with respect to their target genes between different plant species, and discuss the role of transposable elements harboring CRMs in the evolution of gene expression. This is an exciting time to study cis-regulomes in plants; however, significant existing challenges need to be overcome to fully understand and appreciate the role of CRMs in plant biology and in crop improvement.

Language: English

Citations: 237

Molecular and evolutionary processes generating variation in gene expression
Mark S. Hill, Pétra Vande Zande, Patricia J. Wittkopp, et al.

Nature Reviews Genetics, Journal Year: 2020, Volume and Issue: 22(4), P. 203 - 215

Published: Dec. 2, 2020

Language: English

Citations: 227