
bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown
Published: Nov. 15, 2024
Abstract: Three-dimensional (3D) genome organization plays a critical role in gene expression regulation and function. Recent advances in Hi-C and Micro-C data across various species provide insights into the mechanisms governing 3D genome formation, such as loop extrusion. While visual patterns like topologically associating domains (TADs) and loops are conserved across species, the underlying biological mechanisms may differ. Both species-specific architectural factors and DNA sequences influence chromatin folding, complicating comparative studies on the evolution of the 3D genome. This work leverages existing data and machine learning to explore chromatin folding and predict 3D structures from DNA sequences. Here, we present Chimaera (convolutional neural network for Hi-C maps prediction using autoencoder for maps representation), a neural network that not only predicts Hi-C maps from DNA sequence, but also enables the search, quantification, and interpretation of associations between DNA sequences and 3D genome patterns. Firstly, we demonstrate the prediction of Hi-C or Micro-C contact maps from DNA sequences, enabling the extraction of key mechanisms. By exploring the latent representations generated by Chimaera, we offer a tool for building an unsupervised atlas of 3D genome features such as insulation, loops, stripes, and fountains/jets. We demonstrate the capabilities of Chimaera in detecting and quantifying signatures such as insulation and fountains in Hi-C data, applying it to well-characterized processes such as the cell cycle and embryogenesis. Additionally, we perform a targeted search for DNA sequence elements associated with specific 3D structures, advancing our understanding of genome organization. By extending Chimaera to multiple species, we confirm the role of CTCF in generating insulation in vertebrates and of BEAF-32 in Drosophila, and identify motifs previously reported in mouse. In Dictyostelium, Chimaera demonstrates the importance of gene arrangement and strand in structure formation, confirming the hypothesis about the impact of convergent gene positioning in this amoeba. A pronounced and diverse effect of genes is evident when predicting interactions in other organisms. Finally, we train the model on one species and then apply it to cross-predict how the genomes of other organisms might fold within the cellular environment of the original species. We thereby test whether folding mechanisms are transferable between species and reveal evolutionary similarities in a chromatin-based cluster tree of species ranging from plants to mammals.
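Key takeaways: Chimaera predicts genomic interaction maps from DNA sequence. Chimaera’s unique architecture integrates an encoder, allowing the search for recurring patterns and their quantification. It offers a unified metric for 3D-genome features and enables the exploration and validation of complex hypotheses regarding folding principles, including sequence-specific effects at protein binding sites and genes. It is applicable to tasks in biology, demonstrating its ability to predict the consequences of rearrangements, mutations, and insertions.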
Language: English