MethylQUEEN: A Methylation Encoded DNA Foundation Model DOI Creative Commons
Mingyang Li, Ruichu Gu, Shiyu Fan

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: Dec. 26, 2024

Abstract DNA 5-methylcytosine (5mC) modification plays a pivotal role in many biological processes, yet the information and patterns hidden behind 5mC remain to be explored. Here, we develop the Methylation Language Model based on Quintuple Bidirectional Transformer (MethylQUEEN), a novel pre-trained methylation foundation model capable of sensing methylation states covering the genome-wide landscape. Through tailored methylation-prone pre-training, MethylQUEEN effectively captures epigenetics within sequences: it accurately traces DNA’s tissue-of-origin and successfully recovers expression profiles through methylation states. Integrative analysis of MethylQUEEN’s attention scores also enables us to reveal the unique methylation status of tissues for precise disease detection, identifying key regulatory sites for intervention. As a result, MethylQUEEN signifies a new paradigm for various methylation-related problems. Besides, our study demonstrates the effectiveness of directly integrating methylation into the model, offering new perspectives and methodologies for a range of methylation-related processes. It serves as an initial exploration toward the development of more comprehensive epigenomic models.

Language: English

Deep learning and generative artificial intelligence in aging research and healthy longevity medicine DOI Creative Commons
Dominika Wilczok

Aging, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 16, 2025

With the global population aging at an unprecedented rate, there is a need to extend healthy productive life span. This review examines how Deep Learning (DL) and Generative Artificial Intelligence (GenAI) are used in biomarker discovery, deep aging clock development, geroprotector identification, and generation of dual-purpose therapeutics targeting aging and disease. The paper explores the emergence of multimodal, multitasking research systems, highlighting promising future directions for GenAI in human and animal aging research, as well as clinical application in healthy longevity medicine.

Language: English

Citations

1

MethylProphet: A Generalized Gene-Contextual Model for Inferring Whole-Genome DNA Methylation Landscape DOI Open Access
Xiaoke Huang, Qi Liu, Yifei Zhao

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 8, 2025

Abstract DNA methylation (DNAm), an epigenetic modification, regulates gene expression, influences phenotypes, and encodes inheritable information, making it critical for disease diagnosis, treatment, and prevention. While the human genome contains approximately 28 million CpG sites where DNAm can be measured, only 1–3% of these sites are typically available in most datasets due to complex experimental protocols and high costs, hindering insights from DNAm data. Leveraging the relationship between gene expression and DNAm offers promise for computational inference, but existing statistical, machine learning, and masking-based generative Transformers face limitations: they cannot infer DNAm at unmeasured CpGs or new samples effectively. To overcome these challenges, we introduce MethylProphet, a gene-guided, context-aware Transformer model designed for DNAm inference. MethylProphet employs a Bottleneck MLP for efficient gene profile compression and a specialized sequence tokenizer, integrating global gene expression patterns with local CpG context through a Transformer encoder architecture. Trained on whole-genome bisulfite sequencing data from ENCODE (1.6B training CpG-sample pairs; 322B tokens), MethylProphet demonstrates strong performance in hold-out evaluations, effectively inferring DNAm for unmeasured CpGs and new samples. In addition, its application to 10842 pairs from TCGA chromosome 1 (450M CpG-sample pairs; 91B tokens) highlights its potential to facilitate pan-cancer DNAm landscape inference, offering a powerful tool for advancing epigenetic research and precision medicine. All codes, data, protocols, and models are publicly available via https://github.com/xk-huang/methylprophet/ .

Language: English

Citations

0
