Bridging biomolecular modalities for knowledge transfer in bio-language models

Mangal Prakash, Artem Moskalev, Peter A. DiMaggio

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Oct. 17, 2024

Abstract: In biology, messenger RNA (mRNA) plays a crucial role in gene expression and protein synthesis. Accurate predictive modeling of mRNA properties can greatly enhance our understanding and manipulation of biological processes, leading to advancements in medical and biotechnological applications. Utilizing bio-language foundation models allows for leveraging large-scale pretrained knowledge, which can significantly improve the efficiency and accuracy of these predictions. However, mRNA-specific foundation models are notably limited, posing challenges for efficient modeling of mRNA-focused tasks. In contrast, the DNA and protein modalities have numerous general-purpose models trained on billions of sequences. This paper explores the potential of adapting existing models to mRNA tasks. Through experiments using various datasets curated from both the public domain and an internal proprietary database, we demonstrate that pre-trained models can be effectively transferred to mRNA tasks with techniques such as probing, full-rank, and low-rank finetuning. In addition, we identify key factors that influence successful adaptation, offering guidelines on when this transfer is likely to perform well. We further assess the impact of model size on transfer efficacy, finding that medium-scale models often outperform larger ones in cross-modal knowledge transfer. We conclude that the interconnectedness of DNA, mRNA, and proteins, as outlined in the central dogma of molecular biology, can be exploited to transfer knowledge across modalities, enhancing the repertoire of computational tools available for mRNA analysis.
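The adaptation techniques named in this abstract (probing a frozen encoder versus full-rank or low-rank finetuning) can be sketched generically. The Python snippet below is an illustrative sketch only, not the paper's code: the encoder, its hidden size, and the single regression target are placeholder assumptions.

```python
# Illustrative sketch (not the paper's code): probing vs. low-rank (LoRA-style)
# adaptation of a pretrained bio-language model for an mRNA property regression task.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base linear layer plus a trainable low-rank update: W x + (B A) x."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                 # base weights stay frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scale = alpha / rank

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

class ProbingRegressor(nn.Module):
    """Linear probe on mean-pooled embeddings from a frozen pretrained encoder."""
    def __init__(self, encoder: nn.Module, hidden_dim: int):
        super().__init__()
        self.encoder = encoder.eval()
        for p in self.encoder.parameters():
            p.requires_grad = False
        self.head = nn.Linear(hidden_dim, 1)        # one mRNA property, e.g. expression

    def forward(self, tokens):
        with torch.no_grad():
            h = self.encoder(tokens)                # assumed (batch, seq_len, hidden_dim)
        return self.head(h.mean(dim=1)).squeeze(-1)
```

In this sketch, probing trains only the small head, while low-rank finetuning would additionally wrap selected linear layers of the encoder in LoRALinear so that only the rank-limited updates are trained.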

Language: English

Scientific Large Language Models: A Survey on Biological & Chemical Domains
Qiang Zhang, Keyan Ding, Tingting Lv

et al.

ACM Computing Surveys, Year: 2025, Issue: unknown

Published: Jan. 26, 2025

Large Language Models (LLMs) have emerged as a transformative power in enhancing natural language comprehension, representing a significant stride toward artificial general intelligence. The application of LLMs extends beyond conventional linguistic boundaries, encompassing specialized systems developed within various scientific disciplines. This growing interest has led to the advent of scientific LLMs, a novel subclass specifically engineered for facilitating scientific discovery. As a burgeoning area in the community of AI for Science, scientific LLMs warrant comprehensive exploration. However, a systematic and up-to-date survey introducing them is currently lacking. In this paper, we endeavor to methodically delineate the concept of "scientific language", whilst providing a thorough review of the latest advancements in scientific LLMs. Given the expansive realm of scientific disciplines, our analysis adopts a focused lens, concentrating on the biological and chemical domains. This includes an in-depth examination of textual knowledge, small molecules, macromolecular proteins, genomic sequences, and their combinations, analyzed in terms of model architectures, capabilities, datasets, and evaluation. Finally, we critically examine the prevailing challenges and point out promising research directions along with recent advances. By offering a comprehensive overview of the technical developments in this field, this survey aspires to be an invaluable resource for researchers navigating its intricate landscape.

Language: English

Cited by

2

Machine learning in RNA structure prediction: Advances and challenges
Sicheng Zhang, Jun Li, Shi‐Jie Chen

et al.

Biophysical Journal, Year: 2024, Issue: 123(17), pp. 2647-2657

Published: Jan. 30, 2024

Language: English

Cited by

9

Deep learning for RNA structure prediction
Jiuming Wang, Yimin Fan, Liang Hong

et al.

Current Opinion in Structural Biology, Year: 2025, Issue: 91, pp. 102991-102991

Published: Feb. 10, 2025

Language: English

Cited by

1

RNA structure prediction using deep learning — A comprehensive review
Mayank Chaturvedi, Mahmood A. Rashid, Kuldip K. Paliwal

et al.

Computers in Biology and Medicine, Year: 2025, Issue: 188, pp. 109845-109845

Published: Feb. 20, 2025

In computational biology, accurate RNA structure prediction offers several benefits, including facilitating a better understanding of RNA functions and RNA-based drug design. Implementing deep learning techniques for RNA structure prediction has led to tremendous progress in this field, resulting in significant improvements in prediction accuracy. This comprehensive review aims to provide an overview of the diverse strategies employed in predicting RNA secondary structures, emphasizing deep learning methods. The article categorizes the discussion into three main dimensions: feature extraction methods, existing state-of-the-art model architectures, and prediction approaches. We present a comparative analysis of various models, highlighting their strengths and weaknesses. Finally, we identify gaps in the literature, discuss current challenges, and suggest future approaches to enhance model performance and applicability in RNA structure prediction tasks. This review provides deeper insight into the subject and paves the way for further work at the dynamic intersection of the life sciences and artificial intelligence.
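As a concrete illustration of the "feature extraction" dimension the review mentions, the sketch below shows a one-hot nucleotide encoding and its pairwise expansion, the kind of input many deep secondary-structure predictors consume. It is a generic example, not any specific model from the review.

```python
# Illustrative sketch: one-hot feature extraction for an RNA sequence and a pairwise
# feature map, a common input to 2D models that predict base-pairing probability maps.
import numpy as np

def one_hot_rna(seq: str) -> np.ndarray:
    """Encode an RNA sequence as an (L, 4) one-hot matrix over A, C, G, U."""
    alphabet = "ACGU"
    mat = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        if base in alphabet:                 # unknown bases stay all-zero
            mat[i, alphabet.index(base)] = 1.0
    return mat

def pairwise_features(x: np.ndarray) -> np.ndarray:
    """Outer concatenation producing an (L, L, 8) tensor of position-pair features."""
    L = x.shape[0]
    left = np.repeat(x[:, None, :], L, axis=1)
    right = np.repeat(x[None, :, :], L, axis=0)
    return np.concatenate([left, right], axis=-1)

feats = pairwise_features(one_hot_rna("GGGAAACCC"))
print(feats.shape)   # (9, 9, 8)
```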

Language: English

Cited by

1

Predicting RNA structures and functions by artificial intelligence
Jun Zhang, Mei Lang, Yaoqi Zhou

et al.

Trends in Genetics, Year: 2023, Issue: 40(1), pp. 94-107

Published: Oct. 26, 2023

Language: English

Cited by

18

ERNIE-RNA: An RNA Language Model with Structure-enhanced Representations
Weijie Yin, Zhaoyu Zhang, Liang He

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Mar. 17, 2024

Abstract: With large amounts of unlabeled RNA sequence data produced by high-throughput sequencing technologies, pre-trained RNA language models have been developed to estimate the semantic space of RNA molecules, which facilitates the understanding of the grammar of RNA language. However, existing models overlook the impact of structure when modeling this space, resulting in incomplete feature extraction and suboptimal performance across various downstream tasks. In this study, we developed a model named ERNIE-RNA (Enhanced Representations with base-pairing restriction for RNA modeling), based on a modified BERT (Bidirectional Encoder Representations from Transformers) that incorporates a base-pairing restriction with no MSA (Multiple Sequence Alignment) information. We found that the attention maps of ERNIE-RNA, even without fine-tuning, are able to capture RNA structure in a zero-shot experiment more precisely than conventional methods such as RNAfold and RNAstructure, suggesting that ERNIE-RNA can provide comprehensive structural representations. Furthermore, ERNIE-RNA achieved SOTA (state-of-the-art) performance after fine-tuning on various downstream tasks, including structural and functional predictions. In summary, our model provides general features that can be widely and effectively applied in subsequent research. Our results indicate that introducing key knowledge-based prior information into the BERT framework may be a useful strategy to enhance other pre-trained language models.
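The abstract describes restricting a BERT-style model with base-pairing information. One generic way to inject such a structural prior is as an additive bias on the self-attention logits; the sketch below illustrates that idea with a toy pairing-score matrix and is not ERNIE-RNA's actual formulation.

```python
# Illustrative sketch: injecting a base-pairing prior as an additive bias on
# self-attention logits (a generic realization of a "base-pairing restriction";
# not necessarily ERNIE-RNA's exact mechanism).
import numpy as np

def pairing_bias(seq: str, reward: float = 1.0, penalty: float = -1.0) -> np.ndarray:
    """(L, L) bias: reward Watson-Crick/wobble pairs, penalize everything else."""
    pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
    L = len(seq)
    bias = np.full((L, L), penalty, dtype=np.float32)
    for i in range(L):
        for j in range(L):
            if (seq[i], seq[j]) in pairs:
                bias[i, j] = reward
    return bias

def biased_attention(q, k, v, bias):
    """Scaled dot-product attention with an additive structural bias on the logits."""
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d) + bias
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

seq = "GGGAAACCC"
L, d = len(seq), 16
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((L, d)).astype(np.float32) for _ in range(3))
out = biased_attention(q, k, v, pairing_bias(seq))
print(out.shape)   # (9, 16)
```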

Language: English

Cited by

6

RNA-TorsionBERT: leveraging language models for RNA 3D torsion angles prediction
Clément Bernard, Guillaume Postic, Sahar Ghannay

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2024, Issue: unknown

Published: Jun. 9, 2024

Predicting the 3D structure of RNA is an ongoing challenge that has yet to be completely addressed despite continuous advancements. RNA 3D structures rely on distances between residues and base interactions, but also on backbone torsional angles. Knowing the torsion angles for each residue could help reconstruct its global folding, which is what we tackle in this work. This paper presents a novel approach for directly predicting torsion angles from raw sequence data. Our method draws inspiration from the successful application of language models in various domains and adapts them to RNA. We have developed a language-based model, RNA-TorsionBERT, that predicts torsion and pseudo-torsion angles from the sequence only. Through extensive benchmarking, we demonstrate that our model improves torsion angle prediction compared to state-of-the-art methods. In addition, using our predictive model, we inferred a torsion angle-dependent scoring function, called RNA-Torsion-A, which replaces the true reference angles with the model predictions. We show that it accurately evaluates the quality of near-native predicted structures in terms of torsion and pseudo-torsion angle values. This work demonstrates promising results, suggesting the potential utility of language models in advancing RNA 3D structure prediction. The source code is freely available on the EvryRNA platform: https://evryrna.ibisc.univ-evry.fr/evryrna/RNA-TorsionBERT.
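Because torsion angles are periodic, a common modeling trick is to predict a (sin, cos) pair per angle and recover the angle with atan2. The sketch below shows only that generic output head over per-residue embeddings; the embedding size and the number of angles are placeholders, not RNA-TorsionBERT's actual architecture.

```python
# Illustrative sketch: a per-residue head predicting torsion angles as (sin, cos)
# pairs and recovering angles with atan2 (a common periodic-angle parameterization).
import torch
import torch.nn as nn

N_ANGLES = 7   # placeholder count of backbone/pseudo-torsion angles per residue

class TorsionHead(nn.Module):
    def __init__(self, hidden_dim: int, n_angles: int = N_ANGLES):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, 2 * n_angles)   # (sin, cos) per angle
        self.n_angles = n_angles

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, seq_len, hidden_dim) from a sequence language model
        sincos = self.proj(embeddings).view(*embeddings.shape[:2], self.n_angles, 2)
        sin, cos = sincos[..., 0], sincos[..., 1]
        return torch.atan2(sin, cos)   # angles in radians, (batch, seq_len, n_angles)

head = TorsionHead(hidden_dim=128)
angles = head(torch.randn(2, 30, 128))
print(angles.shape)   # torch.Size([2, 30, 7])
```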

Language: English

Cited by

5

Transformers in RNA structure prediction: A review
Mayank Chaturvedi, Mahmood A. Rashid, Kuldip K. Paliwal

et al.

Computational and Structural Biotechnology Journal, Year: 2025, Issue: unknown

Published: Mar. 1, 2025

The Transformer is a deep neural network based on the self-attention mechanism, designed to handle sequential data. Given its tremendous advantages in natural language processing, it has gained traction for other applications. As the primary structure of RNA is a sequence of nucleotides, researchers have applied Transformers to predict secondary and tertiary structures from sequences. The number of Transformer-based models for RNA structure prediction tasks is rapidly increasing, as they have performed on par with or better than other deep learning networks, such as Convolutional and Recurrent Neural Networks. This article thoroughly examines these models. Through an in-depth analysis, we aim to explain how their architectural innovations improve their performance and what they still lack. As these techniques continue to evolve, this review serves both as a record of past achievements and as a guide to future research avenues.
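Since the review's object of study is the self-attention mechanism itself, a minimal rendering of scaled dot-product self-attention over a sequence of nucleotide embeddings may help fix the terminology. This is the textbook formulation, not any specific model covered by the review; the dimensions are arbitrary.

```python
# Illustrative sketch: textbook scaled dot-product self-attention over a sequence
# of nucleotide embeddings (the mechanism the reviewed models build on).
import numpy as np

def self_attention(x: np.ndarray, wq, wk, wv) -> np.ndarray:
    """x: (L, d_model); wq/wk/wv: (d_model, d_k) projection matrices."""
    q, k, v = x @ wq, x @ wk, x @ wv
    logits = q @ k.T / np.sqrt(k.shape[-1])
    weights = np.exp(logits - logits.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over key positions
    return weights @ v                               # (L, d_k) contextual features

rng = np.random.default_rng(0)
L, d_model, d_k = 12, 32, 16
x = rng.standard_normal((L, d_model))
out = self_attention(x, *(rng.standard_normal((d_model, d_k)) for _ in range(3)))
print(out.shape)   # (12, 16)
```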

Language: English

Cited by

0

Artificial intelligence-driven plant bio-genomics research: a new era
Yang Lin, Hao Wang, Meiling Zou

et al.

Tropical Plants, Year: 2025, Issue: 4(1), pp. 0-0

Published: Jan. 1, 2025

Language: English

Cited by

0

ATOM-1: A Foundation Model for RNA Structure and Function Built on Chemical Mapping Data
Nicholas Boyd, Brandon Anderson, Brent Townshend

et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Issue: unknown

Published: Dec. 14, 2023

Abstract: RNA-based medicines and RNA-targeting drugs are emerging as promising new approaches for treating disease. Optimizing these therapeutics by naive experimental screening is a time-consuming and expensive process, while rational design requires an accurate understanding of the structure and function of RNA. To address this challenge, we present ATOM-1, the first RNA foundation model trained on chemical mapping data, enabled by data collection strategies purposely developed for machine learning training. Using small probe neural networks on top of ATOM-1 embeddings, we demonstrate that the model has developed rich internal representations of RNA. Trained on limited amounts of additional data, these probes achieve state-of-the-art accuracy on key structure and function prediction tasks, suggesting that this approach can enable the design of RNA therapies across the therapeutic landscape.
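The probing setup described here (small networks trained on top of frozen foundation-model embeddings) can be sketched generically. In the sketch below, the embedding dimension and the per-nucleotide chemical-reactivity target are placeholder assumptions, not ATOM-1's actual configuration.

```python
# Illustrative sketch: a small probe network on frozen foundation-model embeddings,
# predicting a per-nucleotide chemical-reactivity value (placeholder dimensions).
import torch
import torch.nn as nn

class ReactivityProbe(nn.Module):
    def __init__(self, embed_dim: int = 256, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, seq_len, embed_dim), assumed precomputed by a frozen model
        return self.mlp(embeddings).squeeze(-1)      # (batch, seq_len) reactivities

probe = ReactivityProbe()
frozen_embeddings = torch.randn(4, 50, 256)           # stand-in for frozen embeddings
loss = nn.functional.mse_loss(probe(frozen_embeddings), torch.rand(4, 50))
loss.backward()                                        # gradients flow only into the probe
print(loss.item())
```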

Language: English

Cited by

11