pyRBDome: A comprehensive computational platform for enhancing and interpreting RNA-binding proteome data
Liang‐Cui Chu, Niki Christopoulou, Hugh McCaughan, et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Issue: unknown

Published: Dec. 8, 2023

High-throughput proteomics approaches have revolutionised the identification of RNA-binding proteins (RBPome) and RNA-binding sequences (RBDome) across organisms. Yet the extent of noise, including false-positives, associated with these methodologies is difficult to quantify, as experimental approaches for validating the results are generally low throughput. To address this, we introduce pyRBDome, a pipeline for enhancing RNA-binding proteome data in silico. It aligns the experimental results with RNA-binding site (RBS) predictions from distinct machine learning tools and integrates high-resolution structural data when available. Its statistical evaluation of RBDome data enables quick identification of likely genuine RNA-binders in experimental datasets. Furthermore, by leveraging the pyRBDome results, we have enhanced the sensitivity and specificity of RBS detection through training new ensemble machine learning models. pyRBDome analysis of a human RBDome dataset, compared with known structural data, revealed that while UV cross-linked amino acids were more likely to contain predicted RBSs, they infrequently bind RNA in high-resolution structures. This discrepancy underscores the limitations of structural data as benchmarks, positioning pyRBDome as a valuable alternative for increasing confidence in RBDome datasets.

Language: English

HybridDBRpred: improved sequence-based prediction of DNA-binding amino acids using annotations from structured complexes and disordered proteins
Jian Zhang, Sushmita Basu, Lukasz Kurgan, et al.

Nucleic Acids Research, Year: 2023, Issue: 52(2), p. e10

Published: Dec. 4, 2023

Current predictors of DNA-binding residues (DBRs) from protein sequences belong to two distinct groups: those trained on binding annotations extracted from structured protein-DNA complexes (structure-trained) vs. intrinsically disordered proteins (disorder-trained). We complete the first empirical analysis of predictive performance across the structure- and disorder-annotated proteins for a representative collection of ten predictors. The majority of the structure-trained tools perform well on the structure-annotated proteins while doing relatively poorly on the disorder-annotated proteins, and vice versa. Several methods make accurate predictions for the structure-annotated proteins or the disorder-annotated proteins, but none performs highly accurately for both annotation types. Moreover, most predictors make excessive cross-predictions for the disorder-annotated proteins, where residues that interact with non-DNA ligand types are predicted as DBRs. Motivated by these results, we design, validate and deploy an innovative meta-model, hybridDBRpred, that uses a deep transformer network to combine predictions generated by the three best current predictors. HybridDBRpred provides accurate predictions and low levels of cross-predictions across the two annotation types, and is statistically more accurate than each of the ten tools and the baseline meta-predictors that rely on averaging and logistic regression. We deploy hybridDBRpred as a convenient web server at http://biomine.cs.vcu.edu/servers/hybridDBRpred/ and provide the corresponding source code at https://github.com/jianzhang-xynu/hybridDBRpred.
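The abstract mentions baseline meta-predictors that rely on averaging. The following is a minimal sketch of what such an averaging baseline might look like, not the authors' implementation: the function name `average_meta_predictor`, the 0.5 threshold, and the toy scores are all assumptions made for illustration.

```python
from statistics import mean


def average_meta_predictor(per_tool_scores, threshold=0.5):
    """Average per-residue binding propensities from several tools.

    per_tool_scores: list of equally long score lists, one per tool
    (hypothetical input format). Returns the averaged scores and the
    binary labels obtained by thresholding them.
    """
    lengths = {len(scores) for scores in per_tool_scores}
    if len(lengths) != 1:
        raise ValueError("all tools must score the same residues")
    # zip(*...) walks the residues position by position across tools.
    averaged = [mean(column) for column in zip(*per_tool_scores)]
    labels = [score >= threshold for score in averaged]
    return averaged, labels


# Toy propensities for a three-residue sequence (illustrative values only).
tool_a = [0.9, 0.2, 0.6]
tool_b = [0.7, 0.1, 0.2]
tool_c = [0.8, 0.3, 0.4]
averaged, labels = average_meta_predictor([tool_a, tool_b, tool_c])
```

A learned combiner such as the transformer-based meta-model described in the abstract would replace the fixed averaging step with a trained function of the same per-tool inputs.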

Language: English

Cited by

16

Improving prediction performance of general protein language model by domain-adaptive pretraining on DNA-binding protein
Wenwu Zeng, Yutao Dou, Liangrui Pan, et al.

Nature Communications, Year: 2024, Issue: 15(1)

Published: Sep. 7, 2024

Language: English

Cited by

6

Research progress on prediction of RNA-protein binding sites in the past five years
Yun Zuo, Huixian Chen, Lele Yang, et al.

Analytical Biochemistry, Year: 2024, Issue: 691, p. 115535

Published: April 20, 2024

Language: English

Cited by

4

Advances in artificial intelligence-envisioned technologies for protein and nucleic acid research
Amol D. Gholap, Abdelwahab Omri

Drug Discovery Today, Year: 2025, Issue: unknown, p. 104362

Published: April 1, 2025

Artificial intelligence (AI) and machine learning (ML) have revolutionized pharmaceutical research, particularly in protein and nucleic acid studies. This review summarizes the current status of AI and ML applications in the pharmaceutical sector, focusing on innovative tools, web servers, and databases. The paper highlights how these technologies address key challenges in drug development, including high costs, lengthy timelines, and the complexity of biological systems. Furthermore, the potential of AI in personalized medicine, cancer drug response prediction, and biomarker identification is discussed. The integration of AI and ML in pharmaceutical research promises to accelerate drug discovery, reduce costs, and ultimately lead to more effective therapeutic strategies.

Language: English

Cited by

0

pyRBDome: a comprehensive computational platform for enhancing RNA-binding proteome data
Liang‐Cui Chu, Niki Christopoulou, Hugh McCaughan, et al.

Life Science Alliance, Year: 2024, Issue: 7(10), p. e202402787

Published: July 30, 2024

High-throughput proteomics approaches have revolutionised the identification of RNA-binding proteins (RBPome) and RNA-binding sequences (RBDome) across organisms. Yet, the extent of noise, including false positives, associated with these methodologies is difficult to quantify, as experimental approaches for validating the results are generally low throughput. To address this, we introduce pyRBDome, a pipeline for enhancing RNA-binding proteome data in silico. It aligns the experimental results with RNA-binding site (RBS) predictions from distinct machine-learning tools and integrates high-resolution structural data when available. Its statistical evaluation of RBDome data enables quick identification of likely genuine RNA-binders in experimental datasets. Furthermore, by leveraging the pyRBDome results, we have enhanced the sensitivity and specificity of RBS detection through training new ensemble machine-learning models. pyRBDome analysis of a human RBDome dataset, compared with known structural data, revealed that although UV–cross-linked amino acids were more likely to contain predicted RBSs, they infrequently bind RNA in high-resolution structures. This discrepancy underscores the limitations of structural data as benchmarks, positioning pyRBDome as a valuable alternative for increasing confidence in RBDome datasets.

Language: English

Cited by

2

Recent Advances in Computational Prediction of Secondary and Supersecondary Structures from Protein Sequences
Jian Zhang, Jingjing Qian, Quan Zou, et al.

Methods in Molecular Biology, Year: 2024, Issue: unknown, pp. 1 - 19

Published: Nov. 14, 2024

Language: English

Cited by

2

CACO: A Core-Attachment Method With Cross-Species Functional Ortholog Information to Detect Human Protein Complexes
Wenkang Wang, Xiangmao Meng, Ju Xiang, et al.

IEEE Journal of Biomedical and Health Informatics, Year: 2023, Issue: 27(9), pp. 4569 - 4578

Published: July 3, 2023

Protein complexes play an essential role in living cells. Detecting protein complexes is crucial to understand protein functions and treat complex diseases. Due to the high time and resource consumption of experimental approaches, many computational approaches have been proposed to detect protein complexes. However, most of them are based only on protein-protein interaction (PPI) networks, which suffer heavily from the noise in PPI networks. Therefore, we propose a novel core-attachment method, named CACO, to detect human protein complexes by integrating the functional information from other species via protein ortholog relations. First, CACO constructs a cross-species ortholog relation matrix and transfers GO terms from other species as a reference to evaluate the confidence of PPIs. Then, a PPI filter strategy is adopted to clean the PPI network, and thus a weighted clean PPI network is constructed. Finally, a new effective core-attachment algorithm is proposed to detect protein complexes from the weighted PPI network. Compared to thirteen other state-of-the-art methods, CACO outperforms all of them in terms of F-measure and Composite Score, showing that integrating ortholog information is effective in detecting protein complexes.
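The weighting and filtering steps described in the abstract can be sketched roughly as follows. This is an illustrative simplification under stated assumptions, not CACO itself: scoring a PPI by the Jaccard overlap of the two proteins' GO term sets, the 0.2 cutoff, the function names, and the placeholder protein and GO identifiers are all hypothetical, and the core-attachment step is not shown.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weight_and_filter(ppis, go_terms, min_confidence=0.2):
    """Score each PPI by GO term overlap and drop low-confidence edges.

    ppis: iterable of (protein, protein) pairs.
    go_terms: dict mapping a protein to its set of GO terms, assumed
    here to already include terms transferred from orthologs.
    Returns a dict {(u, v): weight} forming the weighted, cleaned network.
    """
    weighted = {}
    for u, v in ppis:
        score = jaccard(go_terms.get(u, set()), go_terms.get(v, set()))
        if score >= min_confidence:
            weighted[(u, v)] = score
    return weighted


# Placeholder identifiers, not real proteins or GO accessions.
go_terms = {"P1": {"GO:1", "GO:2"}, "P2": {"GO:2", "GO:3"}, "P3": {"GO:9"}}
ppis = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]
network = weight_and_filter(ppis, go_terms)
```

In this toy input only the P1-P2 edge survives, because the other two pairs share no GO terms; a complex-detection algorithm would then run on the surviving weighted edges.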

Language: English

Cited by

4

Protein–protein and protein–nucleic acid binding site prediction via interpretable hierarchical geometric deep learning
Shizhuo Zhang, Jiyun Han, Juntao Liu, et al.

GigaScience, Year: 2024, Issue: 13

Published: Jan. 1, 2024

Identification of protein-protein and protein-nucleic acid binding sites provides insights into biological processes related to protein functions and technical guidance for disease diagnosis and drug design. However, accurate predictions by computational approaches remain highly challenging due to the limited knowledge of residue binding patterns. The binding pattern of a residue should be characterized by the spatial distribution of its neighboring residues combined with their physicochemical information interaction, which cannot yet be achieved by previous methods. Here, we design GraphRBF, a hierarchical geometric deep learning model to learn residue binding patterns from big data. To achieve it, GraphRBF describes physicochemical information interactions by designing an enhanced graph neural network and characterizes residue spatial distributions by introducing a prioritized radial basis function neural network. After training and testing, GraphRBF shows great improvements over existing state-of-the-art methods and strong interpretability of its learned representations. Applying GraphRBF to the SARS-CoV-2 omicron spike protein, it successfully identifies known epitopes of the protein. Moreover, it predicts multiple potential binding regions for new nanobodies or even new drugs with strong evidence. A user-friendly online server for GraphRBF is freely available at http://liulab.top/GraphRBF/server.
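To make the idea of characterizing the spatial distribution of neighboring residues with radial basis functions more concrete, here is a generic Gaussian RBF sketch. It is not the prioritized RBF network of GraphRBF: the function name `rbf_neighbor_profile`, the basis centers, the width `gamma`, and the 12 Å cutoff are assumptions chosen only for illustration.

```python
import math


def rbf_neighbor_profile(center, neighbors, centers=(2.0, 6.0, 10.0),
                         gamma=0.5, cutoff=12.0):
    """Summarize neighbor distances around one residue with Gaussian RBFs.

    center: (x, y, z) coordinates of the residue of interest.
    neighbors: iterable of (x, y, z) coordinates of other residues.
    Each basis function exp(-gamma * (d - mu)^2) responds to neighbors
    whose distance d is close to its center mu; summing over neighbors
    gives a fixed-length descriptor of the local spatial distribution.
    """
    profile = [0.0] * len(centers)
    for point in neighbors:
        d = math.dist(center, point)
        if d > cutoff:
            continue  # ignore residues outside the local neighborhood
        for k, mu in enumerate(centers):
            profile[k] += math.exp(-gamma * (d - mu) ** 2)
    return profile


# Toy coordinates: neighbors at 2 and 6 distance units, one beyond the cutoff.
profile = rbf_neighbor_profile(
    (0.0, 0.0, 0.0),
    [(2.0, 0.0, 0.0), (0.0, 6.0, 0.0), (20.0, 0.0, 0.0)],
)
```

The resulting vector has one entry per basis center, so residues with different numbers of neighbors still map to descriptors of the same length, which is what makes such features convenient as neural network input.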

Language: English

Cited by

1

MERIT: Accurate prediction of multi ligand-binding residues with hybrid deep transformer network, evolutionary couplings and transfer learning
Jian Zhang, Sushmita Basu, Fuhao Zhang, et al.

Journal of Molecular Biology, Year: 2024, Issue: unknown, p. 168872

Published: Nov. 1, 2024

Language: English

Cited by

1

A comprehensive review of protein-centric predictors for biomolecular interactions: from proteins to nucleic acids and beyond
Pengzhen Jia, Fuhao Zhang, Chaojin Wu, et al.

Briefings in Bioinformatics, Year: 2024, Issue: 25(3)

Published: March 27, 2024

Proteins interact with diverse ligands to perform a large number of biological functions, such as gene expression and signal transduction. Accurate identification of these protein–ligand interactions is crucial to the understanding of molecular mechanisms and the development of new drugs. However, traditional biological experiments are time-consuming and expensive. With the development of high-throughput technologies, an increasing amount of protein data is available. In the past decades, many computational methods have been developed to predict protein–ligand interactions. Here, we review a comprehensive set of over 160 protein–ligand interaction predictors, which cover protein–protein, protein−nucleic acid, protein−peptide and protein−other ligand (nucleotide, heme, ion) interactions. We carried out a comprehensive analysis of the above four types of predictors from several significant perspectives, including their inputs, feature profiles, models, availability, etc. The current methods primarily rely on protein sequences, especially utilizing evolutionary information. The significant improvement in predictions is attributed to deep learning methods. Additionally, sequence-based pretrained models and structure-based approaches are emerging trends.

Language: English

Cited by

0