Recent Developments in Ultralarge and Structure-Based Virtual Screening Approaches
Christoph Gorgulla

Annual Review of Biomedical Data Science, Journal Year: 2023, Volume and Issue: 6(1), P. 229 - 258

Published: May 23, 2023

Drug development is a wide scientific field that faces many challenges these days. Among them are extremely high development costs, long development times, and the small number of new drugs approved each year. New and innovative technologies are needed to solve these problems: technologies that make the drug discovery process for small molecules more time and cost efficient, and that allow previously undruggable receptor classes, such as protein–protein interactions, to be targeted. Structure-based virtual screenings (SBVSs) have become a leading contender in this context. In this review, we give an introduction to the foundations of SBVSs and survey their progress in the past few years, with a focus on ultralarge virtual screenings (ULVSs). We outline key principles of SBVSs, recent success stories, new screening techniques, available deep learning–based docking methods, and promising future research directions. ULVSs have enormous potential for the development of new small-molecule drugs and are already starting to transform early-stage drug discovery.

Language: English

AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences
Mihály Váradi, Damian Bertoni, Paulyna Magaña et al.

Nucleic Acids Research, Journal Year: 2023, Volume and Issue: 52(D1), P. D368 - D375

Published: Nov. 2, 2023

The AlphaFold Protein Structure Database (AlphaFold DB, https://alphafold.ebi.ac.uk) has significantly impacted structural biology by amassing over 214 million predicted protein structures, expanding from the initial 300k structures released in 2021. Enabled by the groundbreaking AlphaFold2 artificial intelligence (AI) system, the predictions archived in AlphaFold DB have been integrated into primary data resources such as PDB, UniProt, Ensembl, InterPro and MobiDB. Our manuscript details subsequent enhancements in data archiving, covering successive releases encompassing model organisms, global health proteomes, Swiss-Prot integration, and a host of curated protein datasets. We detail the data access mechanisms of AlphaFold DB, from direct file access via FTP to advanced queries using Google Cloud Public Datasets and the programmatic access endpoints of the database. We also discuss the improvements and services added since its initial release, including enhancements to the Predicted Aligned Error viewer, customisation options for the 3D viewer, and improvements in the search engine of AlphaFold DB.

Language: English

Citations

609

AlphaFold2 and its applications in the fields of biology and medicine
Zhenyu Yang, Xiaoxi Zeng, Yi Zhao et al.

Signal Transduction and Targeted Therapy, Journal Year: 2023, Volume and Issue: 8(1)

Published: March 14, 2023

AlphaFold2 (AF2) is an artificial intelligence (AI) system developed by DeepMind that can predict three-dimensional (3D) structures of proteins from amino acid sequences with atomic-level accuracy. Protein structure prediction is one of the most challenging problems in computational biology and chemistry, and has puzzled scientists for 50 years. The advent of AF2 represents unprecedented progress in protein structure prediction and has attracted much attention. The subsequent release of structures of more than 200 million proteins predicted by AF2 further aroused great enthusiasm in the science community, especially in the fields of biology and medicine. AF2 is thought to have a significant impact on structural biology and on research areas that need protein structure information, such as drug discovery, protein design, and prediction of protein function, among others. Though not much time has passed since AF2 was developed, there are already quite a few application studies of AF2 in the fields of biology and medicine, many of them having preliminarily proved the potential of AF2. To better understand AF2 and promote its applications, in this article we summarize the principle and system architecture of AF2 as well as the recipe of its success, with a particular focus on reviewing its applications in the fields of biology and medicine. Limitations of current AF2 prediction are also discussed.

Language: English

Citations

267

Before and after AlphaFold2: An overview of protein structure prediction
Letícia M. F. Bertoline, Angélica N. Lima, José Eduardo Krieger et al.

Frontiers in Bioinformatics, Journal Year: 2023, Volume and Issue: 3

Published: Feb. 28, 2023

Three-dimensional protein structure is directly correlated with its function, and its determination is critical to understanding biological processes and addressing human health and life science problems in general. Although new protein structures are experimentally obtained over time, there is still a large difference between the number of protein sequences placed in Uniprot and those with resolved tertiary structure. In this context, studies have emerged to predict protein structures by methods based on a template or on free modeling. In the last years, different methods have been combined to overcome their individual limitations, until the emergence of AlphaFold2, which demonstrated that predicting protein structure with high accuracy at unprecedented scale is possible. Despite its current impact in the field, AlphaFold2 has limitations. Recently, new methods based on protein language models have promised to revolutionize protein structural biology, allowing the discovery of protein structure and function only from evolutionary patterns present in the protein sequence. Even though these methods do not reach AlphaFold2 accuracy, they have already covered some of its limitations, being able to predict more than 200 million proteins from metagenomic databases. In this mini-review, we provide an overview of the breakthroughs in protein structure prediction before and after the emergence of AlphaFold2.

Language: English

Citations

157

Prediction of effector protein structures from fungal phytopathogens enables evolutionary analyses
Kyungyong Seong, Ksenia V. Krasileva

Nature Microbiology, Journal Year: 2023, Volume and Issue: 8(1), P. 174 - 187

Published: Jan. 5, 2023

Elucidating the similarity and diversity of pathogen effectors is critical to understand their evolution across fungal phytopathogens. However, rapid divergence that diminishes sequence similarities between putatively homologous effectors has largely concealed the roots of effector evolution. Here we modelled the structures of 26,653 secreted proteins from 14 agriculturally important fungal phytopathogens, six non-pathogenic fungi and one oomycete with AlphaFold 2. With 18,000 successfully predicted folds, we performed structure-guided comparative analyses on two aspects of effector evolution: uniquely expanded sequence-unrelated structurally similar (SUSS) effector families and common folds present across the fungal species. Extreme expansion of lineage-specific SUSS effector families was found only in several obligate biotrophs, Blumeria graminis and Puccinia graminis. The highly expanded effector families were the source of conserved sequence motifs, such as the Y/F/WxC motif. We identified new classes of SUSS effector families that include known virulence factors, such as AvrSr35, AvrSr50 and Tin2. Structural comparisons revealed that the expanded structural folds further diversify through domain duplications and fusion with disordered stretches. Putatively sub- and neo-functionalized SUSS effectors could reconverge on regulation, expanding the functional pools of effectors in the pathogen infection cycle. We also found evidence that many effector families could have originated from ancestral folds conserved in fungi. Collectively, our study highlights diverse effector evolution mechanisms and supports divergent evolution as a major force in driving SUSS effector evolution from ancestral proteins.

Language: English

Citations

128

BepiPred‐3.0: Improved B‐cell epitope prediction using protein language models
Joakim Nøddeskov Clifford, Magnus Haraldson Høie, Sebastian Deleuran et al.

Protein Science, Journal Year: 2022, Volume and Issue: 31(12)

Published: Nov. 11, 2022

B-cell epitope prediction tools are of great medical and commercial interest due to their practical applications in vaccine development and disease diagnostics. The introduction of protein language models (LMs), trained on unprecedentedly large datasets of protein sequences and structures, taps into a powerful numeric representation that can be exploited to accurately predict local and global protein structural features from amino acid sequences only. In this paper, we present BepiPred-3.0, a sequence-based epitope prediction tool that, by exploiting LM embeddings, greatly improves the prediction accuracy for both linear and conformational epitope prediction on several independent test sets. Furthermore, by carefully selecting additional input variables and the epitope residue annotation strategy, performance was further improved, thus achieving unprecedented predictive power. Our tool can predict epitopes across hundreds of sequences in minutes. It is freely available as a web server and a standalone package at https://services.healthtech.dtu.dk/service.php?BepiPred-3.0 with a user-friendly interface to navigate the results.

Language: English

Citations

124

Computational and artificial intelligence-based methods for antibody development
Ji‐Sun Kim, Matthew McFee, Qiao Fang et al.

Trends in Pharmacological Sciences, Journal Year: 2023, Volume and Issue: 44(3), P. 175 - 189

Published: Jan. 18, 2023

Due to their high target specificity and binding affinity, therapeutic antibodies are currently the largest class of biotherapeutics. The traditional, largely empirical antibody development process is, while mature and robust, cumbersome and has significant limitations. Substantial recent advances in computational and artificial intelligence (AI) technologies are now starting to overcome many of these limitations and are increasingly integrated into development pipelines. Here, we provide an overview of AI methods relevant for antibody development, including databases, computational predictors of antibody properties and structure, and computational antibody design methods, with an emphasis on machine learning (ML) models and on the design of complementarity-determining region (CDR) loops, the antibody structural components critical for binding.

Language: English

Citations

96

Metagenomics: An Effective Approach for Exploring Microbial Diversity and Functions
Nguyễn Nhật Nam, Hoang Dang Khoa, Kieu The Loan Trinh et al.

Foods, Journal Year: 2023, Volume and Issue: 12(11), P. 2140

Published: May 25, 2023

Various fields have been identified in the "omics" era, such as genomics, proteomics, transcriptomics, metabolomics, phenomics, and metagenomics. Among these, metagenomics has enabled a significant increase in discoveries related to the microbial world. Newly discovered microbiomes in different ecologies provide meaningful information on the diversity and functions of microorganisms on the Earth. Therefore, the results of metagenomic studies have enabled new microbe-based applications in human health, agriculture, and the food industry, among others. This review summarizes the fundamental procedures and recent advances in bioinformatic tools. It also explores up-to-date applications of metagenomics in human health, food study, plant research, environmental sciences, and other fields. Finally, metagenomics is a powerful tool for studying the microbial world, and it still has numerous applications that are currently hidden and awaiting discovery. Therefore, this review also discusses the future perspectives of metagenomics.

Language: English

Citations

71

Benchmarking Refined and Unrefined AlphaFold2 Structures for Hit Discovery
Yuqi Zhang, Márton Vass, Da Shi et al.

Journal of Chemical Information and Modeling, Journal Year: 2023, Volume and Issue: 63(6), P. 1656 - 1667

Published: March 10, 2023

The recently developed AlphaFold2 (AF2) algorithm predicts proteins' 3D structures from amino acid sequences. The open AlphaFold protein structure database covers the complete human proteome. Using an industry-leading molecular docking method (Glide), we investigated the virtual screening performance of 37 common drug targets, each with an AF2 structure and known holo and apo structures from the DUD-E data set. In a subset of 27 targets where the AF2 structures are suitable for refinement, the AF2 structures show comparable early enrichment of known active compounds (avg. EF 1%: 13.0) to apo structures (avg. EF 1%: 11.4) while falling behind the early enrichment of the holo structures (avg. EF 1%: 24.2). With an induced-fit protocol (IFD-MD), we can refine the AF2 structures using an aligned known binding ligand as the template to improve the performance in structure-based virtual screening (avg. EF 1%: 18.9). Glide-generated docking poses of known binding ligands can also be used as templates for IFD-MD, achieving similar improvements (avg. EF 1%: 18.0). Thus, with proper preparation and refinement, AF2 structures show considerable promise for in silico hit identification.
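The "EF 1%" figures quoted above are enrichment factors. Under the commonly used definition, this is the fraction of known actives among the top-ranked 1% of a screened library divided by the fraction of actives in the whole library. The sketch below illustrates that definition only and is not the authors' evaluation code. The function name `enrichment_factor` and its parameters are hypothetical, and it assumes that a lower docking score means a better rank.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor at `fraction` (hypothetical helper, not from the paper).

    EF = (actives in top fraction / size of top fraction)
         / (actives in library / size of library)

    Assumption: lower docking score = better-ranked compound.
    """
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equally long")
    n_total = len(scores)
    n_actives = sum(1 for flag in is_active if flag)
    if n_actives == 0:
        raise ValueError("the library contains no actives")

    # Size of the top-ranked slice, at least one compound.
    n_top = max(1, round(fraction * n_total))

    # Rank compounds from best (lowest) to worst (highest) score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    top_actives = sum(1 for _, flag in ranked[:n_top] if flag)

    return (top_actives / n_top) / (n_actives / n_total)
```

For example, if a library of 1,000 compounds contains 20 actives and 3 of them land in the top 10 ranks, the hit rate in the top 1% is 0.3 against an overall hit rate of 0.02, giving an EF 1% of about 15.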

Language: English

Citations

69

xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein
Bo Chen, Xingyi Cheng, Li Pan et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2023, Volume and Issue: unknown

Published: July 6, 2023

Protein language models have shown remarkable success in learning biological information from protein sequences. However, most existing models are limited by either autoencoding or autoregressive pre-training objectives, which makes them struggle to handle protein understanding and generation tasks concurrently. We propose a unified protein language model, xTrimoPGLM, to address these two types of tasks simultaneously through an innovative pre-training framework. Our key technical contribution is an exploration of the compatibility and the potential for joint optimization of the two types of objectives, which has led to a strategy for training xTrimoPGLM at an unprecedented scale of 100 billion parameters and 1 trillion training tokens. Our extensive experiments reveal that 1) xTrimoPGLM significantly outperforms other advanced baselines in 18 protein understanding benchmarks across four categories. The model also facilitates an atomic-resolution view of protein structures, leading to an advanced 3D structural prediction model that surpasses existing language model-based tools. 2) xTrimoPGLM not only can generate de novo protein sequences following the principles of natural ones, but also can perform programmable generation after supervised fine-tuning (SFT) on curated sequences. These results highlight the substantial capability and versatility of xTrimoPGLM in understanding and generating protein sequences, contributing to the evolving landscape of foundation models in protein science.

Language: English

Citations

60

Biasing AlphaFold2 to predict GPCRs and kinases with user-defined functional or structural properties
Davide Sala, Peter W. Hildebrand, Jens Meiler et al.

Frontiers in Molecular Biosciences, Journal Year: 2023, Volume and Issue: 10

Published: Feb. 16, 2023

Determining the three-dimensional structure of proteins in their native functional states has been a longstanding challenge in structural biology. While integrative structural biology has been the most effective way to get a high-accuracy structure of different conformations and mechanistic insights for larger proteins, advances in deep machine-learning algorithms have paved the way to fully computational predictions. In this field, AlphaFold2 (AF2) pioneered

Language: English

Citations

56