Machine learning in RNA structure prediction: Advances and challenges
Sicheng Zhang, Jun Li, Shi‐Jie Chen et al.

Biophysical Journal, Year: 2024, Volume: 123(17), pp. 2647-2657

Published: Jan. 30, 2024

Language: English

Ankh ☥: Optimized Protein Language Model Unlocks General-Purpose Modelling [Creative Commons]
Ahmed Elnaggar, Hazem Essam, Wafaa Salah-Eldin et al.

bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume: unknown

Published: Jan. 18, 2023

Abstract: As opposed to scaling-up protein language models (PLMs), we seek improving performance via protein-specific optimization. Although the proportionality between the language model size and the richness of its learned representations is validated, we prioritize accessibility and pursue a path of data-efficient, cost-reduced, and knowledge-guided optimization. Through over twenty experiments ranging from masking, architecture, and pre-training data, we derive insights from protein-specific experimentation into building a model that interprets the language of life, optimally. We present Ankh, the first general-purpose PLM trained on Google’s TPU-v4 surpassing the state-of-the-art performance with fewer parameters (<10% for pre-training, <7% for inference, and <30% for the embedding dimension). We provide a representative range of structure and function benchmarks where Ankh excels. We further provide a protein variant generation analysis on High-N and One-N input data scales where Ankh succeeds in learning protein evolutionary conservation-mutation trends and introducing functional diversity while retaining key structural-functional characteristics. We dedicate our work to promoting accessibility to research innovation via attainable resources.

Language: English

Cited by: 78

NLP techniques for automating responses to customer queries: a systematic review [Creative Commons]
Peter Adebowale Olujimi, Abejide Ade-Ibijola

Discover Artificial Intelligence, Year: 2023, Volume: 3(1)

Published: May 15, 2023

Abstract: The demand for automated customer support approaches in customer-centric environments has increased significantly in the past few years. Natural Language Processing (NLP) advancement has enabled conversational AI to comprehend human language and respond to enquiries from customers automatically, independent of the intervention of humans. Customers can now access prompt responses from NLP chatbots without interacting with human agents. This application has been implemented in numerous business sectors, including banking, manufacturing, education, law, and healthcare, among others. This study reviewed earlier studies on automating responses to customer queries using NLP approaches. Using a systematic review methodology, 73 articles were analysed from reputable digital resources. The evaluated result offers an in-depth review of prior studies investigating the use of NLP techniques for automating customer service responses, with details of existing studies, benefits, and potential future study topics on the use of NLP techniques for business applications. The implications of the results were discussed and recommendations made.

Language: English

Cited by: 59

Guiding questions to avoid data leakage in biological machine learning applications
Judith Bernett, David B. Blumenthal, Dominik G. Grimm et al.

Nature Methods, Year: 2024, Volume: 21(8), pp. 1444-1453

Published: Aug. 1, 2024

Language: English

Cited by: 28

Transformers and large language models in healthcare: A review [Creative Commons]
Subhash Nerella, Sabyasachi Bandyopadhyay, Jiaqing Zhang et al.

Artificial Intelligence in Medicine, Year: 2024, Volume: 154, p. 102900

Published: June 5, 2024

With Artificial Intelligence (AI) increasingly permeating various aspects of society, including healthcare, the adoption of the Transformers neural network architecture is rapidly changing many applications. Transformer is a type of deep learning architecture initially developed to solve general-purpose Natural Language Processing (NLP) tasks and has subsequently been adapted in many fields, including healthcare. In this survey paper, we provide an overview of how this architecture has been adopted to analyze various forms of healthcare data, including clinical NLP, medical imaging, structured Electronic Health Records (EHR), social media, bio-physiological signals, and biomolecular sequences. Furthermore, we also include articles that used the transformer architecture for generating surgical instructions and predicting adverse outcomes after surgeries under the umbrella of critical care. Under diverse settings, these models have been used for clinical diagnosis, report generation, data reconstruction, and drug/protein synthesis. Finally, we also discuss the benefits and limitations of using transformers in healthcare and examine issues such as computational cost, model interpretability, fairness, alignment with human values, ethical implications, and environmental impact.

Language: English

Cited by: 19

Evaluating the advancements in protein language models for encoding strategies in protein function prediction: a comprehensive review [Creative Commons]
Jiaying Chen, Jingfu Wang, Yue Hu et al.

Frontiers in Bioengineering and Biotechnology, Year: 2025, Volume: 13

Published: Jan. 21, 2025

Protein function prediction is crucial in several key areas such as bioinformatics and drug design. With the rapid progress of deep learning technology, applying protein language models has become a research focus. These models utilize the increasing amount of large-scale protein sequence data to deeply mine its intrinsic semantic information, which can effectively improve the accuracy of protein function prediction. This review comprehensively combines the current status of applying the latest protein language models in protein function prediction. It provides an exhaustive performance comparison with traditional prediction methods. Through the in-depth analysis of experimental results, the significant advantages of protein language models in enhancing the accuracy and depth of protein function prediction tasks are fully demonstrated.

Language: English

Cited by: 3

A novel antibacterial peptide recognition algorithm based on BERT
Yue Zhang, Jianyuan Lin, L.M. Zhao et al.

Briefings in Bioinformatics, Year: 2021, Volume: 22(6)

Published: May 5, 2021

Abstract: As the best substitute for antibiotics, antimicrobial peptides (AMPs) have important research significance. Due to the high cost and difficulty of experimental methods for identifying AMPs, more and more research is focused on using computational methods to solve this problem. Most of the existing calculation methods can identify AMPs through the sequence itself, but there is still room for improvement in recognition accuracy, and there is a problem that the constructed model cannot be universal in each dataset. The pre-training strategy has been applied to many tasks in natural language processing (NLP) and has achieved gratifying results. It also has great application prospects in the field of AMP prediction. In this paper, we apply the pre-training strategy to the model training of AMP classifiers and propose a novel recognition algorithm. Our model is constructed based on the BERT model, pre-trained with the protein data from UniProt, and then fine-tuned and evaluated on six AMP datasets with large differences. Our model is superior to the existing methods and achieves the goal of accurate identification of datasets with small sample size. We try different word segmentation methods for peptide chains and prove the influence of pre-training steps and balancing datasets on the recognition effect. We find that pre-training on a large number of diverse AMP data, followed by fine-tuning on new data, is beneficial for capturing both the new data’s specific features and the common features between AMP sequences. Finally, we construct a new AMP dataset, on which we train a general AMP recognition model.

Language: English

Cited by: 77

Recent Progress in the Discovery and Design of Antimicrobial Peptides Using Traditional Machine Learning and Deep Learning [Creative Commons]
Jielu Yan, Jianxiu Cai, Bob Zhang et al.

Antibiotics, Year: 2022, Volume: 11(10), p. 1451

Published: Oct. 21, 2022

Antimicrobial resistance has become a critical global health problem due to the abuse of conventional antibiotics and the rise of multi-drug-resistant microbes. Antimicrobial peptides (AMPs) are a group of natural peptides that show promise as next-generation antibiotics due to their low toxicity to the host, broad spectrum of biological activity, including antibacterial, antifungal, antiviral, and anti-parasitic activities, and great therapeutic potential, such as anticancer, anti-inflammatory, etc. Most importantly, AMPs kill bacteria by damaging cell membranes using multiple mechanisms of action rather than targeting a single molecule or pathway, making it difficult for bacterial drug resistance to develop. However, experimental approaches used to discover and design new AMPs are very expensive and time-consuming. In recent years, there has been considerable interest in using in silico methods, including traditional machine learning (ML) and deep learning (DL) approaches, for drug discovery. While there are a few papers summarizing computational AMP prediction methods, none of them focused on DL methods. In this review, we aim to survey the latest AMP prediction methods achieved by DL approaches. First, the biology background of AMPs is introduced, then various feature encoding methods used to represent the features of peptide sequences are presented. We explain the most popular DL techniques and highlight the recent works based on them to classify AMPs and design novel peptide sequences. Finally, we discuss the limitations and challenges of AMP prediction.

Language: English

Cited by: 71

Transformer models used for text-based question answering systems
Khalid Nassiri, Moulay A. Akhloufi

Applied Intelligence, Year: 2022, Volume: 53(9), pp. 10602-10635

Published: Aug. 20, 2022

Language: English

Cited by: 70

TMbed: transmembrane proteins predicted through language model embeddings [Creative Commons]
Michael Bernhofer, Burkhard Rost

BMC Bioinformatics, Year: 2022, Volume: 23(1)

Published: Aug. 8, 2022

Despite the immense importance of transmembrane proteins (TMP) for molecular biology and medicine, experimental 3D structures for TMPs remain about 4-5 times underrepresented compared to non-TMPs. Today's top methods such as AlphaFold2 accurately predict 3D structures for many TMPs, but annotating transmembrane regions remains a limiting step for proteome-wide predictions.

Language: English

Cited by: 57

Novel machine learning approaches revolutionize protein knowledge [Creative Commons]
Nicola Bordin, Christian Dallago, Michael Heinzinger et al.

Trends in Biochemical Sciences, Year: 2022, Volume: 48(4), pp. 345-359

Published: Dec. 9, 2022

Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.

Language: English

Cited by: 49