Biophysical Journal, Year: 2024, Volume: 123(17), pp. 2647-2657
Published: Jan. 30, 2024
Language: English
bioRxiv (Cold Spring Harbor Laboratory), Year: 2023, Volume: unknown
Published: Jan. 18, 2023
Abstract: As opposed to scaling-up protein language models (PLMs), we seek improving performance via protein-specific optimization. Although the proportionality between the language model size and the richness of its learned representations is validated, we prioritize accessibility and pursue a path of data-efficient, cost-reduced, and knowledge-guided optimization. Through over twenty experiments ranging from masking, architecture, and pre-training data, we derive insights from protein-specific experimentation into building a model that interprets the language of life, optimally. We present Ankh, the first general-purpose PLM trained on Google’s TPU-v4 surpassing the state-of-the-art performance with fewer parameters (<10% for pre-training, <7% for inference, and <30% for the embedding dimension). We provide a representative range of structure and function benchmarks where Ankh excels. We further provide a protein variant generation analysis on High-N and One-N input data scales where Ankh succeeds in learning protein evolutionary conservation-mutation trends and introducing functional diversity while retaining key structural-functional characteristics. We dedicate our work to promoting accessibility to research innovation via attainable resources.
Language: English
Cited by: 78
Discover Artificial Intelligence, Year: 2023, Volume: 3(1)
Published: May 15, 2023
Abstract: The demand for automated customer support approaches in customer-centric environments has increased significantly in the past few years. Natural Language Processing (NLP) advancement has enabled conversational AI to comprehend human language and respond to enquiries from customers automatically, independent of the intervention of humans. Customers can now access prompt responses from NLP chatbots without interacting with human agents. This application has been implemented in numerous business sectors, including banking, manufacturing, education, law, and healthcare, among others. This study reviewed earlier studies on automating customer queries using NLP approaches. Using a systematic review methodology, 73 articles were analysed from reputable digital resources. The evaluated result offers an in-depth review of prior studies investigating the use of NLP techniques for customer service responses, and details existing studies, benefits, and potential future topics and applications. The implications of the results were discussed and recommendations made.
Language: English
Cited by: 59
Nature Methods, Year: 2024, Volume: 21(8), pp. 1444-1453
Published: Aug. 1, 2024
Language: English
Cited by: 28
Artificial Intelligence in Medicine, Year: 2024, Volume: 154, p. 102900
Published: June 5, 2024
With Artificial Intelligence (AI) increasingly permeating various aspects of society, including healthcare, the adoption of the Transformers neural network architecture is rapidly changing many applications. Transformer is a type of deep learning architecture initially developed to solve general-purpose Natural Language Processing (NLP) tasks and has subsequently been adapted in many fields, including healthcare. In this survey paper, we provide an overview of how this architecture has been adopted to analyze various forms of healthcare data, including clinical NLP, medical imaging, structured Electronic Health Records (EHR), social media, bio-physiological signals, and biomolecular sequences. Furthermore, we also include articles that used the transformer architecture for generating surgical instructions and predicting adverse outcomes after surgeries under the umbrella of critical care. Under diverse settings, these models have been used for clinical diagnosis, report generation, data reconstruction, and drug/protein synthesis. Finally, we discuss the benefits and limitations of using transformers in healthcare and examine issues such as computational cost, model interpretability, fairness, alignment with human values, ethical implications, and environmental impact.
Language: English
Cited by: 19
Frontiers in Bioengineering and Biotechnology, Year: 2025, Volume: 13
Published: Jan. 21, 2025
Protein function prediction is crucial in several key areas such as bioinformatics and drug design. With the rapid progress of deep learning technology, applying protein language models has become a research focus. These models utilize the increasing amount of large-scale protein sequence data to deeply mine its intrinsic semantic information, which can effectively improve the accuracy of protein function prediction. This review comprehensively combines the current status of applying the latest protein language models in protein function prediction. It provides an exhaustive performance comparison with traditional prediction methods. Through the in-depth analysis of experimental results, the significant advantages of protein language models in enhancing the accuracy and depth of protein function prediction tasks are fully demonstrated.
Language: English
Cited by: 3
Briefings in Bioinformatics, Year: 2021, Volume: 22(6)
Published: May 5, 2021
Abstract: As the best substitute for antibiotics, antimicrobial peptides (AMPs) have important research significance. Due to the high cost and difficulty of experimental methods for identifying AMPs, more and more research is focused on using computational methods to solve this problem. Most of the existing calculation methods can identify AMPs through the sequence itself, but there is still room for improvement in recognition accuracy, and there is a problem that the constructed model cannot be universal in each dataset. The pre-training strategy has been applied to many tasks in natural language processing (NLP) and has achieved gratifying results. It also has great application prospects in the field of AMP prediction. In this paper, we apply the pre-training strategy to the model training of AMP classifiers and propose a novel recognition algorithm. Our model is constructed based on the BERT model, pre-trained with the protein data from UniProt, and then fine-tuned and evaluated on six AMP datasets with large differences. Our model is superior to the existing methods and achieves the goal of accurate identification for datasets with small sample size. We try different word segmentation methods for peptide chains and show the influence of pre-training steps and dataset balancing on the recognition effect. We find that pre-training on a large number of diverse AMP data, followed by fine-tuning on new data, is beneficial for capturing both the new data’s specific features and the common features between AMP sequences. Finally, we construct a new AMP dataset, on which we train a general AMP recognition model.
Language: English
Cited by: 77
Antibiotics, Year: 2022, Volume: 11(10), p. 1451
Published: Oct. 21, 2022
Antimicrobial resistance has become a critical global health problem due to the abuse of conventional antibiotics and the rise of multi-drug-resistant microbes. Antimicrobial peptides (AMPs) are a group of natural peptides that show promise as next-generation antibiotics due to their low toxicity to the host, broad spectrum of biological activity, including antibacterial, antifungal, antiviral, and anti-parasitic activities, and great therapeutic potential, such as anticancer, anti-inflammatory, etc. Most importantly, AMPs kill bacteria by damaging cell membranes using multiple mechanisms of action rather than targeting a single molecule or pathway, making it difficult for bacterial drug resistance to develop. However, experimental approaches used to discover and design new AMPs are very expensive and time-consuming. In recent years, there has been considerable interest in using in silico methods, including traditional machine learning (ML) and deep learning (DL) approaches, for drug discovery. While there are a few papers summarizing computational AMP prediction methods, none of them focused on DL methods. In this review, we aim to survey the latest AMP prediction methods achieved by DL approaches. First, the biology background of AMPs is introduced, then various feature encoding methods used to represent the features of peptide sequences are presented. We explain the most popular DL techniques and highlight the recent works based on them to classify AMPs and design novel peptide sequences. Finally, we discuss the limitations and challenges of AMP prediction.
Language: English
Cited by: 71
Applied Intelligence, Year: 2022, Volume: 53(9), pp. 10602-10635
Published: Aug. 20, 2022
Language: English
Cited by: 70
BMC Bioinformatics, Year: 2022, Volume: 23(1)
Published: Aug. 8, 2022
Despite the immense importance of transmembrane proteins (TMP) for molecular biology and medicine, experimental 3D structures for TMPs remain about 4-5 times underrepresented compared to non-TMPs. Today's top methods such as AlphaFold2 accurately predict 3D structures for many TMPs, but annotating transmembrane regions remains a limiting step for proteome-wide predictions.
Language: English
Cited by: 57
Trends in Biochemical Sciences, Year: 2022, Volume: 48(4), pp. 345-359
Published: Dec. 9, 2022
Breakthrough methods in machine learning (ML), protein structure prediction, and novel ultrafast structural aligners are revolutionizing structural biology. Obtaining accurate models of proteins and annotating their functions on a large scale is no longer limited by time and resources. The most recent method to be top ranked by the Critical Assessment of Structure Prediction (CASP) assessment, AlphaFold 2 (AF2), is capable of building structural models with an accuracy comparable to that of experimental structures. Annotations of 3D models are keeping pace with the deposition of the structures due to advancements in protein language models (pLMs) and structural aligners that help validate these transferred annotations. In this review we describe how recent developments in ML for protein science are making large-scale structural bioinformatics available to the general scientific community.
Language: English
Cited by: 49