Hyena architecture enables fast and efficient protein language modeling DOI Creative Commons
Y. T. Zhang, Bian Bian, Manabu Okumura

et al.

iMetaOmics, Journal year: 2024, Issue: unknown

Published: Dec. 7, 2024

Abstract: The emergence of self-supervised deep language models has revolutionized natural language processing tasks and has recently extended its applications to biological sequence analysis. Traditional models, primarily based on Transformer architectures, demonstrate substantial effectiveness in various applications. However, these models are inherently constrained by the attention mechanism's quadratic computational complexity, which limits their efficiency and leads to high costs. To address these limitations, we introduce ProtHyena, a novel approach that leverages the Hyena operator for protein language modeling. This methodology alternates between subquadratic long convolutions and element-wise gating operations, which circumvents the constraints imposed by attention mechanisms and reduces the computational complexity to subquadratic levels. This enables faster and more memory-efficient modeling of protein sequences. ProtHyena can achieve state-of-the-art results and comparable performance in 8 downstream tasks, including protein engineering (protein fluorescence and stability prediction), protein property prediction (neuropeptide cleavage, signal peptide, solubility, disorder, and gene function prediction), and protein structure prediction, with only 1.6 M parameters. The ProtHyena architecture represents a highly efficient solution for protein language modeling, offering a promising avenue for fast and efficient analysis of protein sequences.
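As a rough illustration of the mechanism named in this abstract, the sketch below applies one causal long convolution through an FFT (O(L log L) rather than O(L^2)) and then multiplies the result element-wise by a gate. It is not the ProtHyena implementation: the function names are hypothetical, and the filter `h` and the `gate` are plain lists here, whereas a real model would learn them from data.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def causal_long_conv(u, h):
    """y[t] = sum_{s <= t} h[s] * u[t - s], computed with zero-padded FFTs."""
    length = len(u)
    n = 1
    while n < 2 * length:  # pad so the circular convolution does not wrap
        n *= 2
    u_f = fft([complex(v) for v in u] + [0j] * (n - length))
    h_f = fft([complex(v) for v in h] + [0j] * (n - len(h)))
    y = fft([a * b for a, b in zip(u_f, h_f)], inverse=True)
    # The unnormalized inverse transform needs a division by n.
    return [y[t].real / n for t in range(length)]


def gated_long_conv(u, gate, h):
    """One step of the pattern: long convolution, then element-wise gating."""
    conv = causal_long_conv(u, h)
    return [g * c for g, c in zip(gate, conv)]


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]
    gate = [1.0, 0.0, 2.0, 1.0]
    filt = [1.0, 1.0, 0.0, 0.0]  # assumed filter as long as the sequence
    print(gated_long_conv(signal, gate, filt))
```

Stacking several such steps, each with its own filter and gate, is the general pattern the abstract refers to; the sketch assumes the filter is no longer than the input sequence.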

Language: English

Deep Learning and Neural Networks: Decision-Making Implications DOI Open Access
Hamed Taherdoost

Symmetry, Journal year: 2023, Issue: 15(9), p. 1723

Published: Sep. 8, 2023

Deep learning techniques have found applications across diverse fields, enhancing the efficiency and effectiveness of decision-making processes. The integration of these techniques underscores the significance of interdisciplinary research. In particular, decisions often rely on the output's projected value or probability from neural networks, considering different values of the relevant output factor. This review examines the impact of deep learning on decision-making systems, analyzing 25 papers published between 2017 and 2022. The review highlights improved accuracy but emphasizes the need for addressing issues like interpretability and generalizability to build reliable decision support systems. Future research directions include transparency, explainability, and real-world validation, underscoring the importance of interdisciplinary collaboration for successful implementation.

Language: English

Cited by

41

Expectation management in AI: A framework for understanding stakeholder trust and acceptance of artificial intelligence systems DOI Creative Commons
Marjorie Kinney, Maria Anastasiadou, Mijail Naranjo-Zolotov

et al.

Heliyon, Journal year: 2024, Issue: 10(7), p. e28562

Published: Mar. 25, 2024

Language: English

Cited by

14

Leveraging a meta-learning approach to advance the accuracy of Nav blocking peptides prediction DOI Creative Commons
Watshara Shoombuatong, Nutta Homdee, Nalini Schaduangrat

et al.

Scientific Reports, Journal year: 2024, Issue: 14(1)

Published: Feb. 23, 2024

Abstract: The voltage-gated sodium (Nav) channel is a crucial molecular component responsible for initiating and propagating action potentials. While the α subunit, forming the channel pore, plays a central role in this function, the complete physiological function of Nav channels relies on interactions between the α subunit and auxiliary proteins, known as protein–protein interactions (PPI). Nav blocking peptides (NaBPs) have been recognized as a promising alternative therapeutic agent for pain and itch. Although traditional experimental methods can precisely determine the effect and activity of NaBPs, they remain time-consuming and costly. Hence, machine learning (ML)-based methods that are capable of accurately contributing to the in silico prediction of NaBPs are highly desirable. In this study, we develop an innovative meta-learning-based NaBP prediction method (MetaNaBP). MetaNaBP generates new feature representations by employing a wide range of sequence-based feature descriptors that cover multiple perspectives, in combination with powerful ML algorithms. Then, these feature representations were optimized to identify informative features using a two-step feature selection method. Finally, the selected informative features were applied to develop the final meta-predictor. To the best of our knowledge, MetaNaBP is the first meta-predictor for NaBP prediction. Experimental results demonstrated that MetaNaBP achieved an accuracy of 0.948 and a Matthews correlation coefficient of 0.898 over the independent test dataset, which were 5.79% and 11.76% higher than the existing method. In addition, the discriminative power of our feature representations surpassed that of conventional feature descriptors over both the training and independent test datasets. We anticipate that MetaNaBP will be exploited for large-scale prediction and analysis to narrow down potential NaBPs.

Language: English

Cited by

12

The Role of Generative Artificial Intelligence in Digital Agri-Food DOI Creative Commons
Sakib Shahriar, Maria G. Corradini, Shayan Sharif

et al.

Journal of Agriculture and Food Research, Journal year: 2025, Issue: unknown, p. 101787

Published: Mar. 1, 2025

Language: English

Cited by

2

Using protein language models for protein interaction hot spot prediction with limited data DOI Creative Commons
Karen Sargsyan, Carmay Lim

BMC Bioinformatics, Journal year: 2024, Issue: 25(1)

Published: Mar. 16, 2024

Protein language models, inspired by the success of large language models in deciphering human language, have emerged as powerful tools for unraveling the intricate code of life inscribed within protein sequences. They have gained significant attention for their promising applications across various areas, including the sequence-based prediction of secondary and tertiary protein structure, the discovery of new functional protein sequences/folds, and the assessment of mutational impact on protein fitness. However, their utility in learning to predict protein residue properties based on scant datasets, such as protein-protein interaction (PPI)-hotspots whose mutations significantly impair PPIs, remained unclear. Here, we explore the feasibility of using protein language-learned representations as features for machine learning to predict PPI-hotspots, using a dataset containing 414 experimentally confirmed PPI-hotspots and 504 PPI-nonhot spots.

Language: English

Cited by

8

Advancing the accuracy of tyrosinase inhibitory peptides prediction via a multiview feature fusion strategy DOI Creative Commons
Watshara Shoombuatong, Nalini Schaduangrat, Nutta Homdee

et al.

Scientific Reports, Journal year: 2025, Issue: 15(1)

Published: Feb. 8, 2025

Language: English

Cited by

1

Advanced machine learning framework for enhancing breast cancer diagnostics through transcriptomic profiling DOI Creative Commons

Mohamed J. Saadh, Hanan Hassan Ahmed, Radhwan Abdul Kareem

et al.

Discover Oncology, Journal year: 2025, Issue: 16(1)

Published: Mar. 17, 2025

This study proposes an advanced machine learning (ML) framework for breast cancer diagnostics by integrating transcriptomic profiling with optimized feature selection and classification techniques. A dataset of 1759 samples (987 patients, 772 healthy controls) was analyzed using Recursive Feature Elimination, Boruta, and ElasticNet for feature selection. Dimensionality reduction techniques, including Non-Negative Matrix Factorization (NMF), Autoencoders, and transformer-based embeddings (BioBERT, DNABERT), were applied to enhance model interpretability. Classifiers such as XGBoost, LightGBM, ensemble voting, Multi-Layer Perceptron, and Stacking were trained using grid search and cross-validation. Model evaluation was conducted using accuracy, AUC, MCC, Kappa Score, ROC, and PR curves, and external validation was performed on an independent dataset of 175 samples. XGBoost and LightGBM achieved the highest test accuracies (0.91 and 0.90) and AUC values (up to 0.92), particularly with NMF and BioBERT. The ensemble Voting method exhibited the best accuracy (0.92), confirming its robustness. Transformer-based techniques significantly improved performance compared to conventional approaches like PCA and Decision Trees. The proposed ML framework enhances diagnostic interpretability, demonstrating strong generalizability on the external dataset. These findings highlight its potential for precision oncology and personalized diagnostics.

Language: English

Cited by

1

Open‐source large language models in action: A bioinformatics chatbot for PRIDE database DOI Creative Commons
Jingwen Bai, Selvakumar Kamatchinathan, Deepti J Kundu

et al.

PROTEOMICS, Journal year: 2024, Issue: unknown

Published: Mar. 31, 2024

Abstract: We here present a chatbot assistant infrastructure (https://www.ebi.ac.uk/pride/chatbot/) that simplifies user interactions with the PRIDE database's documentation and dataset search functionality. The framework utilizes multiple Large Language Models (LLMs): llama2, chatglm, mixtral (mistral), and openhermes. It also includes a web service API (Application Programming Interface), a web interface, and components for indexing and managing vector databases. An Elo-ranking system-based benchmark component is included in the framework as well, which allows evaluating the performance of each LLM and improving the PRIDE documentation. The chatbot not only allows users to interact with the PRIDE documentation but can also be used to find datasets using an LLM-based recommendation system, enabling dataset discoverability. Importantly, while our infrastructure is exemplified through its application in the PRIDE database context, the modular and adaptable nature of our approach positions it as a valuable tool for improving user experiences across a spectrum of bioinformatics and proteomics tools and resources, among other domains. The integration of advanced LLMs, innovative vector-based construction, the benchmarking framework, and optimized documentation collectively form a robust and transferable chatbot assistant infrastructure. The framework is open-source (https://github.com/PRIDE-Archive/pride-chatbot).
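The abstract mentions an Elo-ranking benchmark without giving its details, so the following is only a minimal sketch of the textbook Elo update applied to pairwise comparisons of model answers. The K-factor of 32, the 400-point scale, the starting rating of 1000, and the function names are conventional defaults and illustrative assumptions, not values taken from the PRIDE chatbot code.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return new (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's answer was preferred, 0.0 if B's was,
    and 0.5 for a tie. k=32 is an assumed K-factor.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    # Hypothetical starting ratings for two of the models named above.
    ratings = {"llama2": 1000.0, "mixtral": 1000.0}
    # Suppose a reviewer preferred the mixtral answer in one comparison.
    ratings["mixtral"], ratings["llama2"] = update_elo(
        ratings["mixtral"], ratings["llama2"], score_a=1.0
    )
    print(ratings)
```

Because each update moves the two ratings by equal and opposite amounts, the total rating across models stays constant, which makes rankings from many pairwise votes directly comparable.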

Language: English

Cited by

6

Bioinfo-Bench: A Simple Benchmark Framework for LLM Bioinformatics Skills Evaluation DOI Creative Commons
Qiyuan Chen, Cheng Deng

bioRxiv (Cold Spring Harbor Laboratory), Journal year: 2023, Issue: unknown

Published: Oct. 21, 2023

Abstract: Large Language Models (LLMs) have garnered significant recognition in the life sciences for their capacity to comprehend and utilize knowledge. The contemporary expectation in diverse industries extends beyond employing LLMs merely as chatbots; instead, there is a growing emphasis on harnessing their potential as adept analysts proficient in dissecting intricate issues within these sectors. The realm of bioinformatics is no exception to this trend. In this paper, we introduce Bioinfo-Bench, a novel yet straightforward benchmark framework suite crafted to assess the academic knowledge and data mining capabilities of foundational models in bioinformatics. Bioinfo-Bench systematically gathered data from three distinct perspectives: knowledge acquisition, analysis, and application, facilitating a comprehensive examination of LLMs. Our evaluation encompassed prominent models: ChatGPT, Llama, and Galactica. The findings revealed that these LLMs excel in knowledge acquisition, drawing heavily upon their training data for retention. However, their proficiency in addressing practical professional queries and conducting nuanced inference remains constrained. Given these insights, we are poised to delve deeper into this domain, engaging in further extensive research and discourse. It is pertinent to note that this project is currently in progress, and all associated materials will be made publicly accessible.

Language: English

Cited by

11

VF-Pred: Predicting virulence factor using sequence alignment percentage and ensemble learning models DOI
Shreya Singh, Nguyen Quoc Khanh Le, Cheng Wang

et al.

Computers in Biology and Medicine, Journal year: 2023, Issue: 168, p. 107662

Published: Nov. 3, 2023

Language: English

Cited by

10