Prediction of Hemolytic Peptides and their Hemolytic Concentration (HC50)
Anand Singh Rathore, Nishant Kumar, Shubham Choudhury

et al.

bioRxiv (Cold Spring Harbor Laboratory), Journal Year: 2024, Volume and Issue: unknown

Published: July 24, 2024

Several peptide-based drugs fail in clinical trials due to their toxicity or hemolytic activity against red blood cells (RBCs). Existing methods predict hemolytic peptides but not the concentration (HC50) required to lyse 50% of RBCs. In this study, we developed a classification model and a regression model to identify and quantify the hemolytic activity of peptides. Our models were trained and validated on 1924 peptides with experimentally determined HC50 against mammalian RBCs. Analysis indicates that hydrophobic and positively charged residues were associated with higher hemolytic activity. Our classification models achieved a maximum AUC of 0.909 using a hybrid model of ESM-2 and a motif-based approach. Regression models using compositional features achieved an R of 0.739 with an R² of 0.543. Our models outperform existing methods and are implemented in the web-based platform HemoPI2 and in standalone software for designing hemolytic peptides with desired HC50 values (http://webs.iiitd.edu.in/raghava/hemopi2/).

Language: English

Multimodal Large Language Models in Healthcare: Applications, Challenges, and Future Outlook (Preprint)
Rawan AlSaad, Alaa Abd‐Alrazaq, Sabri Boughorbel

et al.

Journal of Medical Internet Research, Journal Year: 2024, Volume and Issue: 26, P. e59505 - e59505

Published: Aug. 20, 2024

In the complex and multidimensional field of medicine, multimodal data are prevalent and crucial for informed clinical decisions. Multimodal data span a broad spectrum of data types, including medical images (eg, MRI and CT scans), time-series data (eg, sensor data from wearable devices and electronic health records), audio recordings (eg, heart and respiratory sounds and patient interviews), text (eg, clinical notes and research articles), videos (eg, surgical procedures), and omics data (eg, genomics and proteomics). While advancements in large language models (LLMs) have enabled new applications for knowledge retrieval and processing in the medical field, most LLMs remain limited to processing unimodal data, typically text-based content, and often overlook the importance of integrating the diverse data modalities encountered in clinical practice. This paper aims to present a detailed, practical, and solution-oriented perspective on the use of multimodal LLMs (M-LLMs) in the medical field. Our investigation spanned M-LLM foundational principles, current and potential applications, technical and ethical challenges, and future directions. By connecting these elements, we aimed to provide a comprehensive framework that links diverse aspects of M-LLMs, offering a unified vision for their future in health care. This approach aims to guide both future research and practical implementations of M-LLMs in health care, positioning them as a paradigm shift toward integrated, multimodal data-driven medical practice. We anticipate that this work will spark further discussion and inspire the development of innovative approaches in the next generation of medical M-LLM systems.

Language: English

Citations

27

From Genotype to Phenotype: Raman Spectroscopy and Machine Learning for Label-Free Single-Cell Analysis
Yirui Zhang, Kai Chang, Babatunde Ogunlade

et al.

ACS Nano, Journal Year: 2024, Volume and Issue: 18(28), P. 18101 - 18117

Published: July 1, 2024

Raman spectroscopy has made significant progress in biosensing and clinical research. Here, we describe how surface-enhanced Raman spectroscopy (SERS) assisted with machine learning (ML) can expand its capabilities to enable interpretable insights into the transcriptome, proteome, and metabolome at the single-cell level. We first review how advances in nanophotonics (including plasmonics, metamaterials, and metasurfaces) enhance Raman scattering for rapid, strong label-free spectroscopy. We then discuss ML approaches for precise and interpretable spectral analysis, including neural networks, perturbation and gradient algorithms, and transfer learning. We provide illustrative examples of single-cell Raman phenotyping using nanophotonics and ML, including bacterial antibiotic susceptibility predictions, stem cell expression profiles, cancer diagnostics, and immunotherapy efficacy and toxicity predictions. Lastly, we discuss exciting prospects for the future of single-cell Raman spectroscopy, including Raman instrumentation, self-driving laboratories, Raman data banks, and machine learning for uncovering biological insights.

Language: English

Citations

17

The role of large language models in medical genetics
Rona Merdler-Rabinowicz, Mahmud Omar, Jaya Ganesh

et al.

Molecular Genetics and Metabolism, Journal Year: 2025, Volume and Issue: unknown, P. 109098 - 109098

Published: March 1, 2025

Language: English

Citations

0

Comprehensive benchmarking of large language models for RNA secondary structure prediction
Luciano I Zablocki, Leandro A. Bugnon, M. Gérard

et al.

Briefings in Bioinformatics, Journal Year: 2025, Volume and Issue: 26(2)

Published: March 1, 2025

In recent years, inspired by the success of large language models (LLMs) for DNA and proteins, several LLMs for RNA have also been developed. These models take massive RNA datasets as inputs and learn, in a self-supervised way, how to represent each RNA base with a semantically rich numerical vector. This is done under the hypothesis that obtaining high-quality RNA representations can enhance data-costly downstream tasks, such as the fundamental RNA secondary structure prediction problem. However, existing RNA-LLMs have not been evaluated for this task in a unified experimental setup. Since they are pretrained models, assessment of their generalization capabilities on new structures is a crucial aspect. Nonetheless, this has been just partially addressed in the literature. In this work we present a comprehensive experimental and comparative analysis of pretrained RNA-LLMs that were recently proposed. We evaluate the use of these representations for the secondary structure prediction task with a common deep learning architecture. The RNA-LLMs were assessed with increasing generalization difficulty on benchmark datasets. Results showed that two LLMs clearly outperform the other models, and revealed significant challenges for generalization in low-homology scenarios. Moreover, in this study we provide curated benchmark datasets of increasing complexity and a unified experimental setup for this scientific endeavor. Source code and curated benchmark datasets are available in the repository: https://github.com/sinc-lab/rna-llm-folding/.

Language: English

Citations

0

AI-assisted evidence screening method for systematic reviews in environmental research: integrating ChatGPT with domain knowledge
Chen Zuo, Xiaohao Yang, Josh Errickson

et al.

Environmental Evidence, Journal Year: 2025, Volume and Issue: 14(1)

Published: April 15, 2025

Systematic reviews (SRs) in environmental science are challenging due to diverse methodologies, terminologies, and study designs across disciplines. A major limitation is that inconsistent application of eligibility criteria in evidence screening affects the reproducibility and transparency of SRs. To explore the potential role of Artificial Intelligence (AI) in applying eligibility criteria, we developed and evaluated an AI-assisted evidence-screening framework using a case study SR on the relationship between stream fecal coliform concentrations and land use and land cover (LULC). The SR incorporates publications from hydrology, ecology, public health, landscape, and urban planning, reflecting the interdisciplinary nature of environmental research. We fine-tuned the ChatGPT-3.5 Turbo model with expert-reviewed training data for title, abstract, and full-text screening of 120 articles. The AI model demonstrated substantial agreement at title/abstract review and moderate agreement at full-text review with expert reviewers and maintained internal consistency, suggesting its potential for structured screening assistance. The findings provide a structured framework for applying eligibility criteria consistently, improving evidence screening efficiency, reducing labor and costs, and informing the integration of large language models (LLMs) in environmental SRs. Combining AI and domain knowledge provides an exploratory step to evaluate the feasibility of AI-assisted evidence screening, especially for diverse, large-volume, and interdisciplinary bodies of environmental research studies. Additionally, AI-assisted screening has the potential to provide a structured approach for managing disagreement among researchers with diverse domain knowledge, though further validation is needed.

Language: English

Citations

0

Artificial-intelligence-driven innovations in mechanistic computational modeling and digital twins for biomedical applications
Bhanwar Lal Puniya

Journal of Molecular Biology, Journal Year: 2025, Volume and Issue: unknown, P. 169181 - 169181

Published: April 1, 2025

Language: English

Citations

0

Prediction of hemolytic peptides and their hemolytic concentration
Anand Singh Rathore, Nishant Kumar, Shubham Choudhury

et al.

Communications Biology, Journal Year: 2025, Volume and Issue: 8(1)

Published: Feb. 4, 2025

Peptide-based drugs often fail in clinical trials due to their toxicity or hemolytic activity against red blood cells (RBCs). Existing methods predict hemolytic peptides but not the concentration (HC50) required to lyse 50% of RBCs. This study develops classification and regression models to identify and quantify hemolytic activity. These models are trained on 1926 peptides with experimentally determined HC50 against mammalian RBCs. Analysis indicates that hydrophobic and positively charged residues were associated with higher hemolytic activity. Among classification models, including machine learning (ML), quantum ML, and protein language models, a hybrid model combining random forest (RF) and a motif-based approach achieves the highest area under the receiver operating characteristic curve (AUROC) of 0.921. Regression models achieve a Pearson correlation coefficient (R) of 0.739 and a coefficient of determination (R²) of 0.543. These models outperform existing methods and are implemented in HemoPI2, a web-based platform and standalone software for designing peptides with desired HC50 values (http://webs.iiitd.edu.in/raghava/hemopi2/).

Language: English

Citations

0

AI-enabled language models (LMs) to large language models (LLMs) and multimodal large language models (MLLMs) in drug discovery and development
Chiranjib Chakraborty, Manojit Bhattacharya, Soumen Pal

et al.

Journal of Advanced Research, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 1, 2025

Language: English

Citations

0

UniMap: Type‐Level Integration Enhances Biological Preservation and Interpretability in Single‐Cell Annotation
Haitao Hu, Yue Guo, Fujing Ge

et al.

Advanced Science, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 27, 2025

Integrating single‐cell datasets from multiple studies provides a cost‐effective way to build comprehensive cell atlases, granting deeper insights into cellular characteristics across diverse biological systems. However, current data integration methods struggle with interference in partially overlapping datasets and with varying annotation granularities. Here, a multiselective adversarial network is introduced for the first time to present UniMap, which functions as a “discerner” to identify and exclude interfering cells from various data sources during dataset integration. Compared with other state‐of‐the‐art methods, UniMap emphasizes type‐level integration and proves to be the best model for preserving biological variability, achieving noticeably higher accuracy in automated annotation under various circumstances. Additionally, it enhances interpretability by revealing shared and domain‐specific cell types and providing prediction confidence. The efficacy of UniMap is demonstrated in terms of identifying new cell types, creating high‐resolution cell atlases, annotating cells along developmental trajectories, and performing cross‐species analysis, underscoring its potential as a robust tool for single‐cell research.

Language: English

Citations

0

A KAN-based hybrid deep neural networks for accurate identification of transcription factor binding sites
Guodong He, Jiahao Ye, Huijun Hao

et al.

PLoS ONE, Journal Year: 2025, Volume and Issue: 20(5), P. e0322978 - e0322978

Published: May 7, 2025

Background: Predicting protein-DNA binding sites in vivo is a challenging but urgent task in many fields such as drug design and development. Most promoters contain many transcription factor (TF) binding sites, yet only a few have been identified through time-consuming biochemical experiments. To address this challenge, numerous computational approaches have been proposed to predict TF binding sites from DNA sequences. However, current deep learning methods often face issues such as gradient vanishing as the model depth increases, leading to suboptimal feature extraction. Results: We propose a model called CBR-KAN (where C represents Convolutional Neural Network (CNN), B represents Bidirectional Long Short Term Memory (BiLSTM), and R represents Residual Mechanism) to predict transcription factor binding sites. Specifically, we designed a multi-scale convolution module (ConvBlock1, 2, 3) combined with a BiLSTM network, introduced a KAN network to replace the traditional multilayer perceptron, and promoted model optimization through residual connections. Testing on 50 common ChIP-seq benchmark datasets shows that CBR-KAN outperforms other state-of-the-art methods such as DeepBind, DanQ, DeepD2V, and DeepSEA in predicting TF binding sites. Conclusions: The CBR-KAN model significantly improves prediction accuracy for transcription factor binding sites by effectively integrating multiple neural network architectures and mechanisms. This approach not only enhances feature extraction but also stabilizes training and boosts generalization capabilities. The promising results on key performance indicators demonstrate the potential of CBR-KAN in bioinformatics applications.

Language: English

Citations

0