Assessing the feasibility of ChatGPT-4o and Claude 3-Opus in thyroid nodule classification based on ultrasound images

Ziman Chen, Nonhlanhla Chambara, Chaoqun Wu, et al.

Endocrine, Journal year: 2024, Number: unknown

Published: Oct. 11, 2024

Abstract. Purpose: Large language models (LLMs) are pivotal in artificial intelligence, demonstrating advanced capabilities in natural language understanding and multimodal interaction, with significant potential for medical applications. This study explores the feasibility and efficacy of LLMs, specifically ChatGPT-4o and Claude 3-Opus, in classifying thyroid nodules using ultrasound images. Methods: The study included 112 patients with a total of 116 thyroid nodules, comprising 75 benign and 41 malignant cases. Ultrasound images of these nodules were analyzed by ChatGPT-4o and Claude 3-Opus to diagnose the benign or malignant nature of the nodules. An independent evaluation by a junior radiologist was also conducted. Diagnostic performance was assessed with Cohen’s Kappa and receiver operating characteristic (ROC) curve analysis, referencing pathological diagnoses. Results: ChatGPT-4o demonstrated poor agreement with pathological results (Kappa = 0.116), while Claude 3-Opus showed even lower agreement (Kappa = 0.034). The junior radiologist exhibited moderate agreement (Kappa = 0.450). ChatGPT-4o achieved an area under the ROC curve (AUC) of 57.0% (95% CI: 48.6–65.5%), slightly outperforming Claude 3-Opus (AUC = 52.0%, 95% CI: 43.2–60.9%). In contrast, the junior radiologist achieved a significantly higher AUC of 72.4% (95% CI: 63.7–81.1%). Unnecessary biopsy rates were 41.4% for ChatGPT-4o, 43.1% for Claude 3-Opus, and 12.1% for the junior radiologist. Conclusion: While LLMs such as ChatGPT-4o and Claude 3-Opus show promise for future applications in medical imaging, their current use in clinical diagnostics should be approached cautiously due to limited diagnostic accuracy.
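
As an illustration of the agreement and discrimination metrics reported above, the following minimal Python sketch computes Cohen's Kappa and ROC AUC with scikit-learn; the labels and scores are made-up placeholders, not the study's data.

```python
# Minimal sketch: computing Cohen's Kappa and ROC AUC for a binary
# benign/malignant classification task like the one above.
# The labels below are illustrative placeholders, not the study data.
from sklearn.metrics import cohen_kappa_score, roc_auc_score

# 1 = malignant, 0 = benign (hypothetical pathology ground truth)
pathology = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]
# Hypothetical model predictions and malignancy scores
llm_prediction = [0, 0, 1, 1, 0, 0, 1, 0, 1, 0]
llm_score = [0.4, 0.2, 0.7, 0.8, 0.3, 0.45, 0.6, 0.1, 0.9, 0.35]

kappa = cohen_kappa_score(pathology, llm_prediction)  # chance-corrected agreement
auc = roc_auc_score(pathology, llm_score)             # discrimination ability

print(f"Cohen's Kappa: {kappa:.3f}, ROC AUC: {auc:.3f}")
```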

Language: English

Contextual Hypergraph Networks for Enhanced Extractive Summarization: Introducing Multi-Element Contextual Hypergraph Extractive Summarizer (MCHES)
Aytuğ Onan, Hesham Alhumyani

Applied Sciences, Journal year: 2024, Number: 14(11), pp. 4671–4671

Published: May 29, 2024

Extractive summarization, a pivotal task in natural language processing, aims to distill essential content from lengthy documents efficiently. Traditional methods often struggle with capturing the nuanced interdependencies between different document elements, which is crucial for producing coherent and contextually rich summaries. This paper introduces the Multi-Element Contextual Hypergraph Extractive Summarizer (MCHES), a novel framework designed to address these challenges through an advanced hypergraph-based approach. MCHES constructs a contextual hypergraph where sentences form nodes interconnected by multiple types of hyperedges, including semantic, narrative, and discourse hyperedges. This structure captures complex relationships and maintains narrative flow, enhancing semantic coherence across the summary. The framework incorporates a Contextual Homogenization Module (CHM), which harmonizes features from diverse hyperedges, and a Hypergraph Contextual Attention (HCA) module, which employs a dual-level attention mechanism to focus on the most salient information. An innovative Read-out Strategy selects the optimal set of sentences to compose the final summary, ensuring that the latter reflects the core themes and logical structure of the original text. Our extensive evaluations demonstrate significant improvements over existing methods. Specifically, MCHES achieves an average ROUGE-1 score of 44.756, ROUGE-2 of 24.963, and ROUGE-L of 42.477 on the CNN/DailyMail dataset, surpassing the best-performing baseline by 3.662%, 3.395%, and 2.166%, respectively. Furthermore, MCHES achieves BERTScore values of 59.995 on CNN/DailyMail, 88.424 on XSum, and 89.285 on PubMed, indicating superior alignment with human-generated summaries. Additionally, MCHES achieves MoverScore values of 87.432, 60.549, and 59.739, highlighting its effectiveness in maintaining semantic movement and ordering. These results confirm that MCHES sets a new standard for extractive summarization by leveraging contextual hypergraphs for better coherence and thematic fidelity.
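
The following minimal Python sketch illustrates the core hypergraph idea, sentences as nodes grouped into semantic hyperedges, using TF-IDF similarity as a stand-in; it is an illustrative simplification under an assumed threshold, not the authors' MCHES implementation, which also builds narrative and discourse hyperedges with learned attention.

```python
# Minimal sketch of the hypergraph idea behind MCHES: sentences become
# nodes, and "semantic hyperedges" group sentences whose TF-IDF vectors
# are mutually similar. Illustrative simplification only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    "Extractive summarization selects salient sentences from a document.",
    "Salient sentences carry the core content of the document.",
    "Hypergraphs connect groups of related sentences.",
    "Groups of related sentences form hyperedges.",
]

tfidf = TfidfVectorizer().fit_transform(sentences)
sim = cosine_similarity(tfidf)

# A hyperedge is a set of sentence indices that are similar above a
# threshold; each node can belong to many hyperedges.
threshold = 0.2
hyperedges = []
for i in range(len(sentences)):
    edge = {j for j in range(len(sentences)) if sim[i, j] >= threshold}
    if len(edge) > 1 and edge not in hyperedges:
        hyperedges.append(edge)

print(hyperedges)  # e.g. groups of semantically related sentences
```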

Language: English

Cited

7

GeoLLM: A specialized large language model framework for intelligent geotechnical design
Haitao Xu, Ning Zhang, Zhenyu Yin, et al.

Computers and Geotechnics, Journal year: 2024, Number: 177, pp. 106849–106849

Published: Oct. 24, 2024

Language: English

Cited

7

Precision-Driven Product Recommendation Software: Unsupervised Models, Evaluated by GPT-4 LLM for Enhanced Recommender Systems
Konstantinos I. Roumeliotis, Nikolaos D. Tselikas, Dimitrios Κ. Nasiopoulos, et al.

Software, Journal year: 2024, Number: 3(1), pp. 62–80

Published: Feb. 29, 2024

This paper presents a pioneering methodology for refining product recommender systems, introducing a synergistic integration of unsupervised models, namely K-means clustering, content-based filtering (CBF), and hierarchical clustering, with the cutting-edge GPT-4 large language model (LLM). Its innovation lies in utilizing GPT-4 for evaluation, harnessing its advanced natural language understanding capabilities to enhance the precision and relevance of recommendations. A Flask-based API simplifies implementation for e-commerce owners, allowing seamless training and evaluation of the models using CSV-formatted data. The unique aspect of this approach is its ability to empower businesses with sophisticated recommender system algorithms, while GPT-4 significantly contributes semantic context to item features, resulting in a more personalized and effective recommendation system. The experimental results underscore the superiority of the integrated framework, marking a significant advancement in the field of recommender systems and providing businesses with an efficient and scalable solution to optimize their recommendations.
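
The sketch below illustrates the unsupervised components mentioned above, K-means clustering over TF-IDF product descriptions and a content-based similarity recommendation; the product catalogue is invented, and the paper's GPT-4 evaluation step and Flask API are not reproduced.

```python
# Minimal sketch of the unsupervised pieces described above: K-means
# clustering plus a content-based nearest-neighbour recommendation.
# Product data is made up for illustration.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

products = [
    "wireless noise cancelling headphones",
    "bluetooth over-ear headphones with microphone",
    "stainless steel kitchen knife set",
    "chef knife with ergonomic handle",
]

X = TfidfVectorizer().fit_transform(products)

# Group products into broad clusters (hypothetical k=2)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Content-based filtering: recommend the most similar other product
sim = cosine_similarity(X)
query = 0  # user viewed product 0
ranked = sim[query].argsort()[::-1]
recommendation = next(i for i in ranked if i != query)

print("clusters:", clusters.tolist())
print("recommended for product 0:", products[recommendation])
```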

Language: English

Cited

6

Exploring the Use of Large Language Model-Driven Chatbots in Virtual Reality to Train Autistic Individuals in Job Communication Skills
Ziming Li, Pinaki Prasanna Babar, Mike Barry, et al.

Published: May 11, 2024

Autistic individuals commonly encounter challenges in communicating with others, which can lead to difficulties in obtaining and maintaining jobs. Thus, job training programs have emphasized the communication skills of autistic individuals to improve their employability. Hence, we developed a virtual reality application that features avatars as chatbots powered by Large Language Models (LLMs), such as GPT-3.5 Turbo, and employs speech-based interactions with users. The use of LLM-driven chatbots allows job coaches to create training scenarios for trainees using text prompts. We conducted a preliminary study with three trainees and two coaches to gather early-stage feedback on the application's usability and user experience. In the study, trainee participants were asked to interact with the chatbots in scenarios involving customer interactions. Our findings indicate that our application shows promise for training job communication skills. Furthermore, we discuss its usability and user experience aspects from the trainees' and coaches' perspectives.
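
A minimal sketch of how a coach-authored scenario might drive a chatbot avatar, assuming the OpenAI chat API with GPT-3.5 Turbo as named above; the scenario text and helper function are hypothetical, and the actual application layers VR avatars and speech input/output on top.

```python
# Minimal sketch: a coach's scenario prompt steers an LLM chatbot avatar.
# The scenario and role names are made up; requires OPENAI_API_KEY.
from openai import OpenAI

client = OpenAI()

coach_scenario = (
    "You are a customer at a coffee shop. The trainee is the barista. "
    "Ask for a drink, then politely point out that your order was wrong."
)

def avatar_reply(history, trainee_utterance):
    """Return the avatar's next line given the conversation so far."""
    messages = [{"role": "system", "content": coach_scenario}]
    messages += history
    messages.append({"role": "user", "content": trainee_utterance})
    response = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
    return response.choices[0].message.content

print(avatar_reply([], "Hi, welcome in! What can I get you today?"))
```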

Language: English

Cited

6

Evaluating LLMs on document-based QA: Exact answer selection and numerical extraction using CogTale dataset
Zafaryab Rasool, Stefanus Kurniawan, Sherwin Balugo, et al.

Natural Language Processing Journal, Journal year: 2024, Number: 8, pp. 100083–100083

Published: June 9, 2024

Document-based Question-Answering (QA) tasks are crucial for precise information retrieval. While some existing work focuses on evaluating large language models' (LLMs) performance in retrieving and answering questions from documents, assessing LLMs on QA types that require exact answer selection from predefined options and numerical extraction is yet to be fully assessed. In this paper, we specifically focus on this underexplored context and conduct an empirical analysis of LLMs (GPT-4 and GPT-3.5) on question types including single-choice, yes–no, multiple-choice, and number extraction questions from documents. We use the CogTale dataset for evaluation, which provides human expert-tagged responses, offering a robust benchmark for precision and factual grounding. We found that LLMs, particularly GPT-4, can precisely answer many single-choice and yes–no questions given relevant context, demonstrating their efficacy in information retrieval tasks. However, performance diminishes on multiple-choice and number extraction formats, lowering the overall performance of the models on this task and indicating that these models may not yet be sufficiently reliable for the task. This limits their applications in domains demanding precise inference, such as meta-analysis. Our work offers a framework for the ongoing evaluation of LLM performance on document QA, ensuring that such applications continue to meet evolving standards.
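
The sketch below shows one way exact-answer evaluation could work for the question types described above (yes/no, single-choice, number extraction); the normalization rules and examples are illustrative assumptions, not the CogTale scoring procedure.

```python
# Minimal sketch of exact-answer evaluation for the question types
# described above. Illustrative only; not the dataset's official scorer.
import re

def normalize(answer: str) -> str:
    """Lowercase, trim, and strip punctuation so 'Yes.' matches 'yes'."""
    return re.sub(r"[^\w\s]", "", answer).strip().lower()

def exact_match(prediction: str, gold: str) -> bool:
    return normalize(prediction) == normalize(gold)

def extract_number(text: str):
    """Pull the first numeric value out of a model response."""
    match = re.search(r"-?\d+(?:\.\d+)?", text)
    return float(match.group()) if match else None

# Hypothetical model outputs vs. expert-tagged gold answers
print(exact_match("Yes.", "yes"))           # True  (yes/no question)
print(exact_match("Option B", "option c"))  # False (single-choice question)
print(extract_number("The sample size was 42 participants.") == 42.0)  # True (number)
```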

Language: English

Cited

6

Large language model to multimodal large language model: A journey to shape the biological macromolecules to biological sciences and medicine
Manojit Bhattacharya, Soumen Pal, Srijan Chatterjee, et al.

Molecular Therapy — Nucleic Acids, Journal year: 2024, Number: 35(3), pp. 102255–102255

Published: June 15, 2024

After ChatGPT was released, large language models (LLMs) became more popular. Academicians use ChatGPT or other LLMs for different purposes, and the use of LLMs is increasing from medical science to diversified areas. Recently, the multimodal LLM (MLLM) has also become popular. Therefore, we comprehensively illustrate the LLM and MLLM for a complete understanding. We also aim to provide simple and extended reviews of LLMs and MLLMs for a broad category of readers, such as researchers, students in different fields, and other academicians. The review article illustrates the models, their working principles, and their applications in different fields. First, we demonstrate the technical concept of LLMs, their working principle, the Black Box, and the evolution of LLMs. To explain the working principle, we discuss the tokenization process, token representation, and token relationships. We also extensively discuss the application of LLMs in biological macromolecules, medical science, and biological science, as well as MLLMs. Finally, we illustrate the limitations, challenges, and future prospects. The article acts as a booster dose for clinicians, a primer for molecular biologists, and a catalyst for other scientists, and benefits a broad readership.
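
As a small illustration of the tokenization step the review discusses, the sketch below converts text into token IDs using the tiktoken library with the cl100k_base encoding, an assumed choice made for demonstration purposes.

```python
# Minimal sketch of tokenization: text is split into integer token IDs
# before an LLM processes it. Encoding choice is illustrative.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Large language models process biological sequences as tokens."
token_ids = enc.encode(text)

print(token_ids[:8])                          # integer IDs the model actually sees
print([enc.decode([t]) for t in token_ids])   # each ID mapped back to its text piece
```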

Language: English

Cited

6

Relation extraction using large language models: a case study on acupuncture point locations
Yiming Li, Xueqing Peng, Jianfu Li, et al.

Journal of the American Medical Informatics Association, Journal year: 2024, Number: 31(11), pp. 2622–2631

Published: Aug. 29, 2024

Abstract. Objective: In acupuncture therapy, the accurate location of acupoints is essential for its effectiveness. The advanced language understanding capabilities of large language models (LLMs) like Generative Pre-trained Transformers (GPTs) and Llama present a significant opportunity for extracting relations related to acupoint locations from textual knowledge sources. This study aims to explore the performance of LLMs in extracting acupoint-related relations and to assess the impact of fine-tuning on GPT's performance. Materials and Methods: We utilized the World Health Organization Standard Acupuncture Point Locations in the Western Pacific Region (WHO Standard) as our corpus, which consists of descriptions of 361 acupoints. Five types of relations ("direction_of", "distance_of", "part_of", "near_acupoint", "located_near") (n = 3174) between acupoints were annotated. Four models were compared: pre-trained GPT-3.5, fine-tuned GPT-3.5, and pre-trained GPT-4, as well as pretrained Llama 3. Performance metrics included micro-average exact match precision, recall, and F1 scores. Results: Our results demonstrate that fine-tuned GPT-3.5 consistently outperformed the other models in F1 scores across all relation types. Overall, it achieved the highest micro-average F1 score of 0.92. Discussion: The superior performance of the fine-tuned model, as shown by its F1 scores, underscores the importance of domain-specific fine-tuning in enhancing relation extraction for acupuncture-related tasks. In light of the findings of this study, it offers valuable insights into leveraging LLMs for developing clinical decision support and creating educational modules in acupuncture. Conclusion: This study demonstrates the effectiveness of GPT models in extracting relations related to acupoint locations, with implications for accurately modeling acupuncture knowledge and promoting standard implementation in acupuncture training and practice. The findings also contribute to advancing informatics applications in traditional and complementary medicine, showcasing the potential of LLMs in natural language processing.
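
The sketch below illustrates micro-averaged exact-match precision, recall, and F1 over relation triples, the metrics reported above; the triples are invented placeholders rather than the WHO Standard annotations.

```python
# Minimal sketch of micro-averaged exact-match precision, recall, and F1
# over relation triples. The triples below are illustrative placeholders.
def micro_prf(gold_by_doc, pred_by_doc):
    tp = fp = fn = 0
    for gold, pred in zip(gold_by_doc, pred_by_doc):
        tp += len(gold & pred)   # exact-match triples found
        fp += len(pred - gold)   # spurious predictions
        fn += len(gold - pred)   # missed gold triples
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical (head, relation, tail) triples per acupoint description
gold = [{("LI4", "part_of", "dorsum of hand"), ("LI4", "near_acupoint", "LI5")}]
pred = [{("LI4", "part_of", "dorsum of hand"), ("LI4", "direction_of", "radial")}]

print(micro_prf(gold, pred))  # (0.5, 0.5, 0.5)
```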

Language: English

Cited

6

Zero-Shot Strike: Testing the generalisation capabilities of out-of-the-box LLM models for depression detection
Julia Ohse, Bakir Hadžić, Parvez Mohammed, et al.

Computer Speech & Language, Journal year: 2024, Number: 88, pp. 101663–101663

Published: May 11, 2024

Language: English

Cited

5

The Personality Dimensions GPT-3 Expresses During Human-Chatbot Interactions
Nikola Kovačević, Christian Holz, Markus Groß, et al.

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Journal year: 2024, Number: 8(2), pp. 1–36

Published: May 13, 2024

Large language models such as GPT-3 and ChatGPT can mimic human-to-human conversation with unprecedented fidelity, which enables many applications such as conversational agents for education and non-player characters in video games. In this work, we investigate the underlying personality structure that a GPT-3-based chatbot expresses during conversations with a human. We conducted a user study to collect 147 personality descriptors from 86 participants while they interacted with the chatbot over three weeks. Then, 425 new participants rated these descriptors in an online survey. An exploratory factor analysis on the collected ratings shows that, though overlapping, human personality models do not fully transfer to the chatbot's personality as perceived by humans. The chatbot's perceived personality structure is also significantly different from that of virtual personal assistants, where users focus rather on serviceability and functionality. We discuss the implications of how ever-evolving large language models may change and affect users' perception of agent personalities.
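
For readers unfamiliar with exploratory factor analysis, the sketch below runs a small EFA with varimax rotation using scikit-learn; the rating matrix is random placeholder data, not the study's survey responses, and the choice of three factors is an assumption.

```python
# Minimal sketch of an exploratory factor analysis like the one described
# above. Placeholder data; not the study's ratings.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
# Hypothetical matrix: 425 raters x 10 personality descriptors (1-7 Likert)
ratings = rng.integers(1, 8, size=(425, 10)).astype(float)

fa = FactorAnalysis(n_components=3, rotation="varimax")  # assume 3 latent factors
fa.fit(ratings)

# Loadings show how strongly each descriptor maps onto each latent factor
loadings = fa.components_.T  # shape: (10 descriptors, 3 factors)
print(np.round(loadings, 2))
```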

Language: English

Cited

5

DeepExtract: Semantic-driven extractive text summarization framework using LLMs and hierarchical positional encoding
Aytuğ Onan, Hesham Alhumyani

Journal of King Saud University - Computer and Information Sciences, Journal year: 2024, Number: 36(8), pp. 102178–102178

Published: Aug. 30, 2024

In the age of information overload, the ability to distill essential content from extensive texts is invaluable. DeepExtract introduces an advanced framework for extractive summarization, utilizing the groundbreaking capabilities of GPT-4 along with an innovative hierarchical positional encoding to redefine content extraction. This manuscript details the development of DeepExtract, which integrates semantic-driven techniques to analyze and summarize complex documents effectively. The framework is structured around a novel hierarchical tree construction that categorizes sentences and sections not just by their physical placement within the text, but by their contextual and thematic significance, leveraging dynamic embeddings generated by GPT-4. We introduce a multi-faceted scoring system that evaluates sentences based on coherence, relevance, and novelty, ensuring that summaries are not only concise but also rich in essential content. Further, the framework employs optimized semantic clustering to group thematic elements, which enhances the representativeness of the summaries. This paper demonstrates through comprehensive evaluations that DeepExtract significantly outperforms existing extractive summarization models in terms of accuracy and efficiency, making it a potent tool for academic, professional, and general use. We conclude with a discussion of its practical applications in various domains, highlighting its adaptability and potential for navigating the vast expanses of digital text.
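
The sketch below illustrates the scoring-and-selection idea in a simplified form: sentences are ranked by relevance to the document centroid with a redundancy penalty (an MMR-style heuristic); TF-IDF vectors stand in for the GPT-4 embeddings the paper uses, and the weighting is an assumption.

```python
# Minimal sketch: score sentences by relevance to the document centroid,
# penalize redundancy against already-selected sentences, pick the top k.
# TF-IDF stands in for learned embeddings; weights are illustrative.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

sentences = [
    "Extractive summarization selects the most informative sentences.",
    "Selecting informative sentences preserves the source wording.",
    "Hierarchical positional encoding tracks where a sentence sits in the document.",
    "Clustering groups thematically similar sentences together.",
]

X = TfidfVectorizer().fit_transform(sentences).toarray()
centroid = X.mean(axis=0, keepdims=True)
relevance = cosine_similarity(X, centroid).ravel()

selected, k, lam = [], 2, 0.7
while len(selected) < k:
    best, best_score = None, -np.inf
    for i in range(len(sentences)):
        if i in selected:
            continue
        redundancy = cosine_similarity(X[i:i+1], X[selected]).max() if selected else 0.0
        score = lam * relevance[i] - (1 - lam) * redundancy
        if score > best_score:
            best, best_score = i, score
    selected.append(best)

print([sentences[i] for i in sorted(selected)])
```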

Language: English

Cited

5