Assessing the feasibility of ChatGPT-4o and Claude 3-Opus in thyroid nodule classification based on ultrasound images
Ziman Chen, Nonhlanhla Chambara, Chaoqun Wu

et al.

Endocrine, Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 11, 2024

Abstract: Purpose: Large language models (LLMs) are pivotal in artificial intelligence, demonstrating advanced capabilities in natural language understanding and multimodal interactions, with significant potential in medical applications. This study explores the feasibility and efficacy of LLMs, specifically ChatGPT-4o and Claude 3-Opus, in classifying thyroid nodules using ultrasound images. Methods: This study included 112 patients with a total of 116 thyroid nodules, comprising 75 benign and 41 malignant cases. Ultrasound images of these nodules were analyzed using ChatGPT-4o and Claude 3-Opus to diagnose the benign or malignant nature of the nodules. An independent evaluation by a junior radiologist was also conducted. Diagnostic performance was assessed using Cohen’s Kappa and receiver operating characteristic (ROC) curve analysis, referencing pathological diagnoses. Results: ChatGPT-4o demonstrated poor agreement with pathological results (Kappa = 0.116), while Claude 3-Opus showed even lower agreement (Kappa = 0.034). The junior radiologist exhibited moderate agreement (Kappa = 0.450). ChatGPT-4o achieved an area under the ROC curve (AUC) of 57.0% (95% CI: 48.6–65.5%), slightly outperforming Claude 3-Opus (AUC of 52.0%, 95% CI: 43.2–60.9%). In contrast, the junior radiologist achieved a significantly higher AUC of 72.4% (95% CI: 63.7–81.1%). The unnecessary biopsy rates were 41.4% for ChatGPT-4o, 43.1% for Claude 3-Opus, and 12.1% for the junior radiologist. Conclusion: While LLMs such as ChatGPT-4o and Claude 3-Opus show promise for future applications in medical imaging, their current use in clinical diagnostics should be approached cautiously due to their limited accuracy.

Language: English

LLMs in e-commerce: A comparative analysis of GPT and LLaMA models in product review evaluation
Konstantinos I. Roumeliotis, Nikolaos D. Tselikas, Dimitrios K. Nasiopoulos

et al.

Natural Language Processing Journal, Journal Year: 2024, Volume and Issue: 6, P. 100056 - 100056

Published: Jan. 21, 2024

E-commerce has witnessed remarkable growth, especially following the easing of COVID-19 restrictions. Many people who were initially hesitant about online shopping have now embraced it, while existing online shoppers increasingly prefer the convenience of e-commerce. This surge in e-commerce has prompted the implementation of automated customer service processes, incorporating innovations such as chatbots and AI-driven sales. Despite this growth, customer satisfaction remains vital for e-commerce sustainability. Data scientists have made progress in utilizing machine learning to assess satisfaction levels but have struggled to understand emotions within the context of product reviews. The recent AI revolution, marked by the release of powerful Large Language Models (LLMs) to the public, has brought us closer than ever before to understanding customer sentiment. This study aims to illustrate the effectiveness of LLMs by conducting a comparative analysis of two cutting-edge LLMs, GPT-3.5 and LLaMA-2, along with additional Natural Language Processing (NLP) models, BERT and RoBERTa. We evaluate the performance of these models after fine-tuning them specifically for product review sentiment analysis. The primary objective of this research is to determine whether these specific LLMs could contribute to understanding customer satisfaction within the context of an e-commerce environment. By comparing the effectiveness of these models, we aim to uncover insights into the potential impact of LLMs on customer satisfaction analysis and enhance our understanding of their capabilities in this particular context.

Language: English

Citations

37

Leveraging generative AI for urban digital twins: a scoping review on the autonomous generation of urban data, scenarios, designs, and 3D city models for smart city advancement
Haowen Xu, Olufemi A. Omitaomu, Soheil Sabri

et al.

Urban Informatics, Journal Year: 2024, Volume and Issue: 3(1)

Published: Oct. 14, 2024

Abstract: The digital transformation of modern cities by integrating advanced information, communication, and computing technologies has marked the epoch of data-driven smart city applications for efficient and sustainable urban management. Despite their effectiveness, these applications often rely on massive amounts of high-dimensional and multi-domain data for monitoring and characterizing different urban sub-systems, presenting challenges in application areas that are limited by data quality and availability, as well as costly efforts for generating urban scenarios and design alternatives. As an emerging research area in deep learning, Generative Artificial Intelligence (GenAI) models have demonstrated their unique values in content generation. This paper aims to explore the innovative integration of GenAI techniques and urban digital twins to address challenges in the planning and management of built environments, with focuses on various urban sub-systems such as transportation, energy, water, and building and infrastructure. The survey starts with an introduction of cutting-edge generative AI models, such as Generative Adversarial Networks (GAN), Variational Autoencoders (VAEs), and the Generative Pre-trained Transformer (GPT), followed by a scoping review of existing urban science applications that leverage the intelligent and autonomous capability of these techniques to facilitate the research, operations, and management of critical urban subsystems, as well as the holistic management of the urban environment. Based on the review, we discuss potential opportunities and technical strategies to integrate GenAI models into next-generation urban digital twins for more intelligent, scalable, and automated smart city development and management.

Language: English

Citations

17

Generative artificial intelligence in construction: A Delphi approach, framework, and case study
Ridwan Taiwo, Idris Temitope Bello, Sulemana Fatoama Abdulai

et al.

Alexandria Engineering Journal, Journal Year: 2025, Volume and Issue: 116, P. 672 - 698

Published: Jan. 9, 2025

Language: English

Citations

2

A survey of multilingual large language models
Libo Qin, Qiguang Chen, Yuhang Zhou

et al.

Patterns, Journal Year: 2025, Volume and Issue: 6(1), P. 101118 - 101118

Published: Jan. 1, 2025

Multilingual large language models (MLLMs) leverage advanced large language models to process and respond to queries across multiple languages, achieving significant success in polyglot tasks. Despite these breakthroughs, a comprehensive survey summarizing existing approaches and recent developments remains absent. To this end, this paper presents a unified and thorough review of the field, highlighting recent progress and emerging trends in MLLM research. The contributions of this paper are as follows. (1) Extensive survey: to our knowledge, this is the pioneering thorough review of multilingual alignment in MLLMs. (2) Unified taxonomy: we provide a unified framework to summarize the current progress in MLLMs. (3) Emerging frontiers: key emerging frontiers are identified, alongside a discussion of associated challenges. (4) Abundant resources: we collect abundant open-source resources, including relevant papers, data corpora, and leaderboards. We hope our work can provide the community quick access and spur breakthrough research in MLLMs.

Language: English

Citations

2

You, Me, and the AI: The Role of Third‐Party Human Teammates for Trust Formation Toward AI Teammates
Türkü Erengin, Roman Briker, Simon B. de Jong

et al.

Journal of Organizational Behavior, Journal Year: 2025, Volume and Issue: unknown

Published: Jan. 1, 2025

Abstract: As artificial intelligence (AI) becomes increasingly integrated in teams, understanding the factors that drive trust formation between human and AI teammates becomes crucial. Yet, the emergent literature has overlooked the impact of third parties on human‐AI teaming. Drawing from social cognitive theory and human‐AI teams research, we suggest that how much a human teammate perceives an AI teammate as trustworthy, and engages in trust behaviors toward the AI, determines the focal employee's trust perceptions and behavior toward this AI teammate. Additionally, we propose that these effects hinge on the trustworthiness of the human teammate. We test these predictions across two studies: (1) an online experiment comprising individuals with work experience that examines perceptions of disembodied AI trustworthiness, and (2) an incentivized observational study that investigates trust behaviors toward embodied AI. Both studies reveal that a human teammate's perceived trustworthiness of, and trust in, the AI teammate strongly predict the focal employee's trust perceptions and behavioral trust in the AI teammate. Furthermore, this relationship vanishes when employees perceive their human teammates as less trustworthy. These results advance our understanding of third‐party effects in human‐AI trust formation, providing organizations with insights for managing social influences in human‐AI teams.

Language: English

Citations

2

Leveraging Large Language Models for Enhancing Safety in Maritime Operations
Tymoteusz Miller, Irmina Durlik, Ewelina Kostecka

et al.

Applied Sciences, Journal Year: 2025, Volume and Issue: 15(3), P. 1666 - 1666

Published: Feb. 6, 2025

Maritime operations play a critical role in global trade but face persistent safety challenges due to human error, environmental factors, and operational complexities. This review explores the transformative potential of Large Language Models (LLMs) in enhancing maritime safety through improved communication, decision-making, and compliance. Specific applications include multilingual communication for international crews, automated reporting, interactive training, and real-time risk assessment. While LLMs offer innovative solutions, challenges such as data privacy, integration, and ethical considerations must be addressed. This review concludes with actionable recommendations and insights for leveraging LLMs to build safer and more resilient maritime systems.

Language: English

Citations

2

ChatGPT Label: Comparing the Quality of Human-Generated and LLM-Generated Annotations in Low-Resource Language NLP Tasks
Arbi Haza Nasution, Aytuğ Onan

IEEE Access, Journal Year: 2024, Volume and Issue: 12, P. 71876 - 71900

Published: Jan. 1, 2024

This research paper presents a comprehensive comparative study assessing the quality of annotations in Turkish, Indonesian, and Minangkabau Natural Language Processing (NLP) tasks, with a specific focus on the contrast between annotations generated by human annotators and those produced by Large Language Models (LLMs). In the context of NLP, high-quality annotations play a pivotal role in training and evaluating machine-learning models. The study encompasses three core NLP tasks: topic classification, tweet sentiment analysis, and emotion classification, each reflecting a distinct aspect of text analysis. The research methodology incorporates a meticulously curated dataset sourced from a variety of text data, spanning diverse topics and emotions. Human annotators, proficient in the language, were tasked with producing high-quality annotations, adhering to comprehensive annotation guidelines. Additionally, fine-tuned Turkish LLMs were employed to generate annotations for the same tasks. The evaluation process employed precision, recall, and F1-score metrics, tailored to each specific NLP task. The findings of this study underscore the nuanced nature of annotation quality. While LLM-generated annotations demonstrated competitive quality, particularly in sentiment analysis, human-generated annotations consistently outperformed LLM-generated ones in more intricate NLP tasks. The observed differences highlight LLM limitations in understanding context and addressing ambiguity. This research contributes to the ongoing discourse on annotation sources in NLP, emphasizing the importance of judicious selection between human and LLM-generated annotations. It also underscores the necessity for continued advancements in LLM capabilities, as they continue to reshape the landscape of data annotation in NLP and machine learning.

Language: English

Citations

12

GPT-Neo-CRV: Elevating Information Accuracy in GPT-Neo with Cross-Referential Validation
Xingyu Xiong, Mingliang Zheng

Published: Jan. 8, 2024

This paper introduces GPT-Neo-CRV, a novel adaptation of the GPT-Neo 1.5B model, incorporating a Cross-Referential Validation (CRV) module to significantly enhance the accuracy and reliability of information generated by Large Language Models (LLMs). GPT-Neo-CRV addresses the critical challenge of misinformation in LLM outputs, a growing concern in fields where precision and reliability are crucial. Through rigorous testing against BIG-bench categories, GPT-Neo-CRV demonstrated marked improvements in tasks requiring factual correctness and complex reasoning, surpassing the standard GPT-Neo model. The study delves into the implications of these advancements, the potential limitations, and the ethical considerations inherent in integrating validation mechanisms into LLMs. It highlights the need for comprehensive, unbiased, and ethically curated information sources and emphasizes the importance of ongoing research in enhancing LLMs' adaptability, scalability, and ethical integrity. The development of GPT-Neo-CRV represents a significant step forward in the AI field, contributing to a more informed and truthful digital landscape and setting new standards for future LLM developments.

Language: English

Citations

10

Exploring the potential of large language models for improving digital forensic investigation efficiency
Akila Wickramasekara, Frank Breitinger, Mark Scanlon

et al.

Forensic Science International Digital Investigation, Journal Year: 2025, Volume and Issue: 52, P. 301859 - 301859

Published: Feb. 3, 2025

Language: English

Citations

1

DeB3RTa: A Transformer-Based Model for the Portuguese Financial Domain
Higo Pires, V. Leonardo Paucar, João Paulo Carvalho

et al.

Big Data and Cognitive Computing, Journal Year: 2025, Volume and Issue: 9(3), P. 51 - 51

Published: Feb. 21, 2025

The complex and specialized terminology of financial language in Portuguese-speaking markets creates significant challenges for natural language processing (NLP) applications, which must capture nuanced linguistic and contextual information to support accurate analysis and decision-making. This paper presents DeB3RTa, a transformer-based model specifically developed through a mixed-domain pretraining strategy that combines extensive corpora from finance, politics, business management, and accounting to enable a nuanced understanding of financial language. DeB3RTa was evaluated against prominent models (including BERTimbau, XLM-RoBERTa, SEC-BERT, BusinessBERT, and GPT-based variants) and consistently achieved gains across key financial NLP benchmarks. To maximize adaptability and accuracy, DeB3RTa integrates advanced fine-tuning techniques such as layer reinitialization, mixout regularization, stochastic weight averaging, and layer-wise learning rate decay, which together enhance its performance across varied and high-stakes NLP tasks. These findings underscore the efficacy of mixed-domain pretraining in building high-performance language models for specialized applications. With its robust performance in complex analytical and classification tasks, DeB3RTa offers a powerful tool for advancing NLP in the financial sector and supporting nuanced language processing needs in Portuguese-speaking contexts.

Language: English

Citations

1