HIV Medicine, Journal Year: 2024, Volume and Issue: unknown
Published: Nov. 11, 2024
Language: English
Academic Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: Feb. 1, 2025
Language: English
Citations: 1
Journal of cardiovascular computed tomography, Journal Year: 2025, Volume and Issue: unknown
Published: April 1, 2025
The Coronary Artery Disease-Reporting and Data System (CAD-RADS) 2.0 offers standardized guidelines for interpreting coronary artery disease in cardiac CT. Accurate and consistent CAD-RADS scoring is crucial for comprehensive characterization and clinical decision-making. This study investigates the capability of large language models (LLMs) to autonomously generate CAD-RADS scores from CT reports. A dataset of reports was created to evaluate the performance of several state-of-the-art LLMs in generating scores via in-context learning. The models tested comprised GPT-3.5, GPT-4o, Mistral 7b, Mixtral 8 × 7b, Llama3 8b, Llama3 8b with a 64k context length, and Llama3 70b. The scores generated by each model were compared against the ground truth, which was provided in consensus by two board-certified cardiothoracic radiologists based on a final set of 200 reports. GPT-4o and Llama3 70b achieved the highest accuracy for the full score including all modifiers, with rates of 93% and 92.5%, respectively, followed by one of the 7b models at 78%. In contrast, older LLMs, such as Mistral 7b and GPT-3.5, performed poorly (16%) or demonstrated only intermediate results (41.5%). LLMs enhanced with in-context learning are capable of excellent accuracy, potentially enhancing both the efficiency and consistency of CAD-RADS reporting. Open-source LLMs not only deliver competitive accuracy but also offer the benefit of local hosting, mitigating concerns around data security.
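The in-context learning setup described above can be pictured with a minimal sketch. The prompt wording, the few-shot examples, and the call_llm callable are hypothetical placeholders rather than the authors' actual pipeline; any hosted or locally hosted chat model could be plugged in.

```python
# Minimal sketch of few-shot (in-context learning) CAD-RADS scoring from a CT report.
# The example reports, the regex, and call_llm() are illustrative placeholders only.
import re

FEW_SHOT_EXAMPLES = [
    ("No coronary plaque or stenosis.", "CAD-RADS 0"),
    ("Calcified plaque in the proximal LAD with 60% stenosis.", "CAD-RADS 3"),
]

def build_prompt(report_text: str) -> str:
    """Assemble a few-shot prompt asking for a CAD-RADS 2.0 category with modifiers."""
    lines = [
        "You are a cardiothoracic radiologist. Assign a CAD-RADS 2.0 score "
        "(including modifiers such as /S, /G, /HRP, /I, /E, or N) to the report."
    ]
    for example_report, score in FEW_SHOT_EXAMPLES:
        lines.append(f"Report: {example_report}\nScore: {score}")
    lines.append(f"Report: {report_text}\nScore:")
    return "\n\n".join(lines)

def extract_score(model_output: str) -> str | None:
    """Pull the first 'CAD-RADS <category>' token (plus trailing modifiers) from the reply."""
    match = re.search(r"CAD-RADS\s*[0-5N][/A-Za-z]*", model_output)
    return match.group(0) if match else None

def score_report(report_text: str, call_llm) -> str | None:
    """call_llm is any callable str -> str wrapping a hosted or local chat model."""
    return extract_score(call_llm(build_prompt(report_text)))
```

Accuracy against the radiologist consensus would then be the fraction of reports whose extracted score, including all modifiers, exactly matches the ground truth.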
Language: English
Citations: 1
Journal of Medical Screening, Journal Year: 2025, Volume and Issue: unknown
Published: April 21, 2025
Some noteworthy studies have questioned the use of ChatGPT, a free artificial intelligence program that has become very popular and widespread in recent times, in different branches of medicine. In this study, the success of ChatGPT in detecting breast cancer on mammography (MMG) was evaluated. Pre-treatment mammographic images of patients with a histopathological diagnosis of invasive carcinoma and prominent mass formation on MMG were read separately into two subprograms: Radiologist Report Writer (P1) and XrayGPT (P2). The programs were asked to determine breast density, tumor size, side, quadrant, presence of microcalcification, distortion, skin or nipple changes, axillary lymphadenopathy (LAP), and BI-RADS score. The responses were evaluated in consensus by experienced radiologists. Although the detection rate of both programs was over 60%, success in determining tumor size, localization, and LAP was low. BI-RADS category agreement with the readers was fair for P1 (κ: 28%, 0.20 < κ ≤ 0.40) and moderate for P2 (κ: 58%, 0.40 < κ ≤ 0.60). In conclusion, while one application can detect the mass appearance better than the other, success is low for all other related features. This casts doubt on the suitability of current large language models for image analysis in breast cancer screening.
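The fair and moderate agreement bands quoted above follow the conventional interpretation of Cohen's kappa. A minimal check of such an agreement figure could look like the following sketch; the two label lists are invented stand-ins for the readers' BI-RADS categories, not study data.

```python
# Illustrative computation of inter-reader agreement on BI-RADS categories.
from sklearn.metrics import cohen_kappa_score

reader_a = [2, 3, 4, 4, 5, 3, 2, 4, 5, 3]   # BI-RADS categories from reader A (invented)
reader_b = [2, 4, 4, 3, 5, 3, 3, 4, 4, 3]   # BI-RADS categories from reader B (invented)

kappa = cohen_kappa_score(reader_a, reader_b)

# Conventional bands: <=0.20 slight, 0.20-0.40 fair, 0.40-0.60 moderate,
# 0.60-0.80 substantial, >0.80 almost perfect agreement.
if kappa <= 0.40:
    band = "fair or worse"
elif kappa <= 0.60:
    band = "moderate"
else:
    band = "substantial or better"
print(f"kappa = {kappa:.2f} ({band})")
```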
Language: English
Citations: 1
European Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 2, 2025
Language: English
Citations: 0
European Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: Jan. 2, 2025
Language: English
Citations: 0
Japanese Journal of Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: March 8, 2025
Large language models (LLMs) have the potential to objectively evaluate radiology resident reports; however, research on their use for feedback in training and for assessment of skill development remains limited. This study aimed to assess the effectiveness of LLMs in revising resident reports by comparing them with revisions verified by board-certified radiologists, and to analyze the progression of residents' reporting skills over time. To identify the LLM that best aligned with the human radiologists, 100 reports were randomly selected from 7376 reports authored by nine first-year residents. The reports were evaluated based on six criteria: (1) addition of missing positive findings, (2) deletion of findings, (3) addition of missing negative findings, (4) correction of expression, (5) correction of diagnosis, and (6) proposal of additional examinations or treatments. Reports were segmented into four time-based terms, and 900 reports (450 CT and 450 MRI) were chosen from the initial and final terms of the residents' first year. The revision rates for each criterion were compared between the first and last terms using the Wilcoxon signed-rank test. Of the three LLMs evaluated, ChatGPT-4 Omni (GPT-4o), Claude-3.5 Sonnet, and Claude-3 Opus, GPT-4o demonstrated the highest level of agreement with the human radiologists. Significant improvements were noted for Criteria 1-3 when comparing the first and last terms with GPT-4o (Criteria 1, 2, and 3: P < 0.001, P = 0.023, and P = 0.004, respectively). No significant changes were observed for Criteria 4-6. Despite this, all criteria except Criterion 6 showed progressive enhancement over time. LLMs can effectively provide feedback on commonly corrected areas in resident reports, enabling residents to improve their weaknesses and monitor their progress. Additionally, LLMs may help reduce the workload of the radiologists who mentor residents.
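The term-to-term comparison of revision rates relies on a paired, non-parametric test. A minimal sketch of that comparison is shown below; the per-resident rate arrays are invented for illustration and do not come from the study.

```python
# Sketch of comparing per-resident LLM revision rates between the first and last term
# with the Wilcoxon signed-rank test. The rate arrays are invented for illustration.
from scipy.stats import wilcoxon

# Fraction of reports flagged for "addition of missing positive findings" per resident
first_term = [0.42, 0.38, 0.45, 0.40, 0.36, 0.44, 0.39, 0.41, 0.37]
last_term  = [0.30, 0.29, 0.35, 0.31, 0.28, 0.33, 0.27, 0.32, 0.30]

statistic, p_value = wilcoxon(first_term, last_term)
print(f"Wilcoxon statistic = {statistic}, p = {p_value:.4f}")
# A small p-value would indicate a significant change in the revision rate over the year.
```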
Language: English
Citations: 0
npj Digital Medicine, Journal Year: 2025, Volume and Issue: 8(1)
Published: March 22, 2025
Abstract While generative artificial intelligence (AI) has shown potential in medical diagnostics, comprehensive evaluation of its diagnostic performance and comparison with physicians has not been extensively explored. We conducted a systematic review and meta-analysis of studies validating generative AI models for diagnostic tasks published between June 2018 and 2024. Analysis of 83 studies revealed an overall diagnostic accuracy of 52.1%. No significant difference was found between generative AI and physicians overall (p = 0.10) or non-expert physicians (p = 0.93). However, generative AI performed significantly worse than expert physicians (p = 0.007). Several models demonstrated slightly higher accuracy compared to non-experts, although the differences were not significant. Generative AI demonstrates promising diagnostic capabilities, with accuracy varying by model. Although it has not yet achieved expert-level reliability, these findings suggest potential for enhancing healthcare delivery and medical education when implemented with an appropriate understanding of its limitations.
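To make the kind of contrast reported above concrete, the toy sketch below compares a pooled AI accuracy against a hypothetical expert accuracy with a two-proportion z-test. The counts are invented, and the actual meta-analysis used formal meta-analytic pooling rather than this test.

```python
# Toy two-proportion comparison of the kind summarised above (AI vs. expert physicians).
# Counts are invented; the study itself used formal meta-analytic methods, not this test.
from statsmodels.stats.proportion import proportions_ztest

ai_correct, ai_total = 521, 1000          # roughly a 52.1% pooled accuracy
expert_correct, expert_total = 610, 1000  # hypothetical expert accuracy

z_stat, p_value = proportions_ztest(
    count=[ai_correct, expert_correct],
    nobs=[ai_total, expert_total],
)
print(f"z = {z_stat:.2f}, p = {p_value:.4f}")
# A small p-value here would mirror the reported finding that generative AI
# performs significantly worse than expert physicians.
```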
Language: English
Citations: 0
JMIR Medical Informatics, Journal Year: 2025, Volume and Issue: 13, P. e64963 - e64963
Published: April 25, 2025
Abstract Background: With the rapid development of artificial intelligence (AI) technology, especially generative AI, large language models (LLMs) have shown great potential in the medical field. Through massive data training, they can understand complex medical texts, quickly analyze medical records, and directly provide health counseling and diagnostic advice, including for rare diseases. However, no study has yet compared and extensively discussed the performance of LLMs relative to that of physicians. Objective: This study systematically reviewed the accuracy of clinical diagnoses provided by LLMs and provides a reference for their further clinical application. Methods: We conducted searches in CNKI (China National Knowledge Infrastructure), the VIP Database, SinoMed, PubMed, Web of Science, Embase, and CINAHL (Cumulative Index to Nursing and Allied Health Literature) from January 1, 2017, to the present. A total of 2 reviewers independently screened the literature and extracted relevant information. The risk of bias was assessed using the Prediction Model Risk of Bias Assessment Tool (PROBAST), which evaluates both risk of bias and the applicability of included studies. Results: A total of 30 studies involving 19 LLMs and 4762 cases were included. The quality assessment indicated a high risk of bias in the majority of studies, the primary cause being that the case diagnosis was already known. For the optimal model, diagnostic accuracy ranged from 25% to 97.8%, while triage accuracy ranged from 66.5% to 98%. Conclusions: LLMs have demonstrated considerable diagnostic capabilities and significant potential for application across various clinical cases. Although their accuracy still falls short of that of medical professionals, if used cautiously they could become one of the best intelligent assistants in the field of human health care.
Language: English
Citations: 0
European Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: May 8, 2025
Abstract The integration of machine-learning technologies into radiology practice has the potential to significantly enhance diagnostic workflows and patient care. However, the successful deployment and maintenance of medical machine-learning (MedML) systems in radiology requires robust operational frameworks. Medical machine-learning operations (MedMLOps) offer a structured approach to ensuring persistent MedML reliability, safety, and clinical relevance. MedML systems are increasingly employed to analyse sensitive radiological data, which continuously changes due to advancements in data acquisition and model development. These systems can alleviate the workload of radiologists by streamlining tasks such as image interpretation and triage. MedMLOps ensures that MedML systems stay accurate and dependable by facilitating continuous performance monitoring, systematic validation, and simplified maintenance, all of which are critical for maintaining trust in machine-learning-driven diagnostics. Furthermore, MedMLOps aligns with established principles of data protection and regulatory compliance, including recent developments in the European Union, emphasising transparency, documentation, and safe retraining. This enables radiologists to implement modern tools with control and oversight at the forefront, keeping them reliable within the dynamic context of radiology practice. MedMLOps empowers radiologists to deliver consistent, high-quality care with confidence, aligned with evolving standards and patient needs, and it can assist multiple stakeholders in keeping models available, monitored, and easy to use and maintain while preserving data privacy. In this way, radiologists can better serve patients through the implementation of cutting-edge tools, and clinicians can be confident that models are only utilised when they are performing as expected.
Key Points
Question: Machine-learning applications are becoming widely adopted in clinics, but the infrastructure necessary to sustain these systems is currently not well-defined.
Findings: Adapting machine-learning operations concepts to radiology enhances clinical ecosystems by improving interoperability, automating monitoring and validation, and reducing burdens on informaticians.
Clinical relevance: Implementing MedMLOps solutions eases the faster and safer adoption of advanced models and ensures consistent performance for clinicians, benefiting patients through streamlined workflows.
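The continuous performance monitoring that MedMLOps calls for can be pictured with a small sketch: a rolling accuracy check that flags a deployed model for revalidation when its recent performance drops below a threshold. The window size, the threshold, and the alerting hook are illustrative assumptions, not anything prescribed by the article.

```python
# Minimal sketch of continuous performance monitoring for a deployed radiology model.
# Window size, threshold, and the alert hook are illustrative assumptions.
from collections import deque

class RollingAccuracyMonitor:
    def __init__(self, window: int = 200, threshold: float = 0.85):
        self.outcomes = deque(maxlen=window)  # most recent prediction outcomes
        self.threshold = threshold

    def record(self, prediction_correct: bool) -> None:
        """Record whether the model's latest output matched the reference read."""
        self.outcomes.append(prediction_correct)
        if len(self.outcomes) == self.outcomes.maxlen and self.accuracy() < self.threshold:
            self.alert()

    def accuracy(self) -> float:
        return sum(self.outcomes) / len(self.outcomes)

    def alert(self) -> None:
        # In a real MedMLOps pipeline this would open a revalidation ticket or
        # route cases to a human reviewer; here it only prints a warning.
        print(f"Rolling accuracy {self.accuracy():.2%} below threshold; revalidate the model.")
```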
Language: English
Citations: 0
Japanese Journal of Radiology, Journal Year: 2025, Volume and Issue: unknown
Published: May 14, 2025
Language: English
Citations: 0