Why We Need to Relearn How to Talk to Machines - A Snapshot of Generative AI in January 2024 DOI Open Access
Gabriel Kalweit

Journal of Science Humanities and Arts - JOSHA, Journal year: 2024, Issue 11(2)

Published: Jan. 1, 2024

The last few years have seen incredibly rapid progress in the field of generative artificial intelligence. Talking to machines and getting answers in natural language is part of our new, elusive normal. Driven by exponential growth in both computing power and internet-scale data, new digital assistants are trained by estimating the most likely next element in a given context. Recent work has clearly shown that this general objective can lead to the ability to develop complex and diverse capabilities from simple principles. At the same time, however, it is interesting that structures arising from compression of the training data sometimes produce unpredictable artefacts. The aim of this article is to shed light on the mechanisms behind current large models and provide guidance on how to get the best answer to a given question.

Language: English

Large Language Models lack essential metacognition for reliable medical reasoning DOI Creative Commons

Maxime Griot,

Coralie Hemptinne, Jean Vanderdonckt

et al.

Nature Communications, Journal year: 2025, Issue 16(1)

Published: Jan. 14, 2025

Language: English

Cited by

4

A comparative study of large language model-based zero-shot inference and task-specific supervised classification of breast cancer pathology reports DOI Creative Commons
Madhumita Sushil, Travis Zack, Divneet Mandair

et al.

Journal of the American Medical Informatics Association, Journal year: 2024, Issue 31(10), pp. 2315 - 2327

Published: June 20, 2024

Although supervised machine learning is popular for information extraction from clinical notes, creating large annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer capability. In this study, we explored whether recent LLMs could reduce the need for large-scale data annotations.

Language: English

Cited by

12

Diagnostic accuracy of large language models in psychiatry DOI
Omid Kohandel Gargari, Farhad Fatehi, Ida Mohammadi

et al.

Asian Journal of Psychiatry, Journal year: 2024, Issue 100, pp. 104168 - 104168

Published: July 25, 2024

Language: English

Cited by

7

A comparative study of zero-shot inference with large language models and supervised modeling in breast cancer pathology classification DOI Creative Commons
Madhumita Sushil, Travis Zack, Divneet Mandair

et al.

Research Square, Journal year: 2024, Issue: unknown

Published: Feb. 6, 2024

Although supervised machine learning is popular for information extraction from clinical notes, creating large, annotated datasets requires extensive domain expertise and is time-consuming. Meanwhile, large language models (LLMs) have demonstrated promising transfer capability. In this study, we explored whether recent LLMs can reduce the need for large-scale data annotations. We curated a manually labeled dataset of 769 breast cancer pathology reports, covering 13 categories, to compare the zero-shot classification capability of the GPT-4 and GPT-3.5 models with the performance of three supervised architectures: a random forests classifier, long short-term memory networks with attention (LSTM-Att), and the UCSF-BERT model. Across all tasks, GPT-4 performed either significantly better than or as well as the best supervised model, LSTM-Att (average macro F1 score 0.83 vs. 0.75). On tasks with high imbalance between labels, the differences were more prominent. Frequent sources of errors included inferences across multiple samples and complex task design. On tasks where large annotated datasets cannot be easily collected, LLMs can reduce the burden of labeling. However, if LLM use is prohibitive, simpler supervised models can provide comparable results. LLMs have the potential to speed up the execution of NLP studies by reducing the need for curating large annotated datasets. This may increase the utilization of NLP-based variables and outcomes in observational studies.
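For context on the headline comparison (macro F1 0.83 vs. 0.75): macro-averaged F1 computes F1 per label and then takes the unweighted mean, so rare labels count as much as frequent ones, which is why it is informative on the imbalanced tasks the study highlights. A minimal pure-Python sketch (the labels and predictions below are illustrative, not from the study):

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: compute F1 per label, then take the unweighted
    mean, so rare labels weigh as much as frequent ones."""
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for lab in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p == lab)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != lab and p == lab)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == lab and p != lab)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

# Illustrative: the frequent label dominates plain accuracy (90%),
# but missing half of the rare label drags macro F1 down.
y_true = ["benign"] * 8 + ["malignant"] * 2
y_pred = ["benign"] * 9 + ["malignant"]
print(macro_f1(y_true, y_pred))  # ≈ 0.804
```

This is why a model that handles rare categories well (as reported for GPT-4 on imbalanced tasks) pulls ahead on macro F1 even when overall accuracy looks similar.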

Language: English

Cited by

6

Diagnostic Accuracy of a Custom Large Language Model on Rare Pediatric Disease Case Reports DOI
Cameron C. Young,

Ellie Enichen,

Christian Rivera

et al.

American Journal of Medical Genetics Part A, Journal year: 2024, Issue: unknown

Published: Sep. 13, 2024

ABSTRACT Accurately diagnosing rare pediatric diseases frequently represents a clinical challenge due to their complex and unusual presentations. Here, we explore the capabilities of three large language models (LLMs), GPT‐4, Gemini Pro, and a custom‐built LLM (GPT‐4 integrated with the Human Phenotype Ontology [GPT‐4 HPO]), by evaluating their diagnostic performance on 61 rare disease case reports. The LLMs were assessed for accuracy in identifying specific diagnoses, listing the correct diagnosis among a differential list, and identifying broad disease categories. In addition, GPT‐4 HPO was tested on 100 general pediatrics case reports previously evaluated with other LLMs to further validate its performance. The results indicated that GPT‐4 was able to predict the correct diagnosis with an accuracy of 13.1%, whereas both GPT‐4 HPO and Gemini Pro had accuracies of 8.2%. Further, GPT‐4 HPO showed improved performance compared with the other two models in listing the correct diagnosis within the differential list and identifying the broad disease category. Although these findings underscore the potential of LLMs for diagnostic support, particularly when enhanced with domain‐specific ontologies, they also stress the need for improvement prior to integration into clinical practice.

Language: English

Cited by

5

"I'm Sorry, but I Can't Assist": Bias in Generative AI DOI
Julie Smith

Published: May 13, 2024

Research Questions: (1) Is there a pattern of racial bias in student advising recommendations made by generative AI? (2) What safeguards can promote equity when using AI in high-stakes decision-making? Methodology: Using lists of names associated with various ethnic/racial groups, we asked ChatGPT and Claude to recommend colleges and majors for each student. Results: One model was more likely to recommend STEM majors for some groups. One model did not show systematic differences on metrics of school quality, but the other did. There were also overall differences between the recommendations made by Claude and ChatGPT. Implications: We provide cautions for using generative AI in high-stakes advising tasks.

Language: English

Cited by

4

Beyond Text Generation: Assessing Large Language Models’ Ability to Reason Logically and Follow Strict Rules DOI Creative Commons
Zhiyong Han, Fortunato Battaglia,

Kush Mansuria

et al.

AI, Journal year: 2025, Issue 6(1), pp. 12 - 12

Published: Jan. 15, 2025

The growing interest in advanced large language models (LLMs) like ChatGPT has sparked debate about how best to use them in various human activities. However, a neglected issue in the debate concerning the applications of LLMs is whether they can reason logically and follow rules in novel contexts, which are critical for our understanding of LLMs. To address this knowledge gap, this study investigates five LLMs (ChatGPT-4o, Claude, Gemini, Meta AI, and Mistral) using word ladder puzzles to assess their logical reasoning and rule-adherence capabilities. Our two-phase methodology involves (1) giving explicit instructions on how to solve word ladder puzzles and then evaluating rule understanding, followed by (2) assessing the LLMs' ability to create and solve word ladder puzzles while adhering to the rules. Additionally, we test their ability to implicitly recognize and avoid HIPAA privacy violations as an example of a real-world scenario. Our findings reveal that the LLMs show a persistent lack of logical reasoning and systematically fail to follow puzzle rules. Furthermore, all LLMs except Claude prioritized task completion (text writing) over ethical considerations in the HIPAA test. These findings expose flaws in the logical reasoning and rule-following capabilities of LLMs, raising concerns about their reliability in tasks requiring strict rule-following. Therefore, we urge caution when integrating LLMs into critical fields and highlight the need for further research into their capabilities and limitations to ensure responsible AI development.

Language: English

Cited by

0

Evaluating base and retrieval augmented LLMs with document or online support for evidence based neurology DOI Creative Commons
Lars Masanneck, Sven G. Meuth, Marc Pawlitzki

et al.

npj Digital Medicine, Journal year: 2025, Issue 8(1)

Published: March 4, 2025

Effectively managing evidence-based information is increasingly challenging. This study tested large language models (LLMs), including document- and online-enabled retrieval-augmented generation (RAG) systems, using 13 recent neurology guidelines across 130 questions. Results showed substantial variability. RAG improved accuracy compared to base models but still produced potentially harmful answers. RAG-based systems performed worse on case-based than on knowledge-based questions. Further refinement and regulation are needed for safe clinical integration of RAG-enhanced LLMs.

Language: English

Cited by

0

JAMA Pediatrics—The Year in Review 2024 DOI
Dimitri Christakis

JAMA Pediatrics, Journal year: 2025, Issue: unknown

Published: March 17, 2025

From the standpoint of the articles we published, 2024 represents the first post-COVID year for JAMA Pediatrics. Articles related to COVID—its direct and indirect effects—now represent a small fraction of the science we disseminate.

Language: English

Cited by

0

Estrategias para la mejora de la seguridad diagnóstica y del razonamiento clínico [Strategies for improving diagnostic safety and clinical reasoning] DOI Creative Commons
Pedro J. Alcalá Minagorre, María José Salmerón Fernández,

Araceli Domingo Garau

и другие.

Anales de Pediatría, Journal year: 2025, Issue: unknown, pp. 503827 - 503827

Published: March 1, 2025

Cited by

0