Exploring the Potential of Large Language Models in Verbal Intelligence Assessment: A Preliminary Investigation (Preprint)
Dorit Hadar‐Shoval, Maya Lvovsky, Kfir Asraf

et al.

Published: Nov. 11, 2024

BACKGROUND Cognitive assessment is an important component of applied psychology, but limited access and high costs make these evaluations challenging. OBJECTIVE This pilot study examined the feasibility of using large language models (LLMs) to create personalized AI-based verbal comprehension tests (AI-BVCTs) for assessing intelligence, in contrast with traditional methods based on standardized norms. METHODS We used a within-subject design, comparing scores obtained from AI-BVCTs with those of the Wechsler Adult Intelligence Scale (WAIS-III) Verbal Comprehension Index (VCI). RESULTS The concordance correlation coefficient (CCC) demonstrated strong agreement between AI-BVCT and VCI scores (Claude: CCC = .752, 90% CI [.266, .933]; GPT-4: CCC = .733, 90% CI [.170, .935]). Pearson correlations further supported these findings, showing strong associations between AI-BVCT and VCI scores (Claude: r = .844, p < .001; GPT-4: r = .771, p = .025). No statistically significant differences were found between AI-BVCT and VCI scores (p > .05). These findings support the potential of LLMs to assess intelligence. CONCLUSIONS The study attests to the promise of AI-based cognitive tests in increasing the accessibility and affordability of assessment processes, enabling personalized testing. The research also raises ethical concerns regarding privacy and over-reliance on AI in clinical work. Further research with larger and more diverse samples is needed to establish the validity and reliability of this approach and to develop more accurate scoring procedures.
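For readers who want to see how the agreement statistics reported above are computed, the minimal Python sketch below calculates Lin's concordance correlation coefficient and a Pearson correlation from paired test scores. The score arrays are hypothetical placeholders (the study's participant data are not public), and the CCC is implemented from its textbook definition, not from the authors' code.

```python
import numpy as np
from scipy import stats

def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient between paired score arrays."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()           # population (ddof=0) variances
    cov = ((x - mx) * (y - my)).mean()  # population covariance
    return 2 * cov / (vx + vy + (mx - my) ** 2)

# Hypothetical paired scores for 8 participants (illustrative only).
ai_bvct  = [98, 105, 112, 95, 120, 101, 108, 115]
wais_vci = [100, 103, 115, 97, 118, 99, 110, 112]

print("CCC:", round(concordance_ccc(ai_bvct, wais_vci), 3))
r, p = stats.pearsonr(ai_bvct, wais_vci)
print(f"Pearson r = {r:.3f}, p = {p:.4f}")
```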

Language: English

The Artificial Third: A Broad View of the Effects of Introducing Generative Artificial Intelligence on Psychotherapy
Yuval Haber, Inbar Levkovich, Dorit Hadar‐Shoval

et al.

JMIR Mental Health, Journal Year: 2024, Volume and Issue: 11, P. e54781 - e54781

Published: April 18, 2024

This paper explores a significant shift in the field of mental health in general, and psychotherapy in particular, following generative artificial intelligence's new capabilities in processing and generating humanlike language. Following Freud, this lingo-technological development is conceptualized as the "fourth narcissistic blow" that science inflicts on humanity. We argue that this blow has a potentially dramatic influence on perceptions of human society, interrelationships, and the self. We should, accordingly, expect dramatic changes in the therapeutic act following the emergence of what we term the artificial third in the field of psychotherapy. The introduction of an artificial third marks a critical juncture, prompting us to ask important core questions that address two basic elements of critical thinking, namely, transparency and autonomy: (1) What is this new artificial presence in therapy relationships? (2) How does it reshape our perception of ourselves and our interpersonal dynamics? (3) What remains irreplaceable at the core of therapy? Given the ethical implications that arise from these questions, this paper proposes that the artificial third can be a valuable asset when applied with insight and ethical consideration, enhancing but not replacing the human touch in therapy.

Language: English

Citations

27

Editorial: Responsible Design, Integration, and Use of Generative AI in Mental Health (Preprint)
Oren Asman, John Torous, Amir Tal

et al.

JMIR Mental Health, Journal Year: 2025, Volume and Issue: 12, P. e70439 - e70439

Published: Jan. 6, 2025

Generative artificial intelligence (GenAI) shows potential for personalized care, psychoeducation, and even crisis prediction in mental health, yet responsible use requires ethical consideration, deliberation, and perhaps governance. This is the first published theme issue focused on GenAI in mental health. It brings together evidence and insights on GenAI's capabilities, such as emotion recognition, therapy-session summarization, and risk assessment, while highlighting the sensitive nature of mental health data and the need for rigorous validation. Contributors discuss how bias, alignment with human values, transparency, and empathy must be carefully addressed to ensure ethically grounded, artificial intelligence–assisted care. By proposing conceptual frameworks, best practices, and regulatory approaches, including an ethics of care and the preservation of socially important humanistic elements, this theme issue underscores that GenAI can complement, rather than replace, the vital human role in clinical settings. To achieve this, ongoing collaboration between researchers, clinicians, policy makers, and technologists is essential.

Language: English

Citations

4

An Ethical Perspective on the Democratization of Mental Health With Generative AI
Zohar Elyoseph, Tamar Gur, Yuval Haber

et al.

JMIR Mental Health, Journal Year: 2024, Volume and Issue: 11, P. e58011 - e58011

Published: July 24, 2024

Knowledge has become more open and accessible to a large audience with the "democratization of information" facilitated by technology. This paper provides a sociohistorical perspective for the theme issue "Responsible Design, Integration, and Use of Generative AI in Mental Health." It evaluates the ethical considerations of using generative artificial intelligence (GenAI) in the democratization of mental health knowledge and practice. It explores the historical context of democratizing information, transitioning from restricted access to widespread availability due to the internet, open-source movements, and, most recently, GenAI technologies such as large language models. The paper highlights why GenAI technologies represent a new phase in the democratization movement, offering unparalleled access to highly advanced technology as well as to information. In the realm of mental health, this requires delicate and nuanced deliberation. Including GenAI in mental health care may allow, among other things, improved accessibility of care, personalized responses, and conceptual flexibility, and could facilitate a flattening of traditional hierarchies between care providers and patients. At the same time, it also entails significant risks and challenges that must be carefully addressed. To navigate these complexities, the paper proposes a strategic questionnaire for assessing artificial intelligence-based mental health applications. This tool weighs both benefits and risks, emphasizing the need for a balanced approach to GenAI integration in mental health. The paper calls for a cautious yet positive stance, advocating the active engagement of mental health professionals in guiding GenAI development. It emphasizes the importance of ensuring that GenAI advancements are not only technologically sound but also ethically grounded and patient-centered.

Language: English

Citations

14

Evaluating Diagnostic Accuracy and Treatment Efficacy in Mental Health: A Comparative Analysis of Large Language Model Tools and Mental Health Professionals
Inbar Levkovich

European Journal of Investigation in Health Psychology and Education, Journal Year: 2025, Volume and Issue: 15(1), P. 9 - 9

Published: Jan. 18, 2025

Large language models (LLMs) offer promising possibilities in mental health, yet their ability to assess disorders and recommend treatments remains underexplored. This quantitative cross-sectional study evaluated four LLMs (Gemini (Gemini 2.0 Flash Experimental), Claude (Claude 3.5 Sonnet), ChatGPT-3.5, and ChatGPT-4) using text vignettes representing conditions such as depression, suicidal ideation, early and chronic schizophrenia, social phobia, and PTSD. Each model's diagnostic accuracy, treatment recommendations, and predicted outcomes were compared with norms established by mental health professionals. Findings indicated that for certain conditions, including depression and PTSD, models like ChatGPT-4 achieved higher diagnostic accuracy than human professionals. However, in more complex cases, LLM performance varied, with accuracy dropping to 55% for some models, while mental health professionals performed better. LLMs tended to suggest a broader range of proactive treatments, whereas professionals recommended targeted psychiatric consultations and specific medications. In terms of outcome predictions, professionals were generally optimistic regarding full recovery, especially with treatment, whereas LLMs predicted lower full recovery rates and higher partial recovery rates, particularly in untreated cases. While LLMs offered a broad treatment range, their conservative outcome predictions highlight the need for professional oversight. LLMs can provide valuable support in diagnostics and treatment planning but cannot replace professional clinical discretion.

Language: English

Citations

2

The impact of history of depression and access to weapons on suicide risk assessment: a comparison of ChatGPT-3.5 and ChatGPT-4
Shiri Shinan‐Altman, Zohar Elyoseph, Inbar Levkovich

et al.

PeerJ, Journal Year: 2024, Volume and Issue: 12, P. e17468 - e17468

Published: May 29, 2024

The aim of this study was to evaluate the effectiveness of ChatGPT-3.5 and ChatGPT-4 in incorporating critical risk factors, namely a history of depression and access to weapons, into suicide risk assessments. Both models were assessed using scenarios that featured individuals with and without a history of depression and access to weapons. The models estimated the likelihood of suicidal thoughts, suicide attempts, serious suicide attempts, and suicide-related mortality on a Likert scale. A multivariate three-way ANOVA with Bonferroni post hoc tests was conducted to examine the impact of the aforementioned independent factors (history of depression and access to weapons) on these outcome variables. Both models identified history of depression as a significant risk factor. ChatGPT-4 demonstrated a more nuanced understanding of the relationship between depression, access to weapons, and suicide risk. In contrast, ChatGPT-3.5 displayed limited insight into this complex relationship. ChatGPT-4 consistently assigned higher severity ratings to the outcome variables than did ChatGPT-3.5. The study highlights the potential of these two models, particularly ChatGPT-4, to enhance suicide risk assessment by considering complex risk factors.
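The vignette ratings themselves are not public, but the factorial design described above can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the factor names, levels, and ratings below are invented, and the sketch runs the ANOVA for one outcome variable at a time rather than as a single multivariate model.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical long-format ratings: one row per model assessment of a scenario.
# Column names and values are placeholders, not taken from the paper.
df = pd.DataFrame({
    "depression": ["yes", "yes", "no", "no"] * 4,
    "weapons":    ["yes", "no"] * 8,
    "model":      ["gpt-3.5"] * 8 + ["gpt-4"] * 8,
    "rating":     [6, 4, 4, 2, 6, 4, 3, 2, 7, 5, 5, 3, 7, 6, 5, 3],  # Likert scores
})

# Three-way factorial ANOVA: all main effects and interactions.
fit = ols("rating ~ C(depression) * C(weapons) * C(model)", data=df).fit()
print(sm.stats.anova_lm(fit, typ=2))

# Bonferroni-style follow-up: collect pairwise p-values and correct them with
# statsmodels.stats.multitest.multipletests(pvals, method="bonferroni").
```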

Language: English

Citations

6

Can large language models be sensitive to culture in suicide risk assessment?
Inbar Levkovich, Shiri Shinan‐Altman, Zohar Elyoseph

et al.

Journal of Cultural Cognitive Science, Journal Year: 2024, Volume and Issue: 8(3), P. 275 - 287

Published: Nov. 2, 2024

Suicide remains a pressing global public health issue. Previous studies have shown the promise of generative artificial intelligence (GenAI) large language models (LLMs) in assessing suicide risk in relation to mental health professionals. But the considerations and risk factors that the models use in their assessments remain a black box. This study investigates whether ChatGPT-3.5 and ChatGPT-4 integrate cultural context into suicide risk assessments (probability of suicidal ideation, potential for a suicide attempt, and likelihood of severe mortality from a suicidal act) using a vignette methodology. The vignettes examined individuals from Greece and South Korea, representing countries with low and high suicide rates, respectively. The contribution of this research is to examine suicide risk assessment from an international perspective, as large language models are expected to provide culturally tailored responses. However, there is concern regarding cultural biases and racism, making this examination crucial. In the evaluation conducted via ChatGPT-4, only the potential for a suicide attempt and the likelihood of severe mortality from a suicidal act were rated higher for the South Korean characters than for their Greek counterparts. Furthermore, male gender was identified as a significant risk factor, leading to heightened risk evaluations across all variables. ChatGPT-4 exhibits sensitivity to cultural nuances; in particular, it offers increased cultural sensitivity and reduced bias, highlighting the importance of cultural differences in risk assessment. The findings suggest that, while ChatGPT-4 demonstrates an improved ability to account for cultural and gender-related factors in suicide risk assessment, there remain areas for enhancement, particularly in ensuring comprehensive and unbiased evaluations across diverse populations. These results underscore the potential of GenAI to aid culturally sensitive mental health assessments, yet they also emphasize the need for ongoing refinement to mitigate inherent biases and enhance clinical utility.

Language: English

Citations

6

Large language models can enable inductive thematic analysis of a social media corpus in a single prompt: Human validation study (Preprint)
Michael Deiner, Vlad Honcharov, Jiawei Li

et al.

JMIR Infodemiology, Journal Year: 2024, Volume and Issue: 4, P. e59641 - e59641

Published: July 1, 2024

Manually analyzing public health-related content from social media provides valuable insights into the beliefs, attitudes, and behaviors of individuals, shedding light on trends and patterns that can inform understanding, policy decisions, targeted interventions, and communication strategies. Unfortunately, the time and effort required from well-trained human subject matter experts makes extensive manual social listening unfeasible. Generative large language models (LLMs) can potentially summarize and interpret large amounts of text, but it is unclear to what extent LLMs can glean subtle meanings in large sets of posts and reasonably report the themes present.
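As an illustration of what "a single prompt" can mean here, the sketch below sends an entire (small) corpus to a chat model and asks for inductive themes in one call. It is a hypothetical reconstruction: the model name, prompt wording, and corpus are placeholders, not the ones used in the study.

```python
from openai import OpenAI  # assumes the openai package (>=1.0) and an API key in the environment

client = OpenAI()

posts = ["example post 1 ...", "example post 2 ..."]  # placeholder corpus

prompt = (
    "You are a qualitative researcher. Perform inductive thematic analysis "
    "of the social media posts below. Return a numbered list of themes, each "
    "with a one-sentence description and two illustrative quotes.\n\n"
    + "\n---\n".join(posts)
)

# One API call covering the whole corpus: the "single prompt" approach.
response = client.chat.completions.create(
    model="gpt-4o",  # placeholder model choice, not necessarily the study's
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```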

Language: English

Citations

4

Assessing the Accuracy of Artificial Intelligence Models in Scoliosis Classification and Suggested Therapeutic Approaches
Artur Fabijan, Agnieszka Zawadzka-Fabijan, Robert Fabijan

et al.

Journal of Clinical Medicine, Journal Year: 2024, Volume and Issue: 13(14), P. 4013 - 4013

Published: July 9, 2024

Background: Open-source artificial intelligence models (OSAIMs) are increasingly being applied in various fields, including IT and medicine, offering promising solutions for diagnostic and therapeutic interventions. In response to the growing interest in AI for clinical diagnostics, we evaluated several OSAIMs—such as ChatGPT 4, Microsoft Copilot, Gemini, PopAi, You Chat, Claude, and the specialized PMC-LLaMA 13B—assessing their abilities to classify scoliosis severity and recommend treatments based on radiological descriptions from AP radiographs. Methods: Our study employed a two-stage methodology, where descriptions of single-curve scoliosis were analyzed by the AI systems, followed by an evaluation by two independent neurosurgeons. Statistical analysis involved the Shapiro–Wilk test for normality, with non-normal distributions described using medians and interquartile ranges. Inter-rater reliability was assessed with Fleiss' kappa, and performance metrics, like accuracy, sensitivity, specificity, and F1 scores, were used to evaluate the systems' classification accuracy. Results: The study indicated that although some systems accurately reflected the recommended Cobb angle ranges for disease severity and treatment, others, such as Gemini, required further calibration. Particularly, PMC-LLaMA 13B expanded the range for moderate scoliosis, potentially influencing treatment decisions and delaying interventions. Conclusions: These findings highlight the need for continuous refinement of AI systems to enhance their clinical applicability.
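A brief sketch of the two statistics named in the Methods, using hypothetical ratings (the study's data are not included here): Fleiss' kappa for agreement among multiple raters, and per-class precision, recall (sensitivity), and F1 against a reference standard.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa
from sklearn.metrics import classification_report

# Hypothetical severity labels (0=mild, 1=moderate, 2=severe) from three
# raters for ten radiograph descriptions; illustrative values only.
ratings = np.array([
    [0, 0, 0], [1, 1, 2], [2, 2, 2], [1, 1, 1], [0, 1, 0],
    [2, 2, 1], [1, 1, 1], [0, 0, 0], [2, 2, 2], [1, 2, 1],
])
table, _ = aggregate_raters(ratings)  # subjects x categories count table
print("Fleiss' kappa:", round(fleiss_kappa(table), 3))

# Per-class classification metrics against a hypothetical reference standard.
reference = [0, 1, 2, 1, 0, 2, 1, 0, 2, 1]  # assumed ground-truth labels
model_out = [0, 1, 2, 1, 1, 2, 1, 0, 2, 2]  # assumed AI system output
print(classification_report(reference, model_out, digits=3))
```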

Language: English

Citations

3

Examining Artificial Intelligence Policies in Counsellor Education
Laurie O. Campbell, Caitlin Frawley, Jessica L. Tinstman Jones

et al.

Counselling and Psychotherapy Research, Journal Year: 2025, Volume and Issue: 25(1)

Published: Jan. 2, 2025

Aims: This study investigated generative artificial intelligence (AI) policies in doctoral-level counsellor education programs accredited by the Council for Accreditation of Counseling and Related Educational Programs (CACREP). We aimed to contribute to emerging research on the use of AI within counsellor education. Methods: A content analysis was conducted along with a linguistic analysis to determine the authenticity, tone, and analytical nature of university and program policies. Results: The review of AI usage in doctoral programs indicated that only five programs had program-specific AI policies; most relied on university-level policies or guidance. Conclusion: Suggestions for practice include providing definitional clarity about different types of AI to reduce potential frustration among learners. Further, programs should consider developing program-specific AI policies, since the counseling profession requires a high level of ethical responsibility to best serve clients.

Language: English

Citations

0

The Feasibility of Large Language Models in Verbal Comprehension Assessment: A Proof-of-Concept Study (Preprint)
Dorit Hadar‐Shoval, Maya Lvovsky, Kfir Asraf

et al.

JMIR Formative Research, Journal Year: 2025, Volume and Issue: 9, P. e68347 - e68347

Published: Jan. 6, 2025

Cognitive assessment is an important component of applied psychology, but limited access and high costs make these evaluations challenging. This study aimed to examine the feasibility of using large language models (LLMs) to create personalized artificial intelligence-based verbal comprehension tests (AI-BVCTs) for assessing verbal intelligence, in contrast with traditional methods based on standardized norms. We used a within-participants design, comparing scores obtained from AI-BVCTs with those of the Wechsler Adult Intelligence Scale (WAIS-III) verbal comprehension index (VCI). In total, 8 Hebrew-speaking participants completed both the VCI and an AI-BVCT, the latter being generated by the LLM Claude. The concordance correlation coefficient (CCC) demonstrated strong agreement between AI-BVCT and VCI scores (Claude: CCC=.75, 90% CI 0.266-0.933; GPT-4: CCC=.73, 90% CI 0.170-0.935). Pearson correlations further supported these findings, showing strong associations between AI-BVCT and VCI scores (Claude: r=.84, P<.001; GPT-4: r=.77, P=.02). No statistically significant differences were found between AI-BVCT and VCI scores (P>.05). These findings support the potential of LLMs to assess verbal intelligence. The study attests to the promise of AI-based cognitive tests in increasing the accessibility and affordability of assessment processes, enabling personalized testing. The research also raises ethical concerns regarding privacy and overreliance on AI in clinical work. Further research with larger and more diverse samples is needed to establish the validity and reliability of this approach and to develop more accurate scoring procedures.

Language: English

Citations

0