
Big Data and Cognitive Computing, Journal year: 2025, Issue: 9(6), P. 149 - 149
Published: June 3, 2025
Our study investigates how the sequencing of text and image inputs within multi-modal prompts affects the reasoning performance of Large Language Models (LLMs). Through empirical evaluations of models from three major commercial LLM vendors (OpenAI, Google, and Anthropic), alongside a user study on interaction strategies, we develop and validate practical heuristics for optimising prompt design. Our findings reveal that modality ordering is a critical factor influencing performance, particularly in tasks with varying cognitive load and structural complexity. For simpler tasks involving a single image, the positioning of modalities directly impacts model accuracy, whereas in complex, multi-step scenarios, the sequence must align with the logical structure of inference, which often outweighs the specific placement of individual modalities. Furthermore, we identify systematic challenges in multi-hop reasoning for transformer-based architectures, where models demonstrate strong early-stage inference but struggle to integrate prior contextual information in later steps. Building on these insights, we propose a set of validated, user-centred heuristics for designing effective multi-modal prompts, enhancing both the accuracy and usability of AI systems. These contributions inform the design and usability of interactive intelligent systems, with implications for applications in education, medical imaging, legal document analysis, and customer support. By bridging the gap between system behaviour and user interaction, this work provides actionable guidance that users can apply effectively to optimise prompts in real-world, high-stakes decision-making contexts.
Language: English
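As a concrete illustration of the modality-ordering variable the abstract studies, the sketch below builds two versions of the same single-image prompt (image-first versus text-first) so their answers can be compared. This is a minimal sketch, not the authors' code: it assumes the OpenAI Python SDK's chat-completions `image_url` content parts, and the model name, question, and image URL are illustrative placeholders rather than the paper's materials.

```python
# Minimal sketch of a modality-ordering comparison for a single-image prompt.
# Assumes the OpenAI Python SDK; model, question, and image URL are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

QUESTION = "How many data points in the chart lie above the trend line?"
IMAGE_URL = "https://example.com/chart.png"  # placeholder image

text_part = {"type": "text", "text": QUESTION}
image_part = {"type": "image_url", "image_url": {"url": IMAGE_URL}}


def ask(content_parts):
    """Send one multi-modal user message and return the model's reply."""
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder vision-capable model
        messages=[{"role": "user", "content": content_parts}],
    )
    return response.choices[0].message.content


# Variant A: image first, then the question.
answer_image_first = ask([image_part, text_part])

# Variant B: question first, then the image.
answer_text_first = ask([text_part, image_part])

print("image-first:", answer_image_first)
print("text-first :", answer_text_first)
```

Running the two variants over a task set and scoring the answers is one way to probe the ordering effects the abstract reports; for multi-step tasks, the same idea extends to interleaving each image with the reasoning step that depends on it.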