Leveraging Multimodal Large Language Models (MLLMs) for Enhanced Object Detection and Scene Understanding in Thermal Images for Autonomous Driving Systems DOI Creative Commons
Huthaifa I. Ashqar, Taqwa I. Alhadidi, Mohammed Elhenawy

et al.

Automation, Journal Year: 2024, Volume and Issue: 5(4), P. 508 - 526

Published: Oct. 10, 2024

The integration of thermal imaging data with multimodal large language models (MLLMs) offers promising advancements for enhancing the safety and functionality of autonomous driving systems (ADS) and intelligent transportation systems (ITS). This study investigates the potential of MLLMs, specifically GPT-4 Vision Preview and Gemini 1.0 Pro Vision, for interpreting thermal images in ADS and ITS applications. Two primary research questions are addressed: the capacity of these models to detect and enumerate objects within thermal images, and their ability to determine whether pairs of image sources represent the same scene. Furthermore, we propose a framework for object detection and classification that integrates infrared (IR) and RGB images of the same scene without requiring localization data, which is particularly valuable for improving accuracy in environments where both IR and RGB cameras are essential. By employing zero-shot in-context learning for object detection and a chain-of-thought technique for scene discernment, this study demonstrates that MLLMs can recognize objects such as vehicles and individuals with promising results, even in the challenging domain of thermal imaging. The results indicate a high true positive rate for larger objects and moderate success in scene discernment, with a recall of 0.91 and a precision of 0.79 for similar scenes. Combining IR and RGB images further enhances detection capabilities, achieving an average precision of 0.93 and an average recall of 0.56. This approach leverages the complementary strengths of each modality to compensate for their individual limitations. The study highlights the potential of combining advanced AI methodologies to enhance the reliability of ADS, while identifying areas for improvement in model performance.
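
As a rough illustration of the zero-shot, in-context prompting setup the abstract describes, the sketch below sends a thermal frame to GPT-4 Vision Preview (the model named in the study) through the OpenAI Python SDK and asks it to enumerate vehicles and pedestrians. The prompt wording, file name, and request parameters are assumptions for illustration, not the authors' protocol.

```python
# Minimal sketch: zero-shot prompting of a vision LLM to enumerate objects
# in a thermal image. Illustrative only -- prompt text and file name are
# assumptions, not the authors' exact setup.
import base64
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def encode_image(path: str) -> str:
    """Return the image as a base64 string for inline transmission."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")

thermal_b64 = encode_image("thermal_frame.jpg")  # hypothetical IR frame

response = client.chat.completions.create(
    model="gpt-4-vision-preview",  # GPT-4 Vision Preview, as in the study
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "This is a thermal (infrared) road-scene image. "
                     "List every vehicle and pedestrian you can identify "
                     "and give a total count for each class."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{thermal_b64}"}},
        ],
    }],
    max_tokens=300,
)
print(response.choices[0].message.content)
```

The scene-discernment experiment could be sketched the same way by attaching an IR and an RGB frame in one message and asking, step by step, whether they depict the same scene.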

Language: English

Citations

5

A Cross-Cultural Crash Pattern Analysis in the United States and Jordan Using BERT and SHAP DOI Open Access
Shadi Jaradat, Mohammed Elhenawy, Alexander Paz

et al.

Electronics, Journal Year: 2025, Volume and Issue: 14(2), P. 272 - 272

Published: Jan. 10, 2025

Understanding the cultural and environmental influences on roadway crash patterns is essential for designing effective prevention strategies. This study applies advanced AI techniques, including Bidirectional Encoder Representations from Transformers (BERT) and Shapley Additive Explanations (SHAP), to examine traffic crash patterns in the United States and Jordan. By analyzing tabular data and crash narratives, the research reveals significant regional differences: in the USA, vehicle overturns and roadside conditions, such as guardrails, are major factors in fatal crashes, whereas in Jordan, technical defects and driver behavior play a more critical role. SHAP analysis identifies “driver” and “damage” as pivotal terms across both regions, while country-specific terms such as “overturn” in the USA and “technical” in Jordan highlight the disparities. Using BERT/Bi-LSTM models, the study achieves up to 99.5% accuracy in severity prediction, demonstrating the robustness of AI-driven safety analysis. These findings underscore the value of contextualized, AI-driven insights for developing targeted, region-specific road safety policies and interventions. By bridging the gap between developed and developing country contexts, this work contributes to the global effort to reduce road injuries and fatalities.
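
The SHAP-over-BERT analysis described above can be approximated with the off-the-shelf shap and transformers libraries; the sketch below computes token-level attributions for a single crash narrative. The model checkpoint and example narrative are placeholders, not the study's fine-tuned severity model or its data.

```python
# Minimal sketch: token-level SHAP attributions for a BERT-style text
# classifier, in the spirit of the study's SHAP analysis. Model id and the
# example narrative are placeholders, not the authors' model or dataset.
import shap
from transformers import pipeline

# A generic fine-tuned checkpoint stands in for the paper's severity model.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    top_k=None,  # return scores for all classes, as SHAP expects
)

explainer = shap.Explainer(classifier)  # SHAP picks a text masker automatically

narratives = [
    "Driver lost control on wet pavement, vehicle overturned and struck guardrail."
]
shap_values = explainer(narratives)

# Visualize which tokens (e.g. "overturned", "guardrail") push the prediction.
shap.plots.text(shap_values[0])
```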

Language: English

Citations

0

Large Language Models (LLMs) as Traffic Control Systems at Urban Intersections: A New Paradigm DOI Creative Commons
Sari Masri, Huthaifa I. Ashqar, Mohammed Elhenawy

et al.

Vehicles, Journal Year: 2025, Volume and Issue: 7(1), P. 11 - 11

Published: Jan. 27, 2025

This study introduces a novel approach to traffic control systems that uses Large Language Models (LLMs) as traffic controllers. The approach draws on their logical reasoning, scene understanding, and decision-making capabilities to optimize throughput and provide feedback based on traffic conditions in real time. LLMs can centralize traditionally disconnected traffic management processes, integrate data from diverse sources to provide context-aware decisions, and deliver tailored outputs through various means, such as wireless signals and visuals, to drivers, infrastructure, and autonomous vehicles. To evaluate the ability of LLMs to act as traffic controllers, this study proposes a four-stage methodology comprising environment creation and initialization, prompt engineering, conflict identification, and fine-tuning. We simulated multi-lane, four-leg intersection scenarios and generated detailed datasets that enable conflict detection, using a Python simulation as ground truth. We used chain-of-thought prompts to lead the models to understand the context, detect conflicts, resolve them using traffic rules, and deliver context-sensitive traffic management solutions, and we evaluated the performance of GPT-4o-mini, Gemini, and Llama. Results showed that the fine-tuned GPT-4o-mini achieved 83% accuracy and an F1-score of 0.84. The GPT-4o-mini model exhibited promising performance in generating actionable insights, with high ROUGE-L scores across conflict identification (0.95), decision making (0.91), priority assignment (0.94), and waiting time optimization (0.92). These results confirm the potential benefits of LLM-based controllers in real-world applications: the models demonstrated the ability to offer precise recommendations to drivers, including yielding, slowing, or stopping, based on vehicle dynamics. The study demonstrates the transformative potential of LLMs in traffic control, enhancing efficiency and safety at intersections.
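
A minimal sketch of the chain-of-thought prompting stage is shown below, assuming a simple vehicle-state snapshot and the publicly available gpt-4o-mini endpoint; the prompt text and data schema are illustrative assumptions, not the authors' simulation format.

```python
# Minimal sketch: chain-of-thought prompting an LLM to detect conflicts at a
# four-leg intersection and recommend actions. The vehicle-state fields and
# prompt wording are hypothetical, not the study's dataset schema.
from openai import OpenAI

client = OpenAI()

# Hypothetical snapshot of approaching vehicles (id, approach leg, intent, ETA in s).
vehicles = [
    {"id": "V1", "approach": "north", "intent": "left-turn", "eta_s": 3.2},
    {"id": "V2", "approach": "south", "intent": "through",   "eta_s": 3.0},
    {"id": "V3", "approach": "east",  "intent": "through",   "eta_s": 7.5},
]

prompt = (
    "You are a traffic controller at a four-leg intersection.\n"
    f"Current vehicles: {vehicles}\n"
    "Think step by step: (1) identify every pair of conflicting movements, "
    "(2) assign priority using standard right-of-way rules, "
    "(3) output a yield/slow/stop recommendation for each vehicle."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,  # deterministic output eases comparison against ground truth
)
print(response.choices[0].message.content)
```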

Language: English

Citations

0

Retrieval Augmented Generation-Aided causal identification of aviation Accidents: A large language model Methodology DOI
Tengfei Ren, Zhipeng Zhang, Bo Jia

et al.

Expert Systems with Applications, Journal Year: 2025, Volume and Issue: unknown, P. 127306 - 127306

Published: March 1, 2025

Language: English

Citations

0

Multimodal Data Fusion for Tabular and Textual Data: Zero-Shot, Few-Shot, and Fine-Tuning of Generative Pre-Trained Transformer Models DOI Creative Commons
Shadi Jaradat, Mohammed Elhenawy, Richi Nayak

et al.

AI, Journal Year: 2025, Volume and Issue: 6(4), P. 72 - 72

Published: April 7, 2025

In traffic safety analysis, previous research has often focused on tabular data or textual crash narratives in isolation, neglecting the potential benefits of a hybrid multimodal approach. This study introduces the Multimodal Data Fusion (MDF) framework, which fuses tabular data with textual narratives by leveraging advanced Large Language Models (LLMs), such as GPT-2, GPT-3.5, and GPT-4.5, using zero-shot (ZS), few-shot (FS), and fine-tuning (FT) learning strategies. We employed GPT-4.5 to generate new labels for driver fault, driver actions, and crash factors, alongside the existing label for crash severity. Our methodology was tested on data from the Missouri State Highway Patrol, demonstrating significant improvements in model performance. A fine-tuned GPT-2 served as the baseline model against which the more advanced models were evaluated. The best-performing configuration achieved 98.9% accuracy for severity prediction and 98.1% for fault classification. For crash factor extraction, GPT-4.5 achieved the highest Jaccard score (82.9%), surpassing GPT-3.5 and the fine-tuned models. Similarly, for driver actions it attained 73.1%, while the fine-tuned baseline closely followed at 72.2%, showing that task-specific fine-tuned models can achieve performance close to the state of the art when adapted to domain-specific data. These findings highlight the superior performance of few-shot learning, particularly for classification and information extraction tasks, while also underscoring the effectiveness of fine-tuning on domain-specific datasets to bridge performance gaps. The MDF framework’s success demonstrates its potential for broader applications beyond traffic safety, particularly in domains where labeled data are scarce and predictive modeling is essential.
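
To make the fusion idea concrete, the sketch below serializes a tabular crash record together with its narrative and classifies severity with a few-shot prompt. The field names, example records, label set, and the use of gpt-4o-mini in place of the GPT models named in the abstract are all assumptions for illustration, not the Missouri dataset schema or the authors' prompts.

```python
# Minimal sketch: few-shot classification over fused tabular + narrative crash
# records, in the spirit of the MDF framework. All records and labels below are
# made up for illustration.
from openai import OpenAI

client = OpenAI()

def serialize(record: dict) -> str:
    """Flatten a tabular crash record plus its narrative into one text block."""
    tabular = ", ".join(f"{k}={v}" for k, v in record.items() if k != "narrative")
    return f"Tabular: {tabular}\nNarrative: {record['narrative']}"

# Hypothetical few-shot demonstrations: (record, severity label).
few_shot = [
    ({"speed_limit": 55, "road_surface": "wet",
      "narrative": "Vehicle hydroplaned and struck a tree."}, "Serious injury"),
    ({"speed_limit": 25, "road_surface": "dry",
      "narrative": "Low-speed rear-end collision in a parking lot."}, "No injury"),
]

query = {"speed_limit": 70, "road_surface": "icy",
         "narrative": "Truck jackknifed and overturned on the interstate."}

messages = [{"role": "system",
             "content": "Classify crash severity as one of: No injury, "
                        "Minor injury, Serious injury, Fatal. "
                        "Answer with the label only."}]
for record, label in few_shot:
    messages.append({"role": "user", "content": serialize(record)})
    messages.append({"role": "assistant", "content": label})
messages.append({"role": "user", "content": serialize(query)})

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```

The same serialization can be reused for the zero-shot and fine-tuning settings by dropping the demonstrations or by converting them into training examples.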

Language: English

Citations

0

Cross-Lingual Clustering Using Large Language Models DOI
Nicole R. Schneider, Avik Das, Kent O’Sullivan

et al.

Published: Oct. 29, 2024

Language: English

Citations

0