Model Selection for HERITAGE-AI: Evaluating LLMs for Contextual Data Analysis of Maryland’s Domestic Traffic Ads (1824–1864)

Rajesh Kumar Gnanasekaran,

Lori Perine,

Mark F. Conrad

et al.

2024 IEEE International Conference on Big Data (Big Data), Journal Year: 2024, Volume and Issue: unknown, P. 2419 - 2430

Published: Dec. 15, 2024

Language: English

Citations

0

Efficient Conceptual Knowledge Removal in Large Language Models: Methods and Evaluations

Miyim Dimitriou,

Daniel Rogowski,

Michael C. Anderson

et al.

Research Square (Research Square), Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 8, 2024

Abstract The increasing use of deep neural networks has led to models that accumulate vast amounts of knowledge from their training data, often retaining outdated or biased information that needs to be selectively removed. Novel techniques are required to efficiently erase specific conceptual knowledge from these models while maintaining overall performance and avoiding computationally expensive re-training processes. This paper introduces a scalable framework for knowledge removal through targeted weight modification and sparse fine-tuning, demonstrating how conceptual representations can be isolated and erased without significant degradation of the model's broader capabilities. The methodology achieves high precision in concept suppression by leveraging probing and gradient-based optimization, ensuring minimal disruption to general task performance. Extensive experimental evaluations confirm the effectiveness of the proposed approach, highlighting its applicability in scenarios where adaptive model refinement is essential for both accuracy and ethical integrity. Contributions to the field include the development of a flexible and efficient mechanism for knowledge erasure, applicable across various architectures, that minimizes computational overhead while enhancing responsiveness to dynamic requirements.
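
The erasure recipe the abstract outlines (probe for where a concept lives, then apply a sparse, gradient-guided update) can be illustrated in a few lines of PyTorch. The sketch below is a hypothetical reading of that recipe, not the paper's code: the saliency-based mask, the forget/retain loss weighting, and every function name here are assumptions.

```python
# Minimal sketch of sparse, gradient-guided concept suppression, assuming a
# generic PyTorch classifier. "Forget" examples carry the target concept;
# "retain" examples anchor general performance. Illustrative, not the paper's
# actual implementation.
import torch
import torch.nn.functional as F

def select_sparse_mask(model, forget_x, forget_y, sparsity=0.01):
    """Probe which weights carry the concept: rank parameters by the
    magnitude of the forget-set loss gradient and keep the top fraction."""
    model.zero_grad()
    F.cross_entropy(model(forget_x), forget_y).backward()
    masks = {}
    for name, p in model.named_parameters():
        if p.grad is None:
            continue
        g = p.grad.abs().flatten()
        k = max(1, int(sparsity * g.numel()))
        thresh = torch.topk(g, k).values.min()
        masks[name] = (p.grad.abs() >= thresh).float()
    return masks

def sparse_unlearn(model, forget, retain, masks, steps=50, lr=1e-3, lam=1.0):
    """Gradient-based erasure: raise the loss on the concept (forget set)
    while penalizing drift on retained data; only masked weights move."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    fx, fy = forget
    rx, ry = retain
    for _ in range(steps):
        opt.zero_grad()
        loss = -F.cross_entropy(model(fx), fy) + lam * F.cross_entropy(model(rx), ry)
        loss.backward()
        with torch.no_grad():
            for name, p in model.named_parameters():
                if p.grad is not None and name in masks:
                    p.grad.mul_(masks[name])  # zero gradients outside the mask
        opt.step()
```

In this formulation, `sparsity` controls how few weights are touched and `lam` trades erasure strength against drift on retained data, mirroring the abstract's goal of suppression without broad degradation.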

Language: English

Citations

2

Assessing Reasoning Capabilities of Commercial LLMs: A Comparative Study of Inductive and Deductive Tasks

Rowena Witali,

Quentin Latrese,

Giles Ravenscroft

et al.

Authorea (Authorea), Journal Year: 2024, Volume and Issue: unknown

Published: Aug. 6, 2024

Artificial intelligence has revolutionized various fields through its ability to process and generate human-like text, leading to significant advancements in tasks requiring language comprehension and generation. However, the evaluation of fundamental reasoning abilities within commercial LLMs, specifically inductive and deductive reasoning, remains crucial for understanding their cognitive capabilities and limitations. This research provides a comprehensive assessment of ChatGPT, Gemini, and Claude, using a meticulously designed set of tasks to evaluate their performance. The methodology involved the selection of diverse datasets, the design of complex reasoning tasks, and the implementation of a robust automated testing framework. Statistical analyses, including ANOVA and regression techniques, were employed to rigorously compare the models’ performance across different tasks. Results indicated that ChatGPT consistently outperformed the other models, particularly excelling in high precision and recall, while Gemini and Claude exhibited variability in their reasoning capabilities. The study highlights the strengths and weaknesses of each model, offering insights into their relative performance and potential areas for improvement. The implications for AI development are significant, emphasizing the need for tailored model designs and continued innovation in training techniques to enhance reasoning abilities. The study contributes to the broader field by providing a foundation for future research aimed at developing more capable and reliable intelligent systems.
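
To make the statistical machinery concrete, here is a minimal Python sketch of the kind of comparison the abstract names: a one-way ANOVA over per-item scores for the three models, followed by an OLS regression with model and task-type factors. The scores below are synthetic placeholders generated at random; nothing here reproduces the study's data or results.

```python
# Sketch of ANOVA + regression comparison of model scores, on synthetic data.
import numpy as np
import pandas as pd
from scipy.stats import f_oneway
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for model in ["ChatGPT", "Gemini", "Claude"]:
    for task in ["inductive", "deductive"]:
        for _ in range(30):  # 30 scored items per (model, task) cell
            rows.append({"model": model, "task": task,
                         "score": rng.uniform(0, 1)})
df = pd.DataFrame(rows)

# One-way ANOVA: does mean score differ across the three models?
groups = [g["score"].values for _, g in df.groupby("model")]
f_stat, p_val = f_oneway(*groups)
print(f"ANOVA: F={f_stat:.3f}, p={p_val:.4f}")

# OLS regression with model/task dummies and their interaction.
fit = smf.ols("score ~ C(model) * C(task)", data=df).fit()
print(fit.summary())
```

A significant ANOVA F-statistic only says that some model differs; the regression's interaction term is what separates inductive from deductive effects, which is presumably why the study pairs the two techniques.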

Language: English

Citations

0

Enhancements to Large Language Models: Introducing Dynamic Syntactic Insertion for Improved Model Robustness and Generalization

Elena Tremaskina,

Santiago Deluca,

Christopher M. Thompson

et al.

Authorea (Authorea), Journal Year: 2024, Volume and Issue: unknown

Published: Oct. 14, 2024

The growing complexity and scale of modern deep learning models have improved the ability to generate and understand human language, yet challenges persist in achieving robust generalization and syntactic flexibility. Dynamic Syntactic Insertion (DSI) addresses these limitations through the novel introduction of random syntactic variations during the fine-tuning phase, enhancing the model's capacity to process diverse linguistic structures. Through empirical experiments on the GPT-NeoX architecture, significant performance improvements were observed across multiple metrics, including robustness, fluency, and accuracy. The DSI-enhanced model consistently outperformed the baseline, particularly in handling syntactically complex and perturbed datasets, demonstrating its adaptability to a broader range of inputs. Furthermore, the incorporation of syntactic variability led to reductions in perplexity and increased accuracy on tasks from the GLUE benchmark, highlighting the method's effectiveness. The findings from this study suggest that syntactic augmentation techniques, such as DSI, provide a promising pathway for improving the resilience of language models in linguistically diverse environments.
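
As a rough illustration of what "random syntactic variations during the fine-tuning phase" could look like in practice, the Python sketch below perturbs training text with parenthetical insertions before tokenization. The insertion inventory, probability, and function names are illustrative assumptions; the paper's actual DSI rules are not specified here.

```python
# Hypothetical DSI-style augmentation step: inject a random parenthetical
# filler into a fraction of training examples each epoch, so the model sees
# varied syntactic surfaces for the same content. Illustrative only.
import random

FILLERS = [", as noted,", ", in effect,", ", broadly speaking,"]

def dynamic_syntactic_insertion(text: str, p: float = 0.3) -> str:
    """With probability p, insert a filler after a random interior word,
    producing a syntactically perturbed variant of the input."""
    words = text.split()
    if len(words) < 4 or random.random() > p:
        return text
    i = random.randrange(1, len(words) - 1)
    words[i] = words[i] + random.choice(FILLERS)
    return " ".join(words)

def augment_batch(batch: list[str], p: float = 0.3) -> list[str]:
    """Apply DSI independently to each example before tokenization."""
    return [dynamic_syntactic_insertion(t, p) for t in batch]

# Usage: fold into an existing fine-tuning data pipeline.
print(augment_batch(["The model processes diverse linguistic structures ."], p=1.0))
```

Because the perturbation is resampled every epoch, the same sentence yields different surface forms over training, which is the mechanism the abstract credits for the robustness gains.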

Language: English

Citations

0
