XtremeLLMs: Towards Extremely Large Language Models
Ibomoiye Domor Mienye, Theo G. Swart, George Obaido, et al.

Published: Aug. 21, 2024

The continuous expansion of Large Language Models (LLMs) has significantly transformed the fields of artificial intelligence (AI) and natural language processing (NLP). This paper reviews this rapidly evolving domain and introduces the concept of Extremely Large Language Models (XtremeLLMs), a new category defined as models exceeding one trillion parameters. These models are monumental in scale and engineered to enhance performance across a diverse range of tasks. The study aims to establish a comprehensive framework that explores the significant opportunities and complex challenges presented by such extensive scaling and emphasises the implications for future advancements in the field.

Language: English

Secure Hierarchical Federated Learning for Large-Scale AI Models: Poisoning Attack Defense and Privacy Preservation in AIoT
Chengzhuo Han, Tingting Yang, Xin Sun, et al.

Electronics, Journal Year: 2025, Volume and Issue: 14(8), P. 1611 - 1611

Published: April 16, 2025

The rapid integration of large-scale AI models into distributed systems, such as the Artificial Intelligence of Things (AIoT), has introduced critical security and privacy challenges. While configurable models enhance resource efficiency, their deployment in heterogeneous edge environments remains vulnerable to poisoning attacks, data leakage, and adversarial interference, threatening the integrity of collaborative learning and responsible deployment. To address these issues, this paper proposes a Federated Hierarchical Cross-domain Retrieval (FHCR) framework tailored for secure, privacy-preserving AIoT systems. By decoupling the shared retrieval layer (globally optimized via federated learning) from device-specific layers (locally personalized), FHCR minimizes communication overhead while enabling dynamic module selection. Crucially, we integrate a retrieval-layer mean inspection (RLMI) mechanism to detect and filter malicious gradient updates, effectively mitigating poisoning attacks and reducing attack success rates by 20% compared with conventional methods. Extensive evaluation on General-QA and IoT-Native datasets demonstrates robustness against threats, with FHCR maintaining global accuracy at no lower than baseline levels while reducing communication costs by 14%.
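The abstract does not spell out the RLMI rule, but its core idea, inspecting each client's retrieval-layer update against the mean of all updates and discarding outliers before aggregation, can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the function name, the z-score cutoff, and the FedAvg-style aggregation are all assumptions.

```python
import numpy as np

def filter_updates_by_mean_inspection(updates, z_thresh=2.0):
    """Drop client updates that sit anomalously far from the mean update
    (a generic stand-in for the paper's RLMI check, not its exact rule).

    updates: list of 1-D numpy arrays, one flattened retrieval-layer
             gradient per client.
    z_thresh: hypothetical z-score cutoff; the paper does not specify one.
    """
    stacked = np.stack(updates)                  # (n_clients, n_params)
    mean_update = stacked.mean(axis=0)
    # L2 distance of each client's update from the mean update.
    dists = np.linalg.norm(stacked - mean_update, axis=1)
    z = (dists - dists.mean()) / (dists.std() + 1e-12)
    kept = [u for u, s in zip(updates, z) if s <= z_thresh]
    # Aggregate the surviving updates FedAvg-style.
    return np.mean(kept, axis=0) if kept else mean_update

# Toy round: nine benign clients plus one scaled (poisoned) update.
rng = np.random.default_rng(0)
benign = [rng.normal(0, 0.01, 128) for _ in range(9)]
poisoned = [rng.normal(0, 0.01, 128) * 50.0]
agg = filter_updates_by_mean_inspection(benign + poisoned)
print(agg.shape)  # (128,)
```

In this toy round the single scaled update lies far from the mean and is filtered out before aggregation, which is the effect the RLMI mechanism aims for.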

Language: English

Citations: 0

RSTC: Residual Swin Transformer Cascade to approximate Taylor expansion for image denoising
Jin Liu, Yang Yang, Biyun Xu, et al.

Computer Vision and Image Understanding, Journal Year: 2024, Volume and Issue: 248, P. 104132 - 104132

Published: Aug. 23, 2024

Language: English

Citations: 2

Machine Learning in Society: Prospects, Risks, and Benefits
Mirko Farina, Witold Pedrycz

Philosophy & Technology, Journal Year: 2024, Volume and Issue: 37(3)

Published: July 31, 2024

Language: English

Citations: 0

A novel iteration scheme with conjugate gradient for faster pruning on transformer models
Jun Li, Yuchen Zhu, Kexue Sun, et al.

Complex & Intelligent Systems, Journal Year: 2024, Volume and Issue: 10(6), P. 7863 - 7875

Published: Aug. 7, 2024

Pre-trained models based on the Transformer architecture have significantly advanced research within the domain of Natural Language Processing (NLP) due to their superior performance and extensive applicability across multiple technological sectors. Despite these advantages, there remains a significant challenge in optimizing these models for more efficient deployment. To be concrete, existing post-training pruning frameworks for transformers suffer from inefficiencies in the crucial stage of accuracy recovery, which impacts their overall efficiency. To address this issue, this paper introduces a novel iteration scheme with conjugate gradient for the accuracy-recovery stage. By constructing a series of iterative directions, the approach ensures that each optimization step is orthogonal to the previous ones, which effectively reduces redundant exploration of the search space. Consequently, the method progresses toward the global optimum more reliably, thereby enhancing recovery efficiency. The proposed conjugate-gradient-based faster-pruner reduces the time expenditure of the pruning process while maintaining accuracy, demonstrating a high degree of solution stability and exceptional model acceleration effects. In experiments conducted on BERTBASE and DistilBERT models, the faster-pruner exhibited outstanding performance on the GLUE benchmark dataset, achieving a reduction of up to 36.27% in pruning time and a speed increase of 1.45× on an RTX 3090 GPU.
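For reference, the orthogonality property the authors exploit is the defining feature of the classical conjugate gradient method, sketched below on a quadratic model of a recovery objective. This is a sketch under stated assumptions: the Hessian proxy H and gradient g are hypothetical stand-ins, and the paper applies the scheme inside its pruning framework rather than to a raw quadratic.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-8, max_iter=100):
    """Classical conjugate gradient for min 0.5 x^T A x - b^T x with A
    symmetric positive definite. Successive search directions are
    A-conjugate, so no direction is re-explored -- the property the
    paper exploits in its accuracy-recovery stage."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x          # residual = negative gradient at x
    p = r.copy()           # first search direction
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)      # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p  # next A-conjugate direction
        rs_old = rs_new
    return x

# Toy recovery subproblem: refit weights against a quadratic proxy of
# the loss (H and g are hypothetical; the paper's actual objective is
# accuracy recovery after pruning).
rng = np.random.default_rng(1)
M = rng.normal(size=(50, 50))
H = M @ M.T + 50 * np.eye(50)   # SPD Hessian proxy
g = rng.normal(size=50)
w = conjugate_gradient(H, g)
print(np.linalg.norm(H @ w - g))  # near zero at convergence
```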

Language: English

Citations: 0

Preserving Real-World Robustness of Neural Networks Under Sparsity Constraints
Jasmin Viktoria Gritsch, Robert Legenstein, Ozan Özdenizci, et al.

Lecture Notes in Computer Science, Journal Year: 2024, Volume and Issue: unknown, P. 337 - 354

Published: Jan. 1, 2024

Language: English

Citations: 0

Thin Cloud Removal Generative Adversarial Network Based on Sparse Transformer in Remote Sensing Images
Junggon Han, Ying Zhou, Xindan Gao, et al.

Remote Sensing, Journal Year: 2024, Volume and Issue: 16(19), P. 3658 - 3658

Published: Sept. 30, 2024

Thin clouds in Remote Sensing (RS) imagery can negatively impact subsequent applications. Current Deep Learning (DL) approaches often prioritize information recovery in cloud-covered areas but may not adequately preserve cloud-free regions, leading to color distortion, detail loss, and visual artifacts. This study proposes a Sparse Transformer-based Generative Adversarial Network (SpT-GAN) to solve these problems. First, a global enhancement feature extraction module is added to the generator's top layer to enhance the model's ability to preserve information in cloud-free areas. Then, the processed feature map is reconstructed using a sparse transformer-based encoder and decoder with an adaptive threshold filtering mechanism to ensure sparsity, which enables the model to preserve robust long-range modeling capabilities while disregarding irrelevant details. In addition, inverted residual Fourier transformation blocks are added at each level of the structure to filter redundant information and improve the quality of the generated images. Finally, a composite loss function is created to minimize error in the generated images, resulting in improved resolution and color fidelity. SpT-GAN achieves outstanding results in removing clouds both quantitatively and visually, with Structural Similarity Index (SSIM) values of 98.06% and 92.19% and Peak Signal-to-Noise Ratio (PSNR) values of 36.19 dB and 30.53 dB on the RICE1 and T-Cloud datasets, respectively. On the T-Cloud dataset in particular, with its more complex cloud components, the model's superior ability to restore details is evident.
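The abstract does not give the exact adaptive threshold rule. One simple reading, keeping only the attention weights above each query's mean weight and renormalizing, is sketched below; the per-row mean threshold and single-head setting are assumptions for illustration, not the paper's stated design.

```python
import numpy as np

def adaptive_threshold_attention(Q, K, V):
    """Single-head attention where each query keeps only the keys whose
    softmax weight exceeds that row's mean weight -- one plausible form
    of 'adaptive threshold filtering'; SpT-GAN's exact rule may differ.
    Q, K, V: (n, d) arrays of token features."""
    d = Q.shape[-1]
    scores = (Q @ K.T) / np.sqrt(d)
    # Standard numerically stable softmax over keys.
    scores -= scores.max(axis=-1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=-1, keepdims=True)
    # Adaptive threshold: the per-row mean weight (1/n for a softmax
    # row); weights below it are zeroed, sparsifying the attention map.
    thresh = attn.mean(axis=-1, keepdims=True)
    sparse = np.where(attn >= thresh, attn, 0.0)
    # Renormalize the surviving weights so each row sums to one.
    sparse /= sparse.sum(axis=-1, keepdims=True)
    return sparse @ V

rng = np.random.default_rng(2)
x = rng.normal(size=(16, 32))   # 16 tokens, 32-dim features
out = adaptive_threshold_attention(x, x, x)
print(out.shape)  # (16, 32)
```

Because each row's maximum weight is never below its mean, at least one key always survives the filter, so the renormalization is well defined while weak, irrelevant attention links are discarded.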

Language: English

Citations: 0

A New Attention-based Method For Estimating Li-ion Battery State-of-Charge
Ahmed Abdulmaksoud, Mohanad Ismail, John Guirguis, et al.

Published: June 19, 2024

Language: English

Citations: 0
