Multimedia Systems, Journal Year: 2025, Number 31(3)
Published: May 2, 2025
Language: English
Scientific Reports, Journal Year: 2025, Number 15(1)
Published: Feb. 14, 2025
The classification of remote sensing images is inherently challenging due to the complexity, diversity, and sparsity of data across different image samples. Existing advanced methods often require substantial modifications to model architectures to achieve optimal performance, resulting in complex frameworks that are difficult to adapt. To overcome these limitations, we propose a lightweight ensemble method, enhanced by pure correction, called the Exceptionally Straightforward Ensemble. This approach eliminates the need for extensive structural modifications to models. A key innovation of our method is the introduction of a novel strategy, quantitative augmentation, implemented through a plug-and-play module. This strategy effectively corrects feature distributions in the data, significantly improving the performance of Convolutional Neural Networks and Vision Transformers beyond traditional augmentation techniques. Furthermore, a straightforward algorithm generates an ensemble network composed of two components, serving as the proposed classifier. We evaluate the method on three well-known datasets, with results demonstrating that our models outperform 48 state-of-the-art methods published since 2020, excelling in accuracy, inference speed, and compactness. Specifically, the overall accuracy reaches up to 96.8%, representing a 1.1% improvement on the NWPU45 dataset. Moreover, the smallest model reduces parameters by 90% and inference time by 74%. Notably, the method enhances Vision Transformers even with limited training data, thus alleviating their dependence on large-scale datasets. In summary, this data-driven method offers an efficient, accessible solution for remote sensing image classification, providing an elegant alternative for researchers in geoscience fields who may not have the time or resources for model optimization.
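The abstract describes quantitative augmentation only at a high level: a plug-and-play module that corrects feature distributions and is active during training. The sketch below is one assumed reading of how such a module could sit between backbone stages of a CNN or ViT in PyTorch; the class name, tracked statistics, and hyperparameters are illustrative and are not taken from the paper.

```python
import torch
import torch.nn as nn


class QuantitativeAugmentation(nn.Module):
    """Hypothetical plug-and-play module: re-centers intermediate feature
    statistics toward running estimates and adds small jitter during training,
    as one way to 'correct' a sparse or skewed feature distribution."""

    def __init__(self, momentum: float = 0.1, noise_std: float = 0.05):
        super().__init__()
        self.momentum = momentum
        self.noise_std = noise_std
        self.register_buffer("running_mean", torch.zeros(1))
        self.register_buffer("running_std", torch.ones(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if not self.training:  # identity at inference time
            return x
        mean, std = x.mean(), x.std().clamp_min(1e-5)
        # track long-run statistics of the feature distribution
        self.running_mean.mul_(1 - self.momentum).add_(self.momentum * mean.detach())
        self.running_std.mul_(1 - self.momentum).add_(self.momentum * std.detach())
        # re-center toward the running statistics, then add small jitter
        x_hat = (x - mean) / std * self.running_std + self.running_mean
        return x_hat + self.noise_std * torch.randn_like(x_hat)


# Usage sketch: wrap a backbone stage output, e.g.
# features = QuantitativeAugmentation()(backbone_stage_output)
```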
Language: English
Cited by: 2
The Photogrammetric Record, Journal Year: 2025, Number 40(189)
Published: Jan. 1, 2025
ABSTRACT Existing Vision Transformer (ViT)‐based object detection methods for remote sensing images (RSIs) face significant challenges due to the scarcity of RSI samples and an over‐reliance on enhancement strategies originally developed for natural images. This often leads to inconsistent data distributions between training and testing subsets, resulting in degraded model performance. In this study, we introduce an optimized data distribution learning (ODDL) strategy and develop a detection framework based on the Faster R‐CNN architecture, named ODDL‐Net. The ODDL strategy begins with an optimized augmentation (OA) technique, overcoming the limitations of conventional augmentation methods. Next, we propose an optimized mosaic algorithm (OMA), improving upon the shortcomings of traditional Mosaic techniques. Additionally, we present a feature fusion regularization (FFR) method, addressing issues inherent to classic feature pyramid networks. These innovations are integrated into three modular, plug‐and‐play components (namely, the OA, OMA, and FFR modules), ensuring that they can be seamlessly incorporated into existing detection frameworks without requiring structural modifications. To evaluate the effectiveness of the proposed ODDL‐Net, we build two variants with different ViT architectures: a Next‐ViT (NViT) small model and a Swin Transformer (SwinT) tiny model, both used as backbones. Experimental results on the NWPU10, DIOR20, MAR20, and GLH‐Bridge datasets demonstrate that ODDL‐Net achieves impressive accuracy, surpassing 23 state‐of‐the‐art methods introduced since 2023. Specifically, ODDL‐Net‐NViT attained accuracies of 78.3% on the challenging DIOR20 dataset and 61.4% on the GLH‐Bridge dataset. Notably, the latter represents a substantial improvement of approximately 23% over the Faster R‐CNN‐ResNet50 baseline. In conclusion, this study demonstrates that ViTs are well suited to high‐accuracy object detection in RSIs. Furthermore, it provides a straightforward solution for building ViT‐based detectors, offering a practical approach that requires little architectural modification.
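The optimized mosaic algorithm (OMA) itself is not detailed in this abstract. For context, the sketch below implements the conventional four-image Mosaic augmentation that OMA is said to improve upon; the function name, canvas size, and box handling are illustrative assumptions rather than the paper's published procedure.

```python
import random
import numpy as np


def mosaic_4(images, labels, out_size=1024):
    """Baseline four-image mosaic: paste four images around a random center
    and shift/clip their boxes. `images` are HxWx3 uint8 arrays; `labels`
    are lists of [x1, y1, x2, y2, cls] boxes per image."""
    assert len(images) == 4
    canvas = np.zeros((out_size, out_size, 3), dtype=np.uint8)
    cx = random.randint(out_size // 4, 3 * out_size // 4)  # random mosaic center
    cy = random.randint(out_size // 4, 3 * out_size // 4)
    corners = [(0, 0, cx, cy), (cx, 0, out_size, cy),
               (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    merged_labels = []
    for img, boxes, (x1, y1, x2, y2) in zip(images, labels, corners):
        h, w = y2 - y1, x2 - x1
        crop = img[:h, :w]  # naive crop; an optimized variant would sample smarter
        canvas[y1:y1 + crop.shape[0], x1:x1 + crop.shape[1]] = crop
        for bx1, by1, bx2, by2, c in boxes:
            # shift boxes into canvas coordinates and clip to the pasted region
            nx1, ny1 = min(bx1 + x1, x2), min(by1 + y1, y2)
            nx2, ny2 = min(bx2 + x1, x2), min(by2 + y1, y2)
            if nx2 - nx1 > 2 and ny2 - ny1 > 2:
                merged_labels.append([nx1, ny1, nx2, ny2, c])
    return canvas, merged_labels
```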
Language: English
Cited by: 1
International Journal of Intelligent Computing and Cybernetics, Journal Year: 2024, Number 18(1), pp. 133 - 152
Published: Nov. 13, 2024
Purpose: Vision transformer (ViT) detectors excel in processing natural images. However, when processing remote sensing images (RSIs), ViT methods generally exhibit inferior accuracy compared to approaches based on convolutional neural networks (CNNs). Recently, researchers have proposed various structural optimization strategies to enhance the performance of ViT detectors, but the progress has been insignificant. We contend that the frequent scarcity of RSI samples is the primary cause of this problem, and that model modifications alone cannot solve it.
Design/methodology/approach: To address this, we introduce a Faster R-CNN-based approach, termed QAGA-Net, which significantly enhances recognition in RSIs. Initially, we propose a novel quantitative augmentation learning (QAL) strategy to handle the sparse data distribution of RSIs. This strategy is integrated as the QAL module, a plug-and-play component active exclusively during the model's training phase. Subsequently, we enhance the feature pyramid network (FPN) by introducing two efficient modules: a global attention (GA) module for long-range dependencies and multi-scale information fusion, and an efficient pooling (EP) module to optimize the capability to understand both high- and low-frequency information. Importantly, QAGA-Net has a compact model size and achieves a balance between computational efficiency and accuracy.
Findings: We verified the approach using different ViT models as the detector's backbone. Extensive experiments on the NWPU-10 and DIOR20 datasets demonstrate accuracy superior to 23 other ViT-based or CNN-based methods in the literature. Specifically, QAGA-Net shows an increase in mAP of 2.1% and 2.6% on the challenging DIOR20 dataset over the top-ranked ViT-based and CNN-based models, respectively.
Originality/value: This paper highlights the impact of sample scarcity on detection performance. We propose a fundamentally data-driven approach: the QAL module. Additionally, we introduce two efficient modules to enhance the FPN. More importantly, our strategy has the potential to collaborate with other detection methods, since the QAL module does not require any structural modification of the host network.
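The abstract names the global attention (GA) and efficient pooling (EP) modules without specifying their internals. The PyTorch sketch below shows one plausible reading: single-head self-attention over flattened FPN features for long-range dependencies, and parallel average/max pooling fused by a 1x1 convolution to mix low- and high-frequency content. Both blocks are assumptions for illustration, not the published QAGA-Net design.

```python
import torch
import torch.nn as nn


class GlobalAttention(nn.Module):
    """Assumed GA-style block: self-attention over flattened spatial positions
    of one FPN level, with a residual connection."""

    def __init__(self, channels: int):
        super().__init__()
        self.norm = nn.LayerNorm(channels)
        self.attn = nn.MultiheadAttention(channels, num_heads=1, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, C, H, W)
        b, c, h, w = x.shape
        tokens = self.norm(x.flatten(2).transpose(1, 2))  # (B, H*W, C)
        out, _ = self.attn(tokens, tokens, tokens)
        return x + out.transpose(1, 2).reshape(b, c, h, w)


class EfficientPooling(nn.Module):
    """Assumed EP-style block: parallel average and max pooling (smooth vs.
    edge-like responses) fused back to the original channel count."""

    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        avg = nn.functional.avg_pool2d(x, 3, stride=1, padding=1)
        mx = nn.functional.max_pool2d(x, 3, stride=1, padding=1)
        return self.fuse(torch.cat([avg, mx], dim=1))
```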
Language: English
Cited by: 8
Machine Vision and Applications, Journal Year: 2025, Number 36(3)
Published: April 22, 2025
Language: English
Cited by: 0
Multimedia Systems, Journal Year: 2025, Number 31(3)
Published: May 2, 2025
Language: English
Cited by: 0