Published: Nov. 22, 2024
Language: English
Advanced Engineering Informatics, Year: 2025, Volume 65, p. 103265
Published: March 23, 2025
Language: English
Cited by: 3
Nano Energy, Year: 2025, Volume 138, p. 110821
Published: March 5, 2025
Language: English
Cited by: 1
Sensors, Year: 2025, Volume 25(7), p. 2112
Published: March 27, 2025
Building extraction plays a pivotal role in enabling rapid and accurate construction of urban maps, thereby supporting planning, smart city development, and management. Buildings in remote sensing imagery exhibit diverse morphological attributes and spectral signatures, yet their reliable interpretation through single-modal data remains constrained by heterogeneous terrain conditions, occlusions, and spatially variable illumination effects inherent to complex geographical landscapes. The integration of multi-modal data for building extraction offers significant advantages by leveraging complementary features from different sources. However, the heterogeneity of the data complicates effective feature extraction, while multi-scale cross-modal fusion encounters a semantic gap issue. To address these challenges, a novel network based on multi-modal data, called SDA-[...] modules (AGAFMs), was designed; in the decoding stage the AGAFMs fuse features at various scales and dynamically adjust their importance from a global perspective to better balance information. The superior performance of the proposed method is demonstrated by comprehensive evaluations on the ISPRS Potsdam dataset, with a 97.66% F1 score and 95.42% IoU; the Vaihingen dataset, with a 96.56% F1 score and 93.35% IoU; and the DFC23 Track2 dataset, with a 91.35% F1 score and 84.08% IoU.
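The F1 and IoU figures in this abstract are linked: for a binary mask, when both metrics are computed from the same true-positive, false-positive and false-negative pixel counts, F1 = 2·IoU / (1 + IoU). The sketch below checks the reported pairs against that identity. The function names are illustrative and not taken from the paper, and the check assumes the paper derives both metrics from the same counts.

```python
def iou_from_counts(tp: int, fp: int, fn: int) -> float:
    """Intersection over union for the positive (building) class."""
    return tp / (tp + fp + fn)


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 score (Dice coefficient) for the positive class."""
    return 2 * tp / (2 * tp + fp + fn)


def f1_from_iou(iou: float) -> float:
    """F1 implied by IoU when both use the same pixel counts."""
    return 2 * iou / (1 + iou)


# (F1 %, IoU %) pairs as quoted in the abstract.
reported = {
    "ISPRS Potsdam": (97.66, 95.42),
    "ISPRS Vaihingen": (96.56, 93.35),
    "DFC23 Track2": (91.35, 84.08),
}

for name, (f1_pct, iou_pct) in reported.items():
    implied = 100 * f1_from_iou(iou_pct / 100)
    print(f"{name}: reported F1 {f1_pct:.2f}, F1 implied by IoU {implied:.2f}")
```

All three reported pairs agree with the identity to two decimal places, which suggests the scores were computed on aggregate pixel counts rather than averaged per image.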
Language: English
Cited by: 0
Journal of Cleaner Production, Year: 2025, Volume unknown, p. 145547
Published: April 1, 2025
Language: English
Cited by: 0
Information Fusion, Year: 2025, Volume unknown, p. 103222
Published: April 1, 2025
Language: English
Cited by: 0
Buildings, Year: 2025, Volume 15(9), p. 1552
Published: May 4, 2025
Parks are an important component of urban ecosystems, yet traditional research often relies on single-modal data, such as text or images alone, making it difficult to comprehensively and accurately capture the complex emotional experiences of visitors and their relationships with the environment. This study proposes a park perception and understanding model based on multimodal text–image data and a bidirectional attention mechanism. By integrating text and image data, the model incorporates a bidirectional encoder representations from transformers (BERT)-based text feature extraction module, a Swin Transformer-based image feature extraction module, and a bidirectional cross-attention fusion module, enabling a more precise assessment of visitors’ emotional experiences in parks. Experimental results show that, compared with traditional methods such as residual network (ResNet), recurrent neural network (RNN), and long short-term memory (LSTM), the proposed model achieves significant advantages across multiple evaluation metrics, including mean squared error (MSE), mean absolute error (MAE), root mean squared error (RMSE), and the coefficient of determination (R²). Furthermore, using the SHapley Additive exPlanations (SHAP) method, this study identified key factors influencing visitors’ emotional experiences, such as “water”, “green”, and “sky”, providing a scientific basis for park management and optimization.
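The four evaluation metrics named in this abstract have standard definitions. The sketch below computes them in plain Python for a pair of score lists; the function name and the sample values are illustrative assumptions, not data or code from the study.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, RMSE and R^2 for two equal-length sequences."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(mse)

    # R^2 compares the residual sum of squares with the variance of y_true.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when y_true is constant")
    ss_res = sum(e * e for e in errors)
    r2 = 1 - ss_res / ss_tot

    return {"mse": mse, "mae": mae, "rmse": rmse, "r2": r2}


# Hypothetical emotion scores: ground truth vs. model output.
metrics = regression_metrics([1, 2, 3, 4], [1, 2, 3, 5])
print(metrics)
```

Lower MSE, MAE and RMSE indicate smaller errors, while R² closer to 1 indicates that the model explains more of the variance in the target.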
Language: English
Cited by: 0
European Journal of Computer Science and Information Technology, Year: 2025, Volume 13(26), pp. 76-90
Published: April 15, 2025
Sensor fusion and multi-modal perception have evolved beyond simple data combination into dynamic, context-aware systems that fundamentally transform how robots understand their environment. Modern autonomous systems now actively adapt their sensing strategies based on environmental conditions, sensor health, and task requirements. By integrating data from cameras, LiDAR, radar, and inertial measurement units, these systems achieve robust performance even when individual sensors encounter worst-case scenarios. The evolution of deep learning-based fusion architectures addresses critical challenges in temporal synchronization, drift compensation, and environmental adaptation through dynamic weighting and real-time calibration adjustment. Through edge computing and distributed processing, these innovations enable reliable operation across industrial automation, autonomous navigation, and object tracking applications. The shift from static to dynamic fusion represents a crucial advance in making autonomous systems practical for real-world deployment.
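One simple way to picture dynamic weighting is inverse-variance fusion with a health flag: each sensor's estimate is weighted by how certain it currently is, and sensors flagged unhealthy are dropped. This is a classical stand-in chosen for illustration, not the deep learning architecture the article describes; the function name, the tuple layout and the sample readings are assumptions.

```python
def fuse_estimates(readings):
    """Fuse scalar estimates of the same quantity.

    readings: iterable of (value, variance, healthy) tuples, where
    variance is the sensor's current noise estimate and healthy is a
    boolean set by some upstream health monitor (hypothetical).
    Returns (fused_value, fused_variance).
    """
    usable = [(v, var) for v, var, healthy in readings if healthy and var > 0]
    if not usable:
        raise ValueError("no healthy sensor with positive variance")

    # Weight each sensor by the inverse of its variance.
    weights = [1.0 / var for _, var in usable]
    total = sum(weights)
    fused_value = sum(w * v for w, (v, _) in zip(weights, usable)) / total
    fused_variance = 1.0 / total
    return fused_value, fused_variance


# Hypothetical range readings in metres: camera, LiDAR, and a faulty radar.
readings = [
    (10.0, 1.0, True),   # camera depth estimate, noisier
    (20.0, 4.0, True),   # LiDAR return degraded by fog, higher variance
    (50.0, 0.5, False),  # radar flagged unhealthy, ignored
]
print(fuse_estimates(readings))
```

Because the weights are recomputed from the current variances on every call, a sensor whose noise estimate rises (for example in fog or glare) automatically contributes less to the fused value.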
Language: English
Cited by: 0
Computers & Electrical Engineering, Year: 2024, Volume 121, p. 109863
Published: Nov. 23, 2024
Language: English
Cited by: 2
Applied Mathematics and Nonlinear Sciences, Year: 2024, Volume 9(1)
Published: Jan. 1, 2024
Through the integration of multimodal data fusion technology and computer AI technology, people’s needs for intelligent life can be better met. This paper introduces an alignment perception algorithm for multimodal data fusion, which is based on combining models. Taking air pollutant concentration prediction as an example, the time series is obtained through LSTM model prediction, and an attention mechanism is introduced to establish a numerical prediction model of air pollution. Different stations are also selected to acquire weather image data, and a TS-Conv-LSTM spatio-temporal prediction model for air quality images is constructed by utilizing the Conv-LSTM cell as an encoder and the TransConv-LSTM cell, which integrates anti-convolution with a long short-term memory network, as a decoder. Gaussian regression was used to combine the models, thus achieving synergistic prediction of air pollutant concentrations. The RMSE of the ATT-LSTM model on the dataset is reduced by 8.03 compared with the comparison model, and the predictive fit is above 0.75 for all R² values. The lowest MAE value of the collaborative prediction model is only 3.815, and the highest R² is up to 0.985. Introducing deep learning techniques into multimodal data fusion helps to explore massive data more deeply and to obtain more comprehensive and reliable information about it.
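The attention mechanism mentioned in this abstract can be pictured as a weighted average over the LSTM's per-time-step hidden states. The sketch below uses generic dot-product attention in plain Python; it is not the paper's ATT-LSTM, and the function names, the query vector and the toy hidden states are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention over a sequence of hidden-state vectors.

    hidden_states: list of equal-length vectors, one per time step.
    query: vector of the same length (learned in a real model).
    Returns (weights, context), where context is the weighted average.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Toy hidden states for three time steps of a pollutant series.
states = [[0.1, 0.3], [0.4, 0.2], [0.9, 0.7]]
weights, context = attention_pool(states, query=[1.0, 1.0])
print(weights, context)
```

Time steps whose hidden state aligns more strongly with the query receive larger weights, so the context vector passed to the prediction layer emphasizes the most informative parts of the history.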
Language: English
Cited by: 0
Published: Nov. 22, 2024
Language: English
Cited by: 0