Real-time semantic segmentation for autonomous driving: A review of CNNs, Transformers, and Beyond
Mohammed A. M. Elhassan, Changjun Zhou, Ali Khan et al.

Journal of King Saud University - Computer and Information Sciences, Year: 2024, Volume: 36(10), p. 102226

Published: Nov. 4, 2024

Language: English

A survey on deep learning for polyp segmentation: techniques, challenges and future trends
Jiaxin Mei, Tao Zhou, Kaiwen Huang et al.

Visual Intelligence, Year: 2025, Volume: 3(1)

Published: Jan. 3, 2025

Language: English

Cited by: 7

ViTs as backbones: Leveraging vision transformers for feature extraction
Omar Elharrouss, Yassine Himeur, Yasir Mahmood et al.

Information Fusion, Year: 2025, Volume: unknown, p. 102951

Published: Jan. 1, 2025

Language: English

Cited by: 2

Classification of Mobile-Based Oral Cancer Images Using the Vision Transformer and the Swin Transformer
Bofan Song, Dharma Raj KC, Rubin Yuchan Yang et al.

Cancers, Year: 2024, Volume: 16(5), p. 987

Published: Feb. 29, 2024

Oral cancer, a pervasive and rapidly growing malignant disease, poses a significant global health concern. Early and accurate diagnosis is pivotal for improving patient outcomes. Automatic diagnosis methods based on artificial intelligence have shown promising results in the oral cancer field, but their accuracy still needs to be improved for realistic diagnostic scenarios. Vision Transformers (ViT) have recently outperformed deep learning CNN models in many computer vision benchmark tasks. This study explores the effectiveness of the Vision Transformer and the Swin Transformer, two cutting-edge variants of the transformer architecture, for the mobile-based oral cancer image classification application. The pre-trained Swin Transformer model achieved 88.7% accuracy in the binary classification task, outperforming the ViT model by 2.3%, while the conventional convolutional network models VGG19 and ResNet50 achieved 85.2% and 84.5% accuracy. Our experiments demonstrate that these transformer-based architectures outperform traditional convolutional neural networks in terms of oral cancer image classification, and underscore the potential of the ViT and the Swin Transformer in advancing the state of the art in oral cancer image analysis.

Language: English

Cited by: 16

Water body extraction from high spatial resolution remote sensing images based on enhanced U-Net and multi-scale information fusion
Huidong Cao, Yanbing Tian, Yanli Liu et al.

Scientific Reports, Year: 2024, Volume: 14(1)

Published: July 12, 2024

Employing deep learning techniques for the semantic segmentation of remote sensing images has emerged as a prevalent approach for acquiring information about water bodies. Yet current models frequently fall short in accurately extracting water bodies from high-resolution remote sensing images, as these images often present intricate details of terrestrial objects and complex backgrounds. Vegetation, shadows, and other objects close to water boundaries have increased similarity to water bodies. Moreover, water bodies in high-resolution images have different boundary complexities, shapes, and sizes. This situation makes it somewhat challenging to distinguish water bodies in remote sensing images. To overcome these difficulties, this paper presents a novel network model named EU-Net, specifically designed to extract water bodies from high-resolution remote sensing images. The proposed EU-Net model, with U-Net as the backbone network, incorporates improved residual connections and attention mechanisms, and designs multi-scale dilated convolution and multi-scale feature fusion modules to enhance water body extraction performance in complex scenarios. Specifically, the improved residual connections are introduced to enable the model to learn more complex features; the attention mechanism is employed to improve the model's discriminative ability by focusing on important channels and spatial areas. The implemented multi-scale dilated convolution technique enhances the receptive field while maintaining the same number of parameters. The multi-scale feature fusion module is capable of processing both small-scale details and large-scale structures simultaneously while modeling the context relationships between features at different scales. Experimental results validate the superior performance of EU-Net in accurately identifying water bodies from high-resolution remote sensing images, outperforming current models in terms of water body extraction accuracy.

Language: English

Cited by: 13
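
The EU-Net abstract above states that dilated convolution enlarges the receptive field while keeping the parameter count unchanged. The following is a minimal sketch of that general property, not the authors' implementation: the function names, the 3x3 kernel, the 64 channels, and the dilation rates are all assumptions chosen for illustration.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span (in input pixels) covered by one dilated kernel along one axis."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def conv2d_param_count(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights plus biases of a square 2D convolution; dilation does not appear."""
    return kernel_size * kernel_size * in_channels * out_channels + out_channels


# Hypothetical dilation rates and channel widths, for illustration only.
for dilation in (1, 2, 4):
    span = effective_kernel_size(3, dilation)
    params = conv2d_param_count(3, 64, 64)
    print(f"dilation={dilation}: span={span}x{span}, parameters={params}")
```

The span grows with the dilation rate because the kernel taps are spread apart, while the number of taps, and therefore the number of weights, stays fixed.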

RGB-Angle-Wheel: A new data augmentation method for deep learning models
Cüneyt Özdemir, Yahya Doğan, Yılmaz Kaya et al.

Knowledge-Based Systems, Year: 2024, Volume: 291, p. 111615

Published: March 5, 2024

Language: English

Cited by: 12

An efficient frequency domain fusion network of infrared and visible images
Chenwu Wang, Junsheng Wu, Aiqing Fang et al.

Engineering Applications of Artificial Intelligence, Year: 2024, Volume: 133, p. 108013

Published: Feb. 5, 2024

Language: English

Cited by: 10

Transformers for Neuroimage Segmentation: Scoping Review
Maya Iratni, Ahmad Shahidan Abdullah, Mariam Aldhaheri et al.

Journal of Medical Internet Research, Year: 2025, Volume: 27, p. e57723

Published: Jan. 29, 2025

Background: Neuroimaging segmentation is increasingly important for diagnosing and planning treatments for neurological diseases. Manual segmentation is time-consuming, apart from being prone to human error and variability. Transformers are a promising deep learning approach for automated medical image segmentation.
Objective: This scoping review will synthesize current literature and assess the use of various transformer models for neuroimaging segmentation.
Methods: A systematic search in major databases, including Scopus, IEEE Xplore, PubMed, and ACM Digital Library, was carried out for studies applying transformers to neuroimaging segmentation problems from 2019 through 2023. The inclusion criteria allow only peer-reviewed journal papers and conference papers focused on transformer-based segmentation of human brain imaging data. Excluded were studies dealing with nonneuroimaging data or raw signals such as electroencephalogram data. Data extraction was performed to identify key study details, including image modalities, datasets, neurological conditions, transformer models, and evaluation metrics. Results were synthesized using a narrative approach.
Results: Of the 1246 publications identified, 67 (5.38%) met the inclusion criteria. Half of all included studies were published in 2022, and more than two-thirds used transformers for segmenting brain tumors. The most common imaging modality was magnetic resonance imaging (n=59, 88.06%), while the most frequently used dataset was the brain tumor segmentation dataset (n=39, 58.21%). 3D transformer models (n=42, 62.69%) were more prevalent than their 2D counterparts. The most developed models were those with hybrid convolutional neural network-transformer architectures (n=57, 85.07%), where the vision transformer was the most frequently used transformer type (n=37, 55.22%). The most frequent evaluation metric was the Dice score (n=63, 94.03%). Studies generally reported increased segmentation accuracy and the ability to model both local and global features in brain images.
Conclusions: This review represents the recent increase in the adoption of transformers for neuroimaging segmentation, particularly for brain tumor detection. Currently, transformer-based segmentation models achieve state-of-the-art performances on benchmark datasets over standalone models. Nevertheless, their applicability remains highly limited by high computational costs and potential overfitting on small datasets. The heavy reliance of the field on the brain tumor segmentation dataset hints at the need for a more diverse set of datasets to validate the performances of models on a variety of neurological diseases. Further research is needed to define optimal training methods for clinical applications. Continuing development may make transformer-based neuroimaging segmentation fast, accurate, and reliable, which could lead to improved clinical tools for diagnosing and evaluating neurological disorders.

Language: English

Cited by: 2
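
The scoping review above reports the Dice score as the most frequent evaluation metric (n=63, 94.03%). As a minimal sketch of how that metric is commonly computed for binary masks, assuming flattened 0/1 sequences and the convention that two empty masks score 1.0 (the function name and that convention are assumptions, not taken from the review):

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flattened binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Count positions where both masks are foreground.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground size of both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


# Toy masks, for illustration only.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

A score of 1.0 means the predicted and reference masks coincide; 0.0 means they share no foreground pixels.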

Automated Foveal Avascular Zone Segmentation in Optical Coherence Tomography Angiography Across Multiple Eye Diseases Using Knowledge Distillation
Peter Racioppo, Aya Alhasany, Nhu‐An Pham et al.

Bioengineering, Year: 2025, Volume: 12(4), p. 334

Published: March 23, 2025

Optical coherence tomography angiography (OCTA) is a noninvasive imaging technique used to visualize retinal blood flow and identify changes in vascular density and enlargement or distortion of the foveal avascular zone (FAZ), which are indicators of various eye diseases. Although several automated FAZ detection and segmentation algorithms have been developed for use with OCTA, their performance can vary significantly due to differences in data accessibility of OCTA in different retinal pathologies, and differences in image quality in different subjects and/or different OCTA devices. For example, data from subjects with direct macular damage, such as in age-related macular degeneration (AMD), are more readily available in eye clinics, while data on macular damage due to systemic diseases like Alzheimer's disease are often less accessible; data from healthy subjects may have better OCTA quality than data from subjects with ophthalmic pathologies. Typically, segmentation algorithms make use of convolutional neural networks and, more recently, vision transformers, which make use of both long-range context and fine-grained detail. However, transformers are known to be data-hungry and may overfit small datasets, such as those common for FAZ segmentation in OCTA, to which there is limited access in clinical practice. To improve model generalization in low-data or imbalanced settings, we propose a multi-condition transformer-based architecture that uses four teacher encoders to distill knowledge into a shared base model, enabling the transfer of learned features across multiple datasets. These include intra-modality distillation using OCTA datasets from four ocular conditions: healthy aging eyes, Alzheimer's disease, AMD, and diabetic retinopathy; and inter-modality distillation incorporating color fundus photographs of subjects undergoing laser photocoagulation therapy. Our multi-condition model achieved a mean Dice Index of 83.8% with pretraining, outperforming single-condition models (mean of 83.1%) across all conditions. Pretraining on color fundus images improved the average Dice Index by a small margin on all conditions except AMD (1.1% for single-condition models, 0.1% for multi-condition models). Our architecture demonstrates potential for broader applications in detecting and analyzing ophthalmic and systemic diseases across diverse imaging datasets and settings.

Language: English

Cited by: 1

Development and challenges of object detection: A survey
Zonghui Li, Yongsheng Dong, Longchao Shen et al.

Neurocomputing, Year: 2024, Volume: 598, p. 128102

Published: June 22, 2024

Language: English

Cited by: 8

Efficient and robust phase unwrapping method based on SFNet
Ziheng Zhang, Xiaoxu Wang, Chengxiu Liu et al.

Optics Express, Year: 2024, Volume: 32(9), p. 15410

Published: March 21, 2024

Phase unwrapping is a crucial step in obtaining the final physical information in the field of optical metrology. Although good at dealing with phase discontinuity and noise, most deep learning-based spatial phase unwrapping methods suffer from a complex model and unsatisfactory performance, partially due to the simple noise types in training datasets and limited interpretability. This paper proposes a highly efficient and robust spatial phase unwrapping method based on an improved SegFormer network, SFNet. The SFNet structure uses a hierarchical encoder without positional encoding and a decoder based on a lightweight fully connected multilayer perceptron. The proposed method utilizes the self-attention mechanism of the Transformer to better capture the global relationship of phase changes and reduce errors in the phase unwrapping process. It has a lower parameter count, speeding up the phase unwrapping. The network is trained on a simulated dataset containing various types of noise and phase discontinuity. This paper compares the proposed method with several state-of-the-art deep learning-based and traditional methods in terms of important evaluation indices, such as RMSE and PFS, highlighting its structural stability, robustness to noise, and generalization.

Language: English

Cited by: 7