FETrack: Feature-Enhanced Transformer Network for Visual Object Tracking
Huan Liu, Detian Huang, Mingxin Lin et al.

Applied Sciences, Journal Year: 2024, Volume and Issue: 14(22), P. 10589 - 10589

Published: Nov. 17, 2024

Visual object tracking is a fundamental task in computer vision, with applications ranging from video surveillance to autonomous driving. Despite recent advances in transformer-based one-stream trackers, unrestricted feature interactions between the template and the search region often introduce background noise into the template, degrading performance. To address this issue, we propose FETrack, a feature-enhanced transformer network for visual tracking. Specifically, we incorporate an independent template stream into the encoder of the tracker to acquire high-quality template features while effectively suppressing harmful background noise. Then, we employ a sequence-learning-based causal transformer in the decoder to generate the bounding box autoregressively, simplifying the prediction-head network. Further, we present a dynamic threshold-based online template-updating strategy and a template-filtering approach to boost tracking robustness and reduce redundant computations. Extensive experiments demonstrate that our FETrack achieves superior performance over state-of-the-art trackers. The proposed FETrack achieves 75.1% AO on GOT-10k, 81.2% AUC on LaSOT, and 89.3% Pnorm on TrackingNet.
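The abstract describes generating the bounding box autoregressively with a causal decoder but gives no implementation details. As a minimal sketch of the general sequence-learning idea (Pix2Seq-style coordinate discretization; the bin count and token layout here are assumptions, not FETrack's actual design), a box can be mapped to and from discrete coordinate tokens:

```python
import numpy as np

# Hypothetical vocabulary size for discretized coordinates (not specified in the paper).
NUM_BINS = 1000

def box_to_tokens(box, img_size):
    """Quantize an (x1, y1, x2, y2) box into integer tokens in [0, NUM_BINS - 1]."""
    w, h = img_size
    scale = np.array([w, h, w, h], dtype=np.float64)
    norm = np.clip(np.asarray(box, dtype=np.float64) / scale, 0.0, 1.0)
    return (norm * (NUM_BINS - 1)).round().astype(int).tolist()

def tokens_to_box(tokens, img_size):
    """Invert the quantization back to continuous pixel coordinates."""
    w, h = img_size
    scale = np.array([w, h, w, h], dtype=np.float64)
    return (np.asarray(tokens, dtype=np.float64) / (NUM_BINS - 1) * scale).tolist()

box = [100.0, 50.0, 300.0, 200.0]
tokens = box_to_tokens(box, (640, 480))       # four integer tokens, one per coordinate
recovered = tokens_to_box(tokens, (640, 480))  # approximately the original box
```

An autoregressive decoder would then predict these four tokens one at a time, conditioned on the encoder's template and search-region features.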

Language: English

High-Accuracy Intermittent Strabismus Screening via Wearable Eye-Tracking and AI-Enhanced Ocular Feature Analysis

Zihe Zhao, Hongbei Meng, Shangru Li et al.

Biosensors, Journal Year: 2025, Volume and Issue: 15(2), P. 110 - 110

Published: Feb. 14, 2025

An effective and highly accurate strabismus screening method is expected to identify potential patients, provide timely treatment, and prevent further deterioration, such as amblyopia or even permanent vision loss. To satisfy this need, this work showcases a novel screening method based on a wearable eye-tracking device combined with an artificial intelligence (AI) algorithm. To detect the minor and occasional inconsistencies during the binocular coordination process, which are usually seen in early-stage patients and rarely recognized in current studies, the system captures temporally and spatially continuous high-definition infrared images of the eye during wide-angle motion, inducing intermittent strabismus. Based on the collected eye-motion information, 16 features of the oculomotor process with strong physiological interpretations, which help biomedical staff understand and evaluate the results generated later, are calculated through the introduction of pupil-canthus vectors. These features can be normalized to reflect individual differences. After these features are processed by a random forest (RF) algorithm, the method experimentally yields 97.1% accuracy in strabismus detection among 70 people under diverse indoor testing conditions, validating the high robustness of the method and implying that it has the potential to support widespread screening.
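As a hedged illustration of the final classification stage only (the 16 pupil-canthus-vector features themselves are not reproduced here; the data below are synthetic stand-ins, not the study's measurements), a random forest can be cross-validated on a 70-subject, 16-feature matrix like so:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for the paper's data: 70 subjects x 16 oculomotor features.
n_subjects, n_features = 70, 16
X = rng.normal(size=(n_subjects, n_features))
y = rng.integers(0, 2, size=n_subjects)   # 0 = control, 1 = strabismus (toy labels)
X[y == 1, :4] += 1.5                      # inject a separable signal for the demo

clf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validated accuracy
```

On real data the features would be the normalized pupil-canthus-vector statistics described in the paper rather than random draws.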

Language: English

Citations

0

Prediction of Radiological Diagnostic Errors from Eye Tracking Data Using Graph Neural Networks and Gaze-Guided Transformers
Anna Anikina, Reza Karimzadeh, Д. З. Ибрагимова et al.

Lecture notes in computer science, Journal Year: 2025, Volume and Issue: unknown, P. 33 - 42

Published: Jan. 1, 2025

Language: English

Citations

0

Enhancing colorectal polyp classification using gaze-based attention networks
Zhenghao Guo, Yanyan Hu, Peixuan Ge et al.

PeerJ Computer Science, Journal Year: 2025, Volume and Issue: 11, P. e2780 - e2780

Published: March 25, 2025

Colorectal polyps are potential precursor lesions of colorectal cancer. Accurate classification of polyps during endoscopy is crucial for early diagnosis and effective treatment. Automatic and accurate polyp classification based on convolutional neural networks (CNNs) is vital for assisting endoscopists. However, this task remains challenging due to difficulties in the data acquisition and annotation processes, poor interpretability of the output, and a lack of widespread acceptance of CNN models by clinicians. This study proposes an innovative approach that utilizes gaze-attention information from endoscopists as an auxiliary supervisory signal to train a CNN-based model for classifying polyps. Gaze information from reading endoscopic images was first recorded through an eye-tracker. Then, the processed gaze information was applied to supervise the model's attention via a consistency module. Comprehensive experiments were conducted on a dataset that contained three types of polyps. The results showed that EfficientNet_b1 with gaze-supervised attention achieved an overall test accuracy of 86.96%, a precision of 87.92%, a recall of 88.41%, an F1 score of 88.16%, and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.9022. All evaluation metrics surpassed those of the model without gaze supervision. The class activation maps generated by the proposed network also indicate that the endoscopist's gaze-attention information, as prior knowledge, increases the accuracy of polyp classification, offering a new solution for the field of medical image analysis.
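The abstract does not specify the form of the consistency module; a minimal sketch of one plausible realization, an MSE penalty between the normalized model attention map and the normalized gaze heatmap (all names and shapes here are illustrative assumptions), could look like:

```python
import numpy as np

def _normalize(m):
    """Normalize a non-negative map so it sums to 1 (a spatial distribution)."""
    m = np.asarray(m, dtype=np.float64)
    return m / (m.sum() + 1e-8)

def gaze_consistency_loss(model_attention, gaze_heatmap):
    """MSE between the normalized CNN attention map and the normalized
    gaze heatmap; would be added to the classification loss during training."""
    a = _normalize(model_attention)
    g = _normalize(gaze_heatmap)
    return float(np.mean((a - g) ** 2))

attn = np.random.rand(7, 7)   # e.g. a CAM from the CNN's last conv stage
gaze = np.random.rand(7, 7)   # gaze fixation density over the same grid
loss = gaze_consistency_loss(attn, gaze)
perfect = gaze_consistency_loss(gaze, gaze)  # identical maps give zero loss
```

Minimizing such a term pushes the network to attend where the endoscopist looked, which is what the class activation maps in the paper suggest.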

Language: English

Citations

0

Interactively Assisting Glaucoma Diagnosis with an Expert Knowledge-Distilled Vision Transformer
Z. Li, Haowen Wei, Kang Sun et al.

Published: April 23, 2025

Language: English

Citations

0

Gaze-Informed Vision Transformers: Predicting Driving Decisions Under Uncertainty

Sharath Koorathota, Νικόλας Παπαδόπουλος, Jia Li et al.

Published: Oct. 30, 2024

Language: English

Citations

1

Automated Insight Tool: Analyzing Eye Tracking Data of Expert and Novice Radiologists During Optic Disc Detection Task
Aiswariya Milan K, J. Amudha, Gheorghiţă Ghinea et al.

Published: May 31, 2024

In specialized medical research, eye-tracking analysis proves instrumental for monitoring and dissecting eye movements and gaze patterns. This methodology is employed to gain insights into diverse facets of human behavior, cognition, and health. This study analyzes radiologists' gaze patterns as they locate optic discs in retinal fundus images. The underlying premise is that experts possess a more profound understanding of the designated area of interest (AOI) within the images than non-experts. We considered the visual attention distribution and the focus shift between expert and novice radiologists during the viewing task. Identifying expert gaze behavior will benefit us in two main ways: first, it will facilitate the development of an effective training system that assists novices in recognizing their technical proficiency, and second, it will enable automation of the diagnostic process. The proposed study utilizes gaze measures over regions of interest (IOR) to interpret and classify expertise levels and to compare gaze behavior among the different levels. It introduces an automated tool to analyze gaze behaviors within regions of interest, offering comprehensive insights into radiological diagnostic processes.
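As a small illustrative sketch (the AOI coordinates and gaze traces below are synthetic assumptions, not the study's data), one simple expertise-related measure is the fraction of gaze samples that fall inside the optic-disc AOI:

```python
import numpy as np

def aoi_dwell_fraction(gaze_xy, aoi_rect):
    """Fraction of gaze samples falling inside a rectangular AOI.
    gaze_xy: (N, 2) array of (x, y) samples; aoi_rect: (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = aoi_rect
    g = np.asarray(gaze_xy, dtype=np.float64)
    inside = (g[:, 0] >= x1) & (g[:, 0] <= x2) & (g[:, 1] >= y1) & (g[:, 1] <= y2)
    return float(inside.mean())

rng = np.random.default_rng(1)
optic_disc_aoi = (200, 150, 300, 250)  # hypothetical AOI in image pixels

# "Expert-like" trace concentrated near the AOI vs. a diffuse "novice-like" trace.
expert = rng.normal(loc=(250, 200), scale=15, size=(500, 2))
novice = rng.uniform(low=(0, 0), high=(640, 480), size=(500, 2))
expert_frac = aoi_dwell_fraction(expert, optic_disc_aoi)
novice_frac = aoi_dwell_fraction(novice, optic_disc_aoi)
```

Features of this kind, computed per region of interest, are the sort of inputs an expertise classifier in such a tool could consume.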

Language: English

Citations

0

A Proposed Method of Automating Data Processing for Analysing Data Produced from Eye Tracking and Galvanic Skin Response

Javier Sáez-García, María Consuelo Sáiz Manzanares, Raúl Marticorena Sánchez et al.

Computers, Journal Year: 2024, Volume and Issue: 13(11), P. 289 - 289

Published: Nov. 8, 2024

The use of eye-tracking technology, together with other physiological measurements such as the psychogalvanic skin response (GSR) and electroencephalographic (EEG) recordings, provides researchers with information about users' behavioural responses during their learning process in different types of tasks. These devices produce a large volume of data. However, in order to analyse these records, researchers have to process them using complex statistical and/or machine learning techniques (supervised or unsupervised) that are usually not incorporated into the devices. The objectives of this study were (1) to propose a procedure for processing the extracted data; (2) to address potential technical challenges and difficulties in processing logs from integrated multichannel technology; and (3) to offer solutions for automating data analysis. A Jupyter Notebook is proposed with the steps for importing and processing data, as well as for applying supervised and unsupervised algorithms.
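A hedged sketch of such a notebook pipeline (column names, sampling rates, and the clustering choice are assumptions for illustration, not the authors' procedure): align eye-tracking and GSR logs on timestamps, then run an unsupervised algorithm on the merged features:

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Toy stand-ins for exported device logs; real column names/formats will differ.
eye = pd.DataFrame({
    "timestamp": np.arange(0, 10, 0.5),
    "fixation_duration": np.random.default_rng(0).gamma(2.0, 100.0, 20),
})
gsr = pd.DataFrame({
    "timestamp": np.arange(0, 10, 0.5) + 0.1,  # slightly offset device clock
    "gsr_microsiemens": np.random.default_rng(1).normal(5.0, 0.5, 20),
})

# Step 1: align the two channels on the nearest timestamp.
merged = pd.merge_asof(eye.sort_values("timestamp"),
                       gsr.sort_values("timestamp"),
                       on="timestamp", direction="nearest")

# Step 2: unsupervised analysis, e.g. clustering arousal/attention states.
features = merged[["fixation_duration", "gsr_microsiemens"]].to_numpy()
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
```

The same merge-then-model shape extends to EEG channels and to supervised models when labelled outcomes are available.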

Language: English

Citations

0
