Surgical Tattoos in Infrared: A Dataset for Quantifying Tissue Tracking and Mapping DOI
Adam Schmidt, Omid Mohareri, Simon DiMaio

et al.

IEEE Transactions on Medical Imaging, Journal Year: 2024, Volume and Issue: 43(7), P. 2634 - 2645

Published: March 4, 2024

Quantifying the performance of methods for tracking and mapping tissue in endoscopic environments is essential for enabling image guidance and automation of medical interventions and surgery. Datasets developed so far either use rigid environments, visible markers, or require annotators to label salient points in videos after collection. These are respectively: not general, visible to algorithms, and costly and error-prone. We introduce a novel labeling methodology along with a dataset that uses said methodology, Surgical Tattoos in Infrared (STIR). STIR has labels that are persistent but invisible to visible-spectrum algorithms. This is done by labelling tissue points with IR-fluorescent dye, indocyanine green (ICG), and then collecting visible-light video clips. STIR comprises hundreds of stereo video clips in both in vivo and ex vivo scenes with start and end points labelled in the IR spectrum. With over 3,000 labelled points, STIR will help quantify and enable better analysis of tracking and mapping methods. After introducing STIR, we analyze multiple different frame-based tracking methods on STIR using both 3D and 2D endpoint error and accuracy metrics. The dataset is available at https://dx.doi.org/10.21227/w8g4-g548.
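The 2D and 3D endpoint error and accuracy metrics named in the abstract are simple to state directly. A minimal sketch, not the authors' released evaluation code; the array shapes and the 4 px threshold below are illustrative assumptions:

```python
import numpy as np

def endpoint_errors(pred_pts, gt_pts):
    """Per-point Euclidean distance between tracked endpoints and the
    ground-truth endpoints (works for 2D pixel or 3D metric points)."""
    return np.linalg.norm(pred_pts - gt_pts, axis=-1)

def accuracy_within(pred_pts, gt_pts, threshold):
    """Fraction of points whose endpoint error falls below a threshold
    (e.g. a pixel radius in 2D or millimetres in 3D)."""
    return float(np.mean(endpoint_errors(pred_pts, gt_pts) < threshold))

# Illustrative usage: four tracked 2D points vs. their IR-derived labels.
pred = np.array([[10.0, 12.0], [30.5, 40.0], [55.0, 60.0], [70.0, 81.0]])
gt = np.array([[11.0, 12.0], [30.0, 42.0], [54.0, 61.0], [75.0, 80.0]])
print(endpoint_errors(pred, gt))       # per-point errors in pixels
print(accuracy_within(pred, gt, 4.0))  # share of points within 4 px
```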

A vision transformer for decoding surgeon activity from surgical videos DOI Creative Commons
Dani Kiyasseh, Runzhuo Ma, Taseen F. Haque

et al.

Nature Biomedical Engineering, Journal Year: 2023, Volume and Issue: 7(6), P. 780 - 796

Published: March 30, 2023

Abstract The intraoperative activity of a surgeon has substantial impact on postoperative outcomes. However, for most surgical procedures, the details of intraoperative surgical actions, which can vary widely, are not well understood. Here we report a machine learning system leveraging a vision transformer and supervised contrastive learning for decoding elements of intraoperative surgical activity from videos commonly collected during robotic surgeries. The system accurately identified surgical steps, actions performed by the surgeon, the quality of these actions, and the relative contribution of individual video frames to the decoding of actions. Through extensive testing on data from three different hospitals located in two different continents, we show that the system generalizes across videos, surgeons, and hospitals, and that it can provide information on surgical gestures and skills from unannotated videos. Decoding intraoperative activity via accurate machine learning systems could be used to provide surgeons with feedback on their operating skills, and may allow for the identification of optimal surgical behaviour and for the study of relationships between intraoperative factors and patient outcomes.
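The supervised contrastive component mentioned in the abstract is commonly implemented as the SupCon objective (Khosla et al., 2020). A minimal PyTorch sketch under that assumption; the batch of clip embeddings and gesture labels are illustrative, and this is not the paper's released code:

```python
import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """SupCon loss: pull embeddings with the same label together and
    push embeddings with different labels apart."""
    z = F.normalize(features, dim=1)                 # unit-norm embeddings
    sim = z @ z.T / temperature                      # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))  # drop self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    log_prob = log_prob.masked_fill(self_mask, 0.0)  # avoid -inf * 0 = nan
    pos_counts = pos_mask.sum(1)
    valid = pos_counts > 0                           # anchors with positives
    loss = -(log_prob * pos_mask).sum(1)[valid] / pos_counts[valid]
    return loss.mean()

# Illustrative usage: 8 clip embeddings spanning 3 gesture classes.
feats = torch.randn(8, 128)
labels = torch.tensor([0, 1, 2, 0, 1, 2, 0, 1])
print(supervised_contrastive_loss(feats, labels))
```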

Citations: 69

Artificial intelligence and automation in endoscopy and surgery DOI
François Chadebecq, Laurence Lovat, Danail Stoyanov

et al.

Nature Reviews Gastroenterology & Hepatology, Journal Year: 2022, Volume and Issue: 20(3), P. 171 - 182

Published: Nov. 9, 2022

Citations: 65

Deep Learning Based Image Processing for Robot Assisted Surgery: A Systematic Literature Survey DOI Creative Commons
Sardar Mehboob Hussain, Antonio Brunetti, Giuseppe Lucarelli

et al.

IEEE Access, Journal Year: 2022, Volume and Issue: 10, P. 122627 - 122657

Published: Jan. 1, 2022

The recent advancements in the surging field of Deep Learning (DL) have revolutionized every sphere of life, and the healthcare domain is no exception. The enormous success of DL models, particularly with image data, has led to the development of image-guided Robot Assisted Surgery (RAS) systems. By and large, the number of studies concerning image-driven computer assisted surgical systems using DL has increased exponentially. Additionally, the contemporary availability of datasets has also boosted the applications of DL in RAS. Inspired by the latest trends and contributions of DL in surgery, this literature survey presents a summarized analysis of the latest innovations of DL in RAS. After a thorough review, a sum of 184 articles are selected and grouped into four categories, based on the relevancy of the task addressed in the articles, comprising 1) Surgical Tools, 2) Surgical Processes, 3) Surveillance, and 4) Performance. The survey also discusses the publicly available datasets and highlights the basics of DL models. Furthermore, legal, ethical, and technological challenges, together with intuitive predictions and recommendations related to autonomous RAS, are presented. The study reveals that the Convolutional Neural Network (CNN) is the most widely adopted architecture, whereas JIGSAWS is the most employed dataset. The review suggests fusing kinematic data along with image data, which produces better accuracy and precision, particularly for gesture and trajectory segmentation tasks. CNN and Long Short Term Memory (LSTM) networks have shown remarkable performance; however, the authors recommend employing these gigantic architectures only when simpler models fail to produce satisfactory results, since, despite their limitations, the simpler models are time and cost effective and yield considerable outcomes even on smaller datasets.

Citations: 25

Sports game teaching and high precision sports training system based on virtual reality technology DOI
Yang Pan

Entertainment Computing, Journal Year: 2024, Volume and Issue: 50, P. 100662 - 100662

Published: March 24, 2024

Citations: 6

Multimodal vision-based human action recognition using deep learning: a review DOI Creative Commons
Fatemeh Shafizadegan, Ahmad Reza Naghsh‐Nilchi, Elham Shabaninia

et al.

Artificial Intelligence Review, Journal Year: 2024, Volume and Issue: 57(7)

Published: June 19, 2024

Abstract Vision-based Human Action Recognition (HAR) is a hot topic in computer vision. Recently, deep-based HAR has shown promising results. HAR using a single data modality is a common approach; however, the fusion of different data sources essentially conveys complementary information and improves the results. This paper comprehensively reviews deep-based HAR methods using multiple visual data modalities. The main contribution of this paper is categorizing existing methods into four levels, which provides an in-depth and comparable analysis of approaches in various aspects. At the first level, proposed methods are categorized based on the employed modalities. At the second level, methods are classified by the employment of complete modalities or working with missing modalities at test time. At the third level, methods are classified based on existing branches of approaches. Finally, similar frameworks in each category are grouped together. In addition, a comprehensive comparison is provided for publicly available benchmark datasets, which helps to compare and choose suitable datasets for a task and to develop new datasets. The paper also compares the performance of state-of-the-art methods on benchmark datasets. The review concludes by highlighting several future directions.

Citations: 5

ASPnet: Action Segmentation with Shared-Private Representation of Multiple Data Sources DOI
Beatrice van Amsterdam, Abdolrahim Kadkhodamohammadi, Imanol Luengo

et al.

2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Journal Year: 2023, Volume and Issue: unknown, P. 2384 - 2393

Published: June 1, 2023

Most state-of-the-art methods for action segmentation are based on single input modalities or naïve fusion of multiple data sources. However, effective fusion of complementary information can potentially strengthen segmentation models and make them more robust to sensor noise and more accurate with smaller training datasets. In order to improve multimodal representation learning for action segmentation, we propose to disentangle hidden features of a multi-stream segmentation model into modality-shared components, containing common information across data sources, and modality-private components; we then use an attention bottleneck to capture long-range temporal dependencies in the data while preserving disentanglement in consecutive processing layers. Evaluation on the 50salads, Breakfast and RARP45 datasets shows that our multimodal approach outperforms different data fusion baselines on both multiview and multimodal data sources, obtaining competitive or better results compared with the state-of-the-art. Our model is also additive and can achieve performance on par with strong video baselines even with less data.
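A toy sketch of the shared-private split described above, assuming two streams (video features and kinematics); the layer sizes and the simplified alignment and orthogonality penalties are illustrative stand-ins, not the authors' implementation:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedPrivateEncoder(nn.Module):
    """Projects each modality into a shared subspace (aligned across
    modalities) and a private subspace (modality-specific residue)."""
    def __init__(self, dim_video, dim_kin, dim_hidden=64):
        super().__init__()
        self.shared_v = nn.Linear(dim_video, dim_hidden)
        self.shared_k = nn.Linear(dim_kin, dim_hidden)
        self.private_v = nn.Linear(dim_video, dim_hidden)
        self.private_k = nn.Linear(dim_kin, dim_hidden)

    def forward(self, video, kin):
        sv, sk = self.shared_v(video), self.shared_k(kin)
        pv, pk = self.private_v(video), self.private_k(kin)
        align = F.mse_loss(sv, sk)  # shared features of both streams agree
        # crude orthogonality proxy: private should not duplicate shared
        ortho = (sv * pv).mean().abs() + (sk * pk).mean().abs()
        fused = torch.cat([(sv + sk) / 2, pv, pk], dim=-1)
        return fused, align + ortho

# Illustrative usage: batch of 4 frames with 512-d video, 76-d kinematics.
enc = SharedPrivateEncoder(dim_video=512, dim_kin=76)
fused, aux_loss = enc(torch.randn(4, 512), torch.randn(4, 76))
print(fused.shape, aux_loss.item())  # torch.Size([4, 192]) and a scalar
```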

Citations: 11

Systematic review of machine learning applications using nonoptical motion tracking in surgery DOI Creative Commons
Teona Z. Carciumaru, Chuanqing Tang, M. Farsi

et al.

npj Digital Medicine, Journal Year: 2025, Volume and Issue: 8(1)

Published: Jan. 14, 2025

Abstract This systematic review explores machine learning (ML) applications in surgical motion analysis using non-optical motion tracking systems (NOMTS), alone or combined with optical methods. It investigates objectives, experimental designs, model effectiveness, and future research directions. From 3632 records, 84 studies were included, with Artificial Neural Networks (38%) and Support Vector Machines (11%) being the most common ML models. Skill assessment was the primary objective (38%). The NOMTS used included internal device kinematics (56%), electromagnetic (17%), inertial (15%), mechanical (11%), and electromyography (1%) sensors. Surgical settings were robotic (60%), laparoscopic (18%), open (16%), and others (6%). Procedures focused on bench-top tasks (67%), clinical model simulations (9%), and non-clinical simulations (7%). Over 90% accuracy was achieved in 36% of the studies. The literature shows that ML can enhance surgical precision, assessment, and training. Future research should advance ML applications in clinical environments, ensure model interpretability and reproducibility, and use larger datasets for accurate evaluation.

Citations: 0

Graph Convolutional Networks for multi-modal robotic martial arts leg pose recognition DOI Creative Commons
Shun Yao, Yihan Ping, Xiaoyu Yue

et al.

Frontiers in Neurorobotics, Journal Year: 2025, Volume and Issue: 18

Published: Jan. 20, 2025

Introduction Accurate recognition of martial arts leg poses is essential for applications in sports analytics, rehabilitation, and human-computer interaction. Traditional pose recognition models, relying on sequential or convolutional approaches, often struggle to capture the complex spatial-temporal dependencies inherent in martial arts movements. These methods lack the ability to effectively model the nuanced dynamics of joint interactions and temporal progression, leading to limited generalization in recognizing complex actions. Methods To address these challenges, we propose PoseGCN, a Graph Convolutional Network (GCN)-based model that integrates spatial, temporal, and contextual features through a novel framework. PoseGCN leverages graph encoding of motion dynamics, an action-specific attention mechanism to assign importance to relevant joints depending on the action context, and a self-supervised pretext task to enhance temporal robustness and continuity. Experimental results on four benchmark datasets (Kinetics-700, Human3.6M, NTU RGB+D, and UTD-MHAD) demonstrate that PoseGCN outperforms existing models, achieving state-of-the-art accuracy and F1 scores. Results and discussion These findings highlight the model's capacity to generalize across diverse datasets and capture fine-grained pose details, showcasing its potential for advancing pose recognition tasks. The proposed framework offers a robust solution for precise action recognition and paves the way for future developments in multi-modal pose analysis.
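The graph-convolutional building block behind models like PoseGCN can be illustrated with a single Kipf-style GCN layer over a skeleton adjacency matrix; the 5-joint leg chain and feature sizes below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def gcn_layer(X, A, W):
    """One graph convolution: the normalized adjacency aggregates each
    joint's neighbours, then a shared weight matrix mixes features."""
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    A_norm = d_inv_sqrt @ A_hat @ d_inv_sqrt  # symmetric normalization
    return np.maximum(A_norm @ X @ W, 0.0)    # ReLU activation

# Illustrative 5-joint leg chain: hip - knee - ankle - heel - toe.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1.0
X = np.random.randn(5, 3)        # per-joint 3D coordinates
W = np.random.randn(3, 16)       # learnable feature weights
print(gcn_layer(X, A, W).shape)  # (5, 16)
```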

Citations: 0

Automated Feedback System for Surgical Skill Improvement in Endoscopic Sinus Surgery DOI
Tomoko Yamaguchi, Ryoichi Nakamura, Akihito Kuboki

et al.

Lecture notes in computer science, Journal Year: 2025, Volume and Issue: unknown, P. 151 - 161

Published: Jan. 1, 2025

Citations: 0

Surgical gestures—An emerging field for surgical assessment and training DOI Creative Commons
Runzhuo Ma

UroPrecision, Journal Year: 2025, Volume and Issue: unknown

Published: Feb. 18, 2025

Abstract As surgical training shifts from a traditional method to a more standardized approach, objective analysis and assessment of surgeon performance has become a key focus. Surgical gestures, defined as the smallest independent units of instrument-tissue interaction, offer a quantifiable way to analyze surgical performance. Standardizing the terminology for describing gestures can enhance communication during training and in the operating room. More importantly, gesture usage has been linked to surgeon expertise and has been shown to be associated with patient outcomes. This review examines current gesture classification systems for dissection and suturing tasks across open, laparoscopic, and robotic procedures, which serve as an armamentarium for surgeons. It also explores how gesture analysis can complement conventional assessment tools. Finally, it reviews artificial intelligence models for gesture recognition and automation, and envisions a future where gesture analysis forms the foundation of surgical assessment and assistance.

Citations: 0