Monocular 3D Tooltip Tracking in Robotic Surgery—Building a Multi-Stage Pipeline
Sanjeev Narasimhan, Mehmet Kerem Türkcan, Mattia Ballo, et al.

Electronics, Year: 2025, Volume and Issue: 14(10), P. 2075

Published: May 20, 2025

Tracking the precise movement of surgical tools is essential for enabling automated analysis, providing feedback, and enhancing safety in robotic-assisted surgery. Accurate 3D tracking of tooltips is challenging to implement when using monocular videos due to the complexity of extracting depth information. We propose a pipeline that combines state-of-the-art foundation models, Florence2 and Segment Anything 2 (SAM2), for zero-shot 2D localization of tooltip coordinates from video input. Localization predictions are refined through supervised training of a YOLOv11 segmentation model to enable real-time applications. The depth estimation model Metric3D computes relative depth and provides camera coordinates, which are subsequently transformed into world coordinates via a linear model estimating rotation and translation parameters. An experimental evaluation on the JIGSAWS Suturing Kinematic dataset achieves Average Jaccard scores of 84.5 and 91.2 for the zero-shot and supervised approaches, respectively. The results validate the effectiveness of our approach and its potential to enhance guidance and assessment in such procedures.
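
The last stage of the pipeline, mapping camera coordinates to world coordinates with a linear model, can be illustrated with a small least-squares fit. This is only a sketch under stated assumptions and not the authors' implementation: the function names `fit_camera_to_world` and `camera_to_world` are hypothetical, the data are synthetic, and the fitted 3x3 matrix is not constrained to be a true rotation.

```python
import numpy as np


def fit_camera_to_world(cam_pts, world_pts):
    """Least-squares fit of world ≈ cam @ A.T + t (hypothetical helper)."""
    cam_pts = np.asarray(cam_pts, dtype=float)
    world_pts = np.asarray(world_pts, dtype=float)
    # Append a column of ones so the translation is estimated jointly.
    X = np.hstack([cam_pts, np.ones((cam_pts.shape[0], 1))])
    # Solve X @ P ≈ world_pts for P (shape 4x3) in the least-squares sense.
    P, *_ = np.linalg.lstsq(X, world_pts, rcond=None)
    A = P[:3].T  # 3x3 linear part (rotation-like, not forced to be orthonormal)
    t = P[3]     # translation vector
    return A, t


def camera_to_world(cam_pts, A, t):
    """Apply the fitted linear map to camera-frame points."""
    return np.asarray(cam_pts, dtype=float) @ A.T + t


# Synthetic example: a known rotation about z plus a translation (assumed values).
rng = np.random.default_rng(0)
theta = np.deg2rad(30.0)
R_true = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
t_true = np.array([0.1, -0.2, 0.05])

cam = rng.normal(size=(50, 3))   # stand-in for tooltip camera coordinates
world = cam @ R_true.T + t_true  # stand-in for kinematic ground truth

A_est, t_est = fit_camera_to_world(cam, world)
world_pred = camera_to_world(cam, A_est, t_est)
```

With noise-free synthetic data the fit recovers the rotation and translation that generated it. With real depth estimates the unconstrained matrix also absorbs scale and skew errors, which may or may not be desirable depending on the application.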

Language: English
