
arXiv (Cornell University), Journal year: 2022, Issue: unknown
Published: Jan. 1, 2022
Assessing the critical view of safety (CVS) in laparoscopic cholecystectomy requires accurate identification and localization of key anatomical structures, reasoning about their geometric relationships to one another, and determining the quality of their exposure. Prior works have approached this task by including semantic segmentation as an intermediate step, using predicted segmentation masks to then predict the CVS. While these methods are effective, they rely on extremely expensive ground-truth segmentation annotations and tend to fail when the predicted segmentation is incorrect, limiting generalization. In this work, we propose a method for CVS prediction wherein we first represent a surgical image using a disentangled latent scene graph, then process this representation using a graph neural network. Our graph representations explicitly encode semantic information - object location, class information, geometric relations - to improve anatomy-driven reasoning, as well as visual features to retain differentiability and thereby provide robustness to semantic errors. Finally, to address annotation cost, we propose to train our method using only bounding box annotations, incorporating an auxiliary image reconstruction objective to learn fine-grained object boundaries. We show that our method not only outperforms several baseline methods when trained with bounding box annotations, but also scales effectively when trained with segmentation masks, maintaining state-of-the-art performance.
Language: English