TY - GEN
T1 - Surgical tool attributes from monocular video
AU - Kumar, Suren
AU - Narayanan, Madusudanan Sathia
AU - Singhal, Pankaj
AU - Corso, Jason J.
AU - Krovi, Venkat
N1 - Publisher Copyright:
© 2014 IEEE.
PY - 2014/9/22
Y1 - 2014/9/22
N2 - HD video from the (monocular or binocular) endoscopic camera provides a rich real-time sensing channel from the surgical site to the surgeon console in various Minimally Invasive Surgery (MIS) procedures. A real-time framework for video understanding is therefore critical for tapping into the rich information content provided by this non-invasive and well-established digital endoscopic video-streaming modality. While contemporary research focuses on enhancing aspects such as tool tracking within these challenging visual scenes, we consider the associated problem of using that rich (but often compromised) streaming visual data to discover the underlying semantic attributes of the tools. Directly analyzing the surgical videos to extract such attributes online can aid decision-making and feedback. We propose a novel probabilistic attribute-labelling framework with Bayesian filtering that identifies associated semantics (open/closed, stained with blood, etc.) in order to ultimately provide semantic feedback to the surgeon. Our robust video-understanding framework overcomes many of the challenges (tissue deformations, image specularities, clutter, tool occlusion by blood and/or organs) encountered under realistic in-vivo surgical conditions. Specifically, this manuscript presents a rigorous experimental analysis of the resulting method with varying parameters and different visual features on a data corpus of real surgical procedures performed on patients with the da Vinci Surgical System [9].
AB - HD video from the (monocular or binocular) endoscopic camera provides a rich real-time sensing channel from the surgical site to the surgeon console in various Minimally Invasive Surgery (MIS) procedures. A real-time framework for video understanding is therefore critical for tapping into the rich information content provided by this non-invasive and well-established digital endoscopic video-streaming modality. While contemporary research focuses on enhancing aspects such as tool tracking within these challenging visual scenes, we consider the associated problem of using that rich (but often compromised) streaming visual data to discover the underlying semantic attributes of the tools. Directly analyzing the surgical videos to extract such attributes online can aid decision-making and feedback. We propose a novel probabilistic attribute-labelling framework with Bayesian filtering that identifies associated semantics (open/closed, stained with blood, etc.) in order to ultimately provide semantic feedback to the surgeon. Our robust video-understanding framework overcomes many of the challenges (tissue deformations, image specularities, clutter, tool occlusion by blood and/or organs) encountered under realistic in-vivo surgical conditions. Specifically, this manuscript presents a rigorous experimental analysis of the resulting method with varying parameters and different visual features on a data corpus of real surgical procedures performed on patients with the da Vinci Surgical System [9].
UR - http://www.scopus.com/inward/record.url?scp=84929190637&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84929190637&partnerID=8YFLogxK
U2 - 10.1109/ICRA.2014.6907575
DO - 10.1109/ICRA.2014.6907575
M3 - Conference contribution
AN - SCOPUS:84929190637
T3 - Proceedings - IEEE International Conference on Robotics and Automation
SP - 4887
EP - 4892
BT - Proceedings - IEEE International Conference on Robotics and Automation
T2 - 2014 IEEE International Conference on Robotics and Automation, ICRA 2014
Y2 - 31 May 2014 through 7 June 2014
ER -