TY - GEN
T1 - Estimating human dynamics on-the-fly using monocular video for pose estimation
AU - Agarwal, Priyanshu
AU - Kumar, Suren
AU - Ryde, Julian
AU - Corso, Jason J.
AU - Krovi, Venkat N.
N1 - Publisher Copyright:
© 2013 Massachusetts Institute of Technology.
PY - 2013
Y1 - 2013
N2 - Human pose estimation using uncalibrated monocular visual inputs alone is a challenging problem for both the computer vision and robotics communities. From the robotics perspective, the challenge here is one of pose estimation of a multiply-articulated system of bodies using a single nonspecialized environmental sensor (the camera), and thereby creating low-order surrogate computational models for analysis and control. In this work, we propose a technique for estimating the lower-limb dynamics of a human solely based on captured behavior using an uncalibrated monocular video camera. We leverage our previously developed framework for human pose estimation to (i) deduce the correct sequence of temporally coherent gap-filled pose estimates, (ii) estimate physical parameters, employing a dynamics model incorporating anthropometric constraints, and (iii) filter the optimized gap-filled pose estimates using an Unscented Kalman Filter (UKF) with the estimated dynamically-equivalent human dynamics model. We test the framework on videos from the publicly available DARPA Mind's Eye Year 1 corpus [8]. The combined estimation and filtering framework not only results in more accurate, physically plausible pose estimates, but also provides pose estimates for frames where the original human pose estimation framework failed to provide one.
AB - Human pose estimation using uncalibrated monocular visual inputs alone is a challenging problem for both the computer vision and robotics communities. From the robotics perspective, the challenge here is one of pose estimation of a multiply-articulated system of bodies using a single nonspecialized environmental sensor (the camera), and thereby creating low-order surrogate computational models for analysis and control. In this work, we propose a technique for estimating the lower-limb dynamics of a human solely based on captured behavior using an uncalibrated monocular video camera. We leverage our previously developed framework for human pose estimation to (i) deduce the correct sequence of temporally coherent gap-filled pose estimates, (ii) estimate physical parameters, employing a dynamics model incorporating anthropometric constraints, and (iii) filter the optimized gap-filled pose estimates using an Unscented Kalman Filter (UKF) with the estimated dynamically-equivalent human dynamics model. We test the framework on videos from the publicly available DARPA Mind's Eye Year 1 corpus [8]. The combined estimation and filtering framework not only results in more accurate, physically plausible pose estimates, but also provides pose estimates for frames where the original human pose estimation framework failed to provide one.
UR - http://www.scopus.com/inward/record.url?scp=84959284734&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959284734&partnerID=8YFLogxK
U2 - 10.15607/rss.2012.viii.001
DO - 10.15607/rss.2012.viii.001
M3 - Conference contribution
AN - SCOPUS:84959284734
SN - 9780262519687
T3 - Robotics: Science and Systems
SP - 1
EP - 8
BT - Robotics: Science and Systems
A2 - Newman, Paul
A2 - Roy, Nicholas
A2 - Srinivasa, Siddhartha
T2 - International Conference on Robotics: Science and Systems, RSS 2012
Y2 - 9 July 2012 through 13 July 2012
ER -