TY - GEN
T1 - An optimization based framework for human pose estimation in monocular videos
AU - Agarwal, Priyanshu
AU - Kumar, Suren
AU - Ryde, Julian
AU - Corso, Jason J.
AU - Krovi, Venkat N.
PY - 2012
Y1 - 2012
N2 - Human pose estimation using monocular vision is a challenging problem in computer vision. Past work has focused on developing efficient inference algorithms and probabilistic prior models based on captured kinematic/dynamic measurements. However, such algorithms face challenges in generalization beyond the learned dataset. In this work, we propose a model-based generative approach for estimating the human pose solely from uncalibrated monocular video in unconstrained environments without any prior learning on motion capture/image annotation data. We propose a novel Product of Heading Experts (PoHE) based generalized heading estimation framework that probabilistically merges heading outputs (probabilistic/non-probabilistic) from a time-varying number of estimators to bootstrap a synergistically integrated probabilistic-deterministic sequential optimization framework for robustly estimating human pose. Novel pixel-distance-based performance measures are developed to penalize false human detections and ensure identity-maintained human tracking. We tested our framework with varied inputs (silhouettes and bounding boxes) to evaluate, compare and benchmark it against ground-truth data (collected using our human annotation tool) for 52 video vignettes in the publicly available DARPA Mind's Eye Year I dataset. Results show robust pose estimates on this challenging dataset of highly diverse activities.
UR - http://www.scopus.com/inward/record.url?scp=84866710921&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84866710921&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-33179-4_55
DO - 10.1007/978-3-642-33179-4_55
M3 - Conference contribution
AN - SCOPUS:84866710921
SN - 9783642331787
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 575
EP - 586
BT - Advances in Visual Computing - 8th International Symposium, ISVC 2012, Revised Selected Papers
T2 - 8th International Symposium on Visual Computing, ISVC 2012
Y2 - 16 July 2012 through 18 July 2012
ER -