Actor-Action Semantic Segmentation with Grouping Process Models

Chenliang Xu, Jason J. Corso

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

32 Scopus citations

Abstract

Actor-action semantic segmentation made an important step toward advanced video understanding: what action is happening, who is performing the action, and where is the action happening in space-time. Current methods based on layered CRFs for this problem are local and unable to capture the long-ranging interactions of video parts. We propose a new model that combines the labeling CRF with a supervoxel hierarchy, where supervoxels at various scales provide cues for possible groupings of nodes in the CRF to encourage adaptive and long-ranging interactions. The new model defines a dynamic and continuous process of information exchange: the CRF influences what supervoxels in the hierarchy are active, and these active supervoxels, in turn, affect the connectivities in the CRF, we hence call it a grouping process model. By further incorporating the video-level recognition, the proposed method achieves a large margin of 60% relative improvement over the state of the art on the recent A2D large-scale video labeling dataset, which demonstrates the effectiveness of our modeling.

Original languageEnglish
Title of host publicationProceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Pages3083-3092
Number of pages10
ISBN (Electronic)9781467388504
DOIs
StatePublished - 9 Dec 2016
Event29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 - Las Vegas, United States
Duration: 26 Jun 20161 Jul 2016

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume2016-December
ISSN (Print)1063-6919

Conference

Conference29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Country/TerritoryUnited States
CityLas Vegas
Period26/06/161/07/16

Fingerprint

Dive into the research topics of 'Actor-Action Semantic Segmentation with Grouping Process Models'. Together they form a unique fingerprint.

Cite this