A Weakly Supervised Multi-task Ranking Framework for Actor–Action Semantic Segmentation

Yan Yan, Chenliang Xu, Dawen Cai, Jason J. Corso

Research output: Contribution to journal › Article › peer-review

2 Scopus citations

Abstract

Modeling human behaviors and activity patterns has attracted significant research interest in recent years. To model human behaviors accurately, we need fine-grained understanding of human activity in videos. This problem has attracted considerable recent attention, with a shift from action classification to detailed actor and action understanding that provides compelling results for the perceptual needs of cutting-edge autonomous systems. However, current methods for detailed understanding of actor and action have significant limitations: they require large amounts of finely labeled data, and they fail to capture any internal relationship among actors and actions. To address these issues, we propose a novel Schatten p-norm robust multi-task ranking model for weakly supervised actor–action segmentation in which only video-level tags are given for training samples. Our model shares useful information among different actors and actions while learning a ranking matrix to select representative supervoxels for actors and actions, respectively. Final segmentation results are generated by a conditional random field that considers various ranking scores for video parts. Extensive experimental results on both the actor–action dataset and the YouTube-Objects dataset demonstrate that the proposed approach outperforms state-of-the-art weakly supervised methods and performs as well as the top-performing fully supervised method.
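
For readers unfamiliar with the term used in the abstract, the Schatten p-norm of a matrix is the ℓp norm of its singular values: p = 1 gives the nuclear norm, p = 2 gives the Frobenius norm, and for 0 < p < 1 it is a quasi-norm often used as a closer surrogate for matrix rank. The following is a minimal sketch of that definition only, assuming NumPy; it is not the authors' model or optimization procedure, and the function name and example matrix are hypothetical.

```python
import numpy as np

def schatten_p_norm(matrix, p):
    """Return the Schatten p-norm: the l_p norm of the singular values.

    Hypothetical helper for illustration; for 0 < p < 1 the result is a
    quasi-norm rather than a true norm.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    # Singular values only; the singular vectors are not needed here.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sum(singular_values ** p) ** (1.0 / p))

# Hypothetical weight matrix (rows: features, columns: tasks).
W = np.array([[3.0, 0.0, 0.0],
              [0.0, 4.0, 0.0]])

nuclear = schatten_p_norm(W, 1)    # sum of the singular values
frobenius = schatten_p_norm(W, 2)  # matches the Frobenius norm
```

Penalizing a norm of this family on a matrix that stacks per-task parameters encourages the tasks to share a low-dimensional structure, which is one common way multi-task models share information across tasks.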

Original language: English
Pages (from-to): 1414-1432
Number of pages: 19
Journal: International Journal of Computer Vision
Volume: 128
Issue number: 5
State: Published - 1 May 2020

Keywords

  • Actor–action semantic segmentation
  • Multi-task ranking
  • Weakly supervised learning
