Dynamic graph modules for modeling object-object interactions in activity recognition

Hao Huang, Luowei Zhou, Wei Zhang, Jason J. Corso, Chenliang Xu

Research output: Contribution to conference › Paper › peer-review

5 Scopus citations

Abstract

Video action recognition, a critical problem in video understanding, has been gaining increasing attention. To identify actions induced by complex object-object interactions, we need to consider not only spatial relations among objects in a single frame, but also temporal relations among different objects or the same object across multiple frames. However, existing approaches that model video representations and non-local features are either incapable of explicitly modeling relations at the object-object level or unable to handle streaming videos. In this paper, we propose a novel dynamic hidden graph module to model complex object-object interactions in videos, of which two instantiations are considered: a visual graph that captures appearance/motion changes among objects and a location graph that captures relative spatiotemporal position changes among objects. Additionally, the proposed graph module allows us to process streaming videos, setting it apart from existing methods. Experimental results on benchmark datasets, Something-Something and ActivityNet, show the competitive performance of our method.
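The core idea described above — recomputing a soft adjacency among object features per clip and aggregating neighbor information — can be illustrated with a minimal sketch. This is not the authors' implementation; the function names, shapes, and the single message-passing step are illustrative assumptions about how such a dynamic graph module could operate on pooled object features.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def graph_module(obj_feats, w_theta, w_phi):
    """One message-passing step over a dynamically built object graph.

    obj_feats: (N, D) features of N object proposals (hypothetically
    pooled across recent frames); w_theta, w_phi: (D, D) learned
    projections. The adjacency is recomputed from the current features,
    which is what makes the graph 'dynamic' rather than fixed.
    """
    theta = obj_feats @ w_theta           # (N, D) query-like projection
    phi = obj_feats @ w_phi               # (N, D) key-like projection
    affinity = softmax(theta @ phi.T)     # (N, N) soft adjacency matrix
    return affinity @ obj_feats           # aggregate features from neighbors

# Toy usage with random features and weights.
rng = np.random.default_rng(0)
N, D = 4, 8
feats = rng.standard_normal((N, D))
out = graph_module(feats,
                   rng.standard_normal((D, D)),
                   rng.standard_normal((D, D)))
print(out.shape)  # (4, 8): one aggregated feature per object
```

In this sketch the same mechanism could serve either instantiation: feeding appearance/motion features yields a visual graph, while feeding encoded box coordinates yields a location graph. Because the adjacency depends only on features seen so far, the step can be applied frame-by-frame on streaming input.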

Original language: English
State: Published - 2020
Event: 30th British Machine Vision Conference, BMVC 2019 - Cardiff, United Kingdom
Duration: 9 Sep 2019 - 12 Sep 2019

Conference

Conference: 30th British Machine Vision Conference, BMVC 2019
Country/Territory: United Kingdom
City: Cardiff
Period: 9/09/19 - 12/09/19
