Jointly modeling deep video and compositional text to bridge vision and language in a unified framework

  • Ran Xu
  • Caiming Xiong
  • Wei Chen
  • Jason J. Corso

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

183 Scopus citations

Fingerprint

Computer Science