Jointly modeling deep video and compositional text to bridge vision and language in a unified framework

Ran Xu, Caiming Xiong, Wei Chen, Jason J. Corso

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

176 Scopus citations
