TY - CONF
T1 - GENIE TRECVID2011 multimedia event detection
T2 - TREC Video Retrieval Evaluation, TRECVID 2011
AU - Perera, Amitha A.G.
AU - Oh, Sangmin
AU - Leotta, Matthew
AU - Kim, Ilseo
AU - Byun, Byungki
AU - Lee, Chin Hui
AU - McCloskey, Scott
AU - Liu, Jingchen
AU - Miller, Ben
AU - Huang, Zhi Feng
AU - Vahdat, Arash
AU - Yang, Weilong
AU - Mori, Greg
AU - Tang, Kevin
AU - Koller, Daphne
AU - Fei-Fei, Li
AU - Li, Kang
AU - Chen, Gang
AU - Corso, Jason
AU - Fu, Yun
AU - Srihari, Rohini
PY - 2011
Y1 - 2011
N2 - For the TRECVID 2011 MED task, the GENIE system incorporated two late-fusion approaches in which multiple discriminative base classifiers are built per feature and then combined through discriminative fusion techniques. All of our fusion and base classifiers are formulated as one-vs-all detectors per event class, with threshold estimation performed during cross-validation. A total of five types of features were extracted from the data, covering both audio and visual modalities: HOG3D, Object Bank, Gist, MFCC, and acoustic segment models (ASMs). HOG3D and MFCC are low-level features, while Object Bank and ASMs are more semantic. Event-specific feature adaptations and manual annotations were deliberately avoided in order to establish strong baseline results. Overall, the results were competitive in the MED11 evaluation and show that standard machine learning techniques can yield fairly good results even on a challenging dataset.
AB - For the TRECVID 2011 MED task, the GENIE system incorporated two late-fusion approaches in which multiple discriminative base classifiers are built per feature and then combined through discriminative fusion techniques. All of our fusion and base classifiers are formulated as one-vs-all detectors per event class, with threshold estimation performed during cross-validation. A total of five types of features were extracted from the data, covering both audio and visual modalities: HOG3D, Object Bank, Gist, MFCC, and acoustic segment models (ASMs). HOG3D and MFCC are low-level features, while Object Bank and ASMs are more semantic. Event-specific feature adaptations and manual annotations were deliberately avoided in order to establish strong baseline results. Overall, the results were competitive in the MED11 evaluation and show that standard machine learning techniques can yield fairly good results even on a challenging dataset.
UR - http://www.scopus.com/inward/record.url?scp=84905269591&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84905269591&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:84905269591
Y2 - 5 December 2011 through 7 December 2011
ER -