Visual Question Answering on TGIF-QA

Accuracy

Results

Performance of various models on this benchmark.

Model Name	Accuracy	Paper Title	Repository
InternVideo	0.722	InternVideo: General Video Foundation Models via Generative and Discriminative Learning
HiTeA	0.732	HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training	-
