Video Question Answering on LSMDC-MC

Metric: Accuracy

Results

Performance results of various models on this benchmark

Model	Accuracy (%)	Paper Title
VIOLETv2	84.4	An Empirical Study of End-to-End Video-Language Transformers with Masked Visual Modeling
Clover	83.7	Clover: Towards A Unified Video-Language Alignment and Fusion Model
