Visual Question Answering (VQA) on AI2D
Metrics
EM (exact match)
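As a rough illustration of how an EM score like those reported here might be computed, the sketch below counts a prediction as correct only when it equals the reference answer after light normalization, and reports the share of correct answers as a percentage. The function names `normalize` and `exact_match` are hypothetical, and the normalization (lowercasing, trimming, collapsing whitespace) is an assumption; the scoring used by the listed papers may differ.

```python
from typing import Sequence


def normalize(answer: str) -> str:
    """Lowercase, trim, and collapse whitespace (an assumed normalization)."""
    return " ".join(answer.strip().lower().split())


def exact_match(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Return the percentage of predictions that match their reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        return 0.0
    # Count pairs whose normalized strings are identical.
    hits = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical multiple-choice answers, not taken from the benchmark.
    preds = ["A", " b ", "c"]
    refs = ["a", "B", "d"]
    print(f"EM: {exact_match(preds, refs):.2f}")
```

For a multiple-choice benchmark, where each answer is a single option, this reduces to plain accuracy.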
Results
Performance results of various models on this benchmark
Model Name | EM (%) | Paper Title | Repository |
---|---|---|---|
DUBLIN | 51.11 | DUBLIN -- Document Understanding By Language-Image Network | - |
Gemini Ultra | 79.5 | Gemini: A Family of Highly Capable Multimodal Models | |
SMoLA-PaLI-X Specialist Model | 82.5 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | - |
SMoLA-PaLI-X Generalist Model | 81.4 | Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts | - |