Visual Question Answering (VQA) on AI2D

Results

Performance of various models on this benchmark, reported as exact match (EM)

Model Name	EM	Paper Title	Repository
DUBLIN	51.11	DUBLIN -- Document Understanding By Language-Image Network	-
Gemini Ultra	79.5	Gemini: A Family of Highly Capable Multimodal Models	-
SMoLA-PaLI-X Specialist Model	82.5	Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts	-
SMoLA-PaLI-X Generalist Model	81.4	Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts	-
