HyperAI超神经

Question Answering on FEVER

Evaluation Metrics

EM (Exact Match)
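
For readers unfamiliar with the metric, here is a minimal sketch of how an exact-match percentage is commonly computed. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles); this page does not say which normalization the listed results use, so the helper names `normalize`, `exact_match`, and `em_score` and the sample FEVER labels are illustrative only.

```python
import re
import string


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the benchmark may use another rule.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, references: list[str]) -> bool:
    """True if the normalized prediction equals any normalized reference."""
    norm_pred = normalize(prediction)
    return any(norm_pred == normalize(ref) for ref in references)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of predictions that exactly match one of their references."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Hypothetical predictions against FEVER-style labels.
    preds = ["SUPPORTS", "refutes."]
    golds = [["supports"], ["SUPPORTS"]]
    print(em_score(preds, golds))
```

Under this reading, an EM of 64.2 would mean that 64.2% of the model's answers matched a reference answer after normalization.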

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model Name                                    EM
measuring-and-narrowing-the-compositionality  64.2
chain-of-action-faithful-and-multimodal       54.2
language-models-are-unsupervised-multitask    50
chain-of-action-faithful-and-multimodal       64.2
chain-of-action-faithful-and-multimodal       50
chain-of-action-faithful-and-multimodal       68.9
dspy-compiling-declarative-language-model     62.2
chain-of-action-faithful-and-multimodal       62.2