HyperAI超神经

SportQA

Evaluation Metrics

level-1
level-2
level-3 easy multi-hop
level-3 easy single-hop
level-3 hard multi-hop
level-3 hard single-hop
llm_model
model_url
organization
parameters
release_date
updated_time

Evaluation Results

Performance of each model on this benchmark

Comparison Table
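| Model name | level-1 | level-2 | level-3 easy multi-hop | level-3 easy single-hop | level-3 hard multi-hop | level-3 hard single-hop | llm_model | model_url | organization | parameters | release_date | updated_time |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Model 1 | 50.90 | 52.32 | 14.80 | 21.46 | 9.20 | 15.16 | Llama2-13b(0S,CoT) | https://huggingface.co/meta-llama/Llama-2-13b | Meta | 13B | 2023.7.19 | 2024.6.16 |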