HyperAI超神经

Code Generation on RES-Q

Evaluation Metrics

pass@1
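
This page does not say how pass@1 is computed for this benchmark. As a rough illustration only, the sketch below uses the commonly cited unbiased pass@k estimator, which for k = 1 reduces to the fraction of sampled solutions that pass. The function names `pass_at_k` and `benchmark_score` and the input layout are hypothetical, chosen for this example rather than taken from the benchmark's tooling.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one task: n samples drawn, c of them correct."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n samples contains no correct solution.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_score(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over all tasks, as a percentage.

    `results` holds one (n, c) pair per task (assumed layout).
    """
    if not results:
        raise ValueError("results must not be empty")
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

Under these assumptions, a system that makes a single attempt per task and solves 30 of 100 tasks would get `benchmark_score([(1, 1)] * 30 + [(1, 0)] * 70)`, which is a score of 30 on the same 0 to 100 scale as the table.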

Evaluation Results

Performance of each model on this benchmark

Comparison Table
Model name                                      pass@1
res-q-evaluating-code-editing-large-language    30.0
res-q-evaluating-code-editing-large-language    58.0
res-q-evaluating-code-editing-large-language    20.0
res-q-evaluating-code-editing-large-language    18.0
res-q-evaluating-code-editing-large-language    30.0
res-q-evaluating-code-editing-large-language    36.0
res-q-evaluating-code-editing-large-language    46.0
res-q-evaluating-code-editing-large-language    29.0
res-q-evaluating-code-editing-large-language    37.0