HyperAI超神经

Interactive Evaluation of Dialog on DSTC9

Evaluation Metrics

Coherent
Consistent
Diversity
Error Recovery
Flexible
Informative
Inquisitive
Likeable
Overall Human Rating
Topic Depth
Understanding

Evaluation Results

Performance of each model on this benchmark

Comparison Table
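| Model Name | Coherent | Consistent | Diversity | Error Recovery | Flexible | Informative | Inquisitive | Likeable | Overall Human Rating | Topic Depth | Understanding |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| a-unified-pre-training-framework-for | 2.8017 | 0.9390 | 2.7441 | 2.7518 | 2.8000 | 2.7881 | 2.7949 | 2.7878 | 4.15 | 2.7678 | 2.8285 |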