Common Sense Reasoning
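The Common Sense Reasoning task aims to move models beyond pattern recognition so that they reason with common sense or world knowledge. The goal is for models to understand complex situations and make sound judgments and predictions, improving their capability and effectiveness in natural language processing, dialogue systems, and related areas.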
WinoGrande
PaLM 540B (0-shot)
ARC (Challenge)
ARC (Easy)
ST-MoE-32B 269B (fine-tuned)
ReCoRD
DeBERTa-1.5B
CommonsenseQA
MUPPET Roberta Large
PARus
RuCoS
RWSD
BIG-bench (Disambiguation QA)
BIG-bench (Causal Judgment)
BIG-bench (Date Understanding)
BIG-bench (Sports Understanding)
Event2Mind test
EA-VQ-VAE
Russian Event2Mind
araneum word2vec (skipgram) + GRU
SWAG
DeBERTa-large
BIG-bench (Winowhy)
BIG-bench (Known Unknowns)
PaLM-540B (few-shot, k=5)
BIG-bench (Logical Sequence)
Chinchilla-70B (few-shot, k=5)
Event2Mind dev
CODAH
BERT Large
CrowdSource QA
Visual Dialog v0.9
NMN [kottur2018visual]
Visual Dialog v0.9
WinoGAViL
ViLT