HyperAI超神经

Text-to-Video Generation on MSR-VTT

Evaluation Metrics

CLIPSIM
FID
FVD

Evaluation Results

Performance of each model on this benchmark

Comparison Table
| Model | CLIPSIM | FID | FVD |
| --- | --- | --- | --- |
| modelscope-text-to-video-technical-report | 0.2930 | 11.09 | 550 |
| align-your-latents-high-resolution-video | 0.2929 | - | - |
| a-recipe-for-scaling-up-text-to-video | 0.2991 | 8.19 | 441 |
| nuwa-visual-synthesis-pre-training-for-neural | 0.2439 | 47.68 | - |
| make-pixels-dance-high-dynamic-video | 0.3125 | - | 381 |
| make-a-video-text-to-video-generation-without | 0.3049 | 13.17 | - |
| godiva-generating-open-domain-videos-from | 0.2402 | - | - |
| tell-me-what-happened-unifying-text-guided | 0.2644 | 23.4 | - |
| make-a-video-text-to-video-generation-without | 0.2631 | 23.59 | - |
| snap-video-scaled-spatiotemporal-transformers | 0.2793 | - | 104.0 |
| videopoet-a-large-language-model-for-zero | 0.3123 | - | 213 |
| magicvideo-efficient-video-generation-with | - | 36.5 | 998 |
| hierarchical-spatio-temporal-decoupling-for | 0.2947 | 8.60 | 406 |
| video-lavit-unified-video-language-pre | 0.3012 | 11.27 | 188.36 |
| align-your-latents-high-resolution-video | 0.2614 | - | - |
| videocomposer-compositional-video-synthesis | 0.2932 | - | 580 |
| show-1-marrying-pixel-and-latent-diffusion | 0.3072 | 13.08 | 538 |
| snap-video-scaled-spatiotemporal-transformers | 0.2793 | - | 110.4 |
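
The three metrics point in different directions: a higher CLIPSIM is better, while lower FID and FVD are better. Since many entries are missing one or more scores, a ranking is only meaningful per metric. The following is a minimal sketch of how the comparison table could be ranked; the `ROWS` subset, the `METRICS` mapping, and the `rank` helper are illustrative names rather than part of the benchmark page, and missing scores ("-") are represented as `None`.

```python
# A few rows copied from the comparison table as (model, CLIPSIM, FID, FVD).
# None stands for a score the table reports as "-".
ROWS = [
    ("modelscope-text-to-video-technical-report", 0.2930, 11.09, 550.0),
    ("a-recipe-for-scaling-up-text-to-video", 0.2991, 8.19, 441.0),
    ("make-pixels-dance-high-dynamic-video", 0.3125, None, 381.0),
    ("snap-video-scaled-spatiotemporal-transformers", 0.2793, None, 104.0),
    ("godiva-generating-open-domain-videos-from", 0.2402, None, None),
]

# Hypothetical mapping: metric name -> (column index, higher_is_better).
METRICS = {
    "CLIPSIM": (1, True),   # text-video similarity: higher is better
    "FID": (2, False),      # Frechet distance on frames: lower is better
    "FVD": (3, False),      # Frechet distance on videos: lower is better
}


def rank(rows, metric):
    """Return the rows that report `metric`, ordered best first."""
    index, higher_is_better = METRICS[metric]
    scored = [row for row in rows if row[index] is not None]
    return sorted(scored, key=lambda row: row[index], reverse=higher_is_better)


for metric in METRICS:
    best = rank(ROWS, metric)[0]
    print(f"best {metric}: {best[0]}")
```

Entries without a score are dropped from that metric's ranking rather than treated as worst, so each ranking covers only the models that actually report the metric.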