HyperAI超神经
Text-to-Image Generation on COCO
Evaluation Metrics
FID
Inception score
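The page lists these two metrics without defining them. As a hedged reference, the commonly used definitions are sketched below. The symbols are assumptions introduced here, not taken from the page: $\mu_r, \Sigma_r$ and $\mu_g, \Sigma_g$ are the mean and covariance of Inception-v3 features for real and generated images, $p(y \mid x)$ is the Inception classifier's label distribution for a generated image $x$, and $p(y)$ is its marginal over generated images. Individual papers may differ in sample counts, resolution, and preprocessing, so scores in the table are not guaranteed to be strictly comparable.

```latex
% Frechet Inception Distance (lower is better)
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)

% Inception Score (higher is better)
\mathrm{IS} = \exp\!\left( \mathbb{E}_{x \sim p_g}
  \left[ D_{\mathrm{KL}}\!\left( p(y \mid x) \,\Vert\, p(y) \right) \right] \right)
```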
Evaluation Results
Performance of each model on this benchmark
| Model | FID | Inception score | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| AttnGAN+CL | 23.93 | 25.70 | Improving Text-to-Image Synthesis Using Contrastive Learning | |
| FuseDream (few-shot, k=5) | 21.16 | 34.26 | FuseDream: Training-Free Text-to-Image Generation with Improved CLIP+GAN Space Optimization | |
| Corgi-Semi | 10.6 | - | Shifted Diffusion for Text-to-image Generation | - |
| Vanilla CM3 | 29.5 | - | Retrieval-Augmented Multimodal Language Modeling | - |
| StackGAN-v1 | 74.05 | 8.45 | StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks | |
| StyleGAN-T (Zero-shot, 256x256) | 13.9 | - | StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis | |
| Re-Imagen (Finetuned) | 5.25 | - | Re-Imagen: Retrieval-Augmented Text-to-Image Generator | - |
| GLIGEN (fine-tuned, Detection data only) | 5.82 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |
| DF-GAN (256 x 256) | - | 18.7 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| DALL-E (256 x 256) | 27.5 | 17.9 | NÜWA: Visual Synthesis Pre-training for Neural visUal World creAtion | |
| Lafite | 8.12 | 32.34 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| Make-a-Scene (unfiltered) | 11.84 | - | Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors | - |
| DALL-E 2 | 10.39 | - | Hierarchical Text-Conditional Image Generation with CLIP Latents | - |
| Imagen (zero-shot) | 7.27 | - | Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding | |
| VQ-Diffusion-F | 13.86 | - | Vector Quantized Diffusion Model for Text-to-Image Synthesis | |
| RAT-Diffusion | 5.00 | - | Data Extrapolation for Text-to-image Generation on Small Datasets | - |
| Lafite (zero-shot) | 26.94 | 26.02 | LAFITE: Towards Language-Free Training for Text-to-Image Generation | |
| eDiff-I (zero-shot) | 6.95 | - | eDiff-I: Text-to-Image Diffusion Models with an Ensemble of Expert Denoisers | |
| L-Verse | 45.8 | - | L-Verse: Bidirectional Generation Between Image and Text | |
| GLIGEN (fine-tuned, Grounding data) | 6.38 | - | GLIGEN: Open-Set Grounded Text-to-Image Generation | |