DALL-E: Zero-Shot Text-to-Image Generation | Paper Explained
❤️ Become The AI Epiphany Patreon ❤️ ►
In this video I cover DALL-E, or the “Zero-Shot Text-to-Image Generation” paper, by the OpenAI team.
They train a VQ-VAE to learn compressed image representations and then train an autoregressive transformer on top of that discrete latent space combined with BPE-encoded text.
The model learns to combine distinct concepts in a plausible way, image-to-image translation capabilities emerge, and more.
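The core idea, a single autoregressive sequence mixing text and image tokens, can be sketched in a few lines. This is a minimal illustration, not the paper's code: the vocabulary sizes match those reported by OpenAI, but the token-offset helper and toy values below are assumptions for demonstration.

```python
# Minimal sketch of the DALL-E token layout: BPE text tokens and discrete
# VQ-VAE image tokens are concatenated into one stream, which an
# autoregressive transformer then models left to right.

TEXT_VOCAB = 16384   # BPE text vocabulary size from the paper
IMAGE_VOCAB = 8192   # VQ-VAE codebook size from the paper

def build_sequence(text_tokens, image_tokens):
    """Concatenate text and image tokens into one sequence.

    Image tokens are offset by TEXT_VOCAB so both vocabularies can
    share a single embedding table without index collisions.
    (Hypothetical helper, for illustration only.)
    """
    return list(text_tokens) + [t + TEXT_VOCAB for t in image_tokens]

# Toy example: 3 BPE text tokens followed by 4 image codebook indices
seq = build_sequence([5, 42, 7], [0, 8191, 12, 3])
print(seq)  # [5, 42, 7, 16384, 24575, 16396, 16387]
```

During training the transformer simply predicts the next token in this combined stream; at inference time, the text tokens are fixed and the image tokens are sampled one by one, then decoded back to pixels by the VQ-VAE decoder.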
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✅ Paper:
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
⌚️ Timetable:
00:00 What is DALL-E?
03:25 VQ-VAE blur problems
05:15 Transformers