Text to 3D is here faster than anyone expected - 🤖😋🧠 #5
Good morning. This is Bot Eat Brain, the not-yet-daily newsletter that makes getting harvested for fuel by a robot uprising fun and sexy.
Here's what's on the menu for today:
- Text to 3D is here ahead of schedule 🧊
- On-demand audio clips from text 👂
- Google's newest AI model is better and faster 🎙️
- Industry predictions and your next unicorn startup ideas 🦄
Text to 3D is here faster than anyone expected 🧊
Text to 2D is so last week. This week we're stanning text to 3D 🦾
A team from Google Research just announced DreamFusion, a method that turns plain English text prompts into 3D models. It works much like DALL·E or Stable Diffusion, which turn your prompts into 2D images.
The code hasn't been released, but as we've seen with previous projects, it's only a matter of time before open source catches up and text-to-3D becomes part of the communal tool chest for AI artists and entrepreneurs.
Munch on this AI-generated collage of 3D models for a hot second and lament the end of human hegemony over art:
Happy to announce DreamFusion, our new method for Text-to-3D!
We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed!
Joint work w/ the incredible team of @BenMildenhall @ajayj_ @jon_barron
— Ben Poole (@poolio)
Sep 29, 2022
Our predictions 🔮
- Video game characters uniquely generated based on your gameplay.
- The cost to develop video games drops as the time-intensive task of 3D modeling suddenly gets much easier.
- A bunch of boring-ass 3D NFTs launched in the next crypto upswing.
Textually Guided Audio Generation 👂
Another cool-as-sh*t demo is out today, and it lets you generate audio from a short text description.
Input "whistling with wind blowing" and get... exactly what you expect.
It's like Stable Diffusion for random sound clips.
Upcoming Disruptions 🌋
- Just like there are libraries of stock photos, there are libraries of stock audio clips. And just like stock photos are seeing competition from AI, so will audio libraries.
- Remember the end-to-end movie-production pipeline we proposed last week? Well, your sound effects just got way easier.
- Podcasts, video games, and YouTube videos are going to become more immersive as everyone gets on-demand environment-building audio.
The narrator of your next audiobook is probably a robot 🎙️
Google just launched Lyra V2, "a better, faster, and more versatile speech codec". What the heck does that mean? A speech codec compresses voice audio so it can be streamed and stored using less data. It means your auto-generated audiobook narrator is about to sound a whole lot better.
Compare the results of Lyra vs. the competing Opus codec.
If you're not sure how to interpret that graph, we made a helpful companion:
TL;DR Lyra works better and faster. Check out the launch page to listen for yourself.
Bonus treats 🍪
Credits and Shoutouts 🤘
Thanks to everyone who's been sending us cool AI shit to include in the newsletter.
See something cool? Email it to us at [email protected] or DM us on Twitter @BotEatBrain.
Until next time ✌️