Text to 3D is here faster than anyone expected - Bot Eat Brain #5

Good morning. This is Bot Eat Brain, the not-yet-daily newsletter that makes getting harvested for fuel by a robot uprising fun and sexy.
Here's what's on the menu for today:
Text to 3D is here ahead of schedule
On-demand audio clips from text
Google's newest AI model is better and faster
Industry predictions and your next unicorn startup ideas

Text to 3D is here faster than anyone expected
Text to 2D is so last week; this week we're stanning text to 3D.
A team from Google Research just launched DreamFusion, a tool that turns plain-English text prompts into 3D models. It works much like DALL-E or Stable Diffusion, which turn your prompts into 2D images.

The code hasn't been released, but as we've seen with previous projects, it's only a matter of time before open-source catches up and text-to-3D becomes a part of the communal tool chest for AI artists and entrepreneurs.
Munch on this AI-generated collage of 3D models for a hot second and lament the end of human hegemony over art:
Happy to announce DreamFusion, our new method for Text-to-3D!
dreamfusion3d.github.io
We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed!
Joint work w/ the incredible team of @BenMildenhall @ajayj_ @jon_barron
#dreamfusion
– Ben Poole (@poolio)
8:01 PM · Sep 29, 2022
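For the nerds who want to picture how that tweet works: below is a toy sketch of the recipe Ben describes, tuning a 3D scene until its 2D renders look plausible to a frozen text-to-image diffusion model. None of this is Google's code (which isn't released yet); the TinyNeRF stand-in, the fake_denoiser stub, and the loss weighting are all things we made up to show the shape of the training loop.

```python
# Toy sketch of the DreamFusion idea: optimize a 3D scene so its 2D renders
# score well under a frozen text-to-image diffusion model. The "NeRF" and the
# "diffusion model" below are invented stand-ins, purely for illustration.
import torch


class TinyNeRF(torch.nn.Module):
    """Stand-in scene model: maps a camera angle to a 64x64 RGB render."""

    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(1, 256), torch.nn.ReLU(),
            torch.nn.Linear(256, 64 * 64 * 3), torch.nn.Sigmoid(),
        )

    def render(self, camera_angle: torch.Tensor) -> torch.Tensor:
        return self.net(camera_angle).view(-1, 3, 64, 64)


def fake_denoiser(noisy_image, t, prompt_embedding):
    """Stub for a pretrained text-conditioned diffusion model's noise prediction."""
    return noisy_image * 0.1 + prompt_embedding.view(1, 3, 1, 1) * 0.0


scene = TinyNeRF()
opt = torch.optim.Adam(scene.parameters(), lr=1e-3)
prompt_embedding = torch.randn(3)  # pretend this encodes "a DSLR photo of a corgi"

for step in range(100):
    camera_angle = torch.rand(1, 1) * 6.28      # random viewpoint each step
    render = scene.render(camera_angle)

    t = torch.rand(())                          # random diffusion noise level
    noise = torch.randn_like(render)
    noisy_render = render + t * noise           # corrupt the current render

    with torch.no_grad():                       # the diffusion model stays frozen
        predicted_noise = fake_denoiser(noisy_render, t, prompt_embedding)

    # Score-distillation-style update: nudge the scene so the frozen model's
    # noise estimate matches the noise we actually added to its render.
    grad = predicted_noise - noise
    loss = (render * grad.detach()).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Swap the stubs for a real NeRF and a real diffusion model and you can see why no 3D training data is needed: the 2D image model does all the supervising.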
Our predictions
Video game characters uniquely generated based on your game-play.
The cost to develop video games drops as the time-intensive task of 3D modeling suddenly gets much easier.
A bunch of boring-ass 3D NFTs launched in the next crypto upswing.
Textually Guided Audio Generation
Another cool-as-sh*t demo dropped today: it lets you generate audio from a short text description.
Input "whistling with wind blowing" and get... exactly what you expect.
It's like Stable Diffusion for random sound clips.
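If you're wondering what "text in, audio out" looks like in practice, here's a toy sketch of the pipeline: a prompt gets encoded, a generator conditioned on that encoding spits out samples, and you write a WAV file. Every function below is a stand-in we invented for illustration; the real demo uses trained models, not hashes and sine waves.

```python
# Toy text-to-audio pipeline: prompt -> embedding -> conditioned samples -> WAV.
# The "encoder" and "generator" are fake stand-ins, just to show the plumbing.
import wave
import numpy as np

SAMPLE_RATE = 16_000


def fake_text_encoder(prompt: str) -> np.ndarray:
    """Stand-in for a trained text encoder: hash the prompt into a seed vector."""
    seed = abs(hash(prompt)) % (2**32)
    return np.random.default_rng(seed).normal(size=64)


def fake_audio_generator(text_embedding: np.ndarray, seconds: float) -> np.ndarray:
    """Stand-in for a trained generator: a few sine tones plus noise, with the
    tone frequencies picked from the text embedding."""
    t = np.linspace(0, seconds, int(SAMPLE_RATE * seconds), endpoint=False)
    freqs = 200 + 1800 * (np.abs(text_embedding[:3]) % 1.0)
    tones = sum(np.sin(2 * np.pi * f * t) for f in freqs)
    noise = np.random.default_rng(0).normal(scale=0.3, size=t.shape)  # "wind"
    return 0.2 * (tones + noise)


def save_wav(path: str, audio: np.ndarray) -> None:
    """Write a mono 16-bit PCM WAV file."""
    pcm = (np.clip(audio, -1, 1) * 32767).astype(np.int16)
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(SAMPLE_RATE)
        f.writeframes(pcm.tobytes())


prompt = "whistling with wind blowing"
clip = fake_audio_generator(fake_text_encoder(prompt), seconds=3.0)
save_wav("whistling_wind.wav", clip)
```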

Upcoming Disruptions
Just like there are libraries of stock photos, there are libraries of stock-audio clips. And just like stock photos are seeing competition from AI, so will audio libraries.
Remember the end-to-end movie-production pipeline we proposed last week? Well, your sound effects just got way easier.
Podcasts, video games, and YouTube videos are going to become more immersive as everyone gets on-demand environment-building audio.
The narrator of your next audiobook is probably a robot
Google just launched Lyra V2, "a better, faster, and more versatile speech codec." What the heck does that mean? It means your auto-generated audiobook narrator is about to sound a whole lot better.
Compare the results of Lyra vs. the competing Opus codec.

If you're not sure how to interpret that graph, we made a helpful companion:

TL;DR Lyra works better and faster. Check out the launch page to listen for yourself.
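If "better and faster speech codec" still feels abstract, here's some napkin math on why tiny bitrates matter. The 3.2 kbps figure is one of the low bitrates mentioned on the Lyra launch page (go check it there); the raw 16-bit, 16 kHz speech baseline is just a common reference point we picked for comparison.

```python
# Back-of-the-envelope: what a low-bitrate speech codec buys you.
# 3.2 kbps is a Lyra-style low-bitrate mode (see the launch page for specifics);
# 16 kHz / 16-bit PCM is our assumed "raw speech" baseline.

def bytes_per_second(bits_per_second: float) -> float:
    return bits_per_second / 8

raw_pcm_bps = 16_000 * 16    # 16 kHz samples, 16 bits each = 256 kbps
lyra_low_bps = 3_200         # very-low-bitrate codec mode

print(f"raw speech:     {bytes_per_second(raw_pcm_bps):,.0f} bytes/sec")
print(f"3.2 kbps codec: {bytes_per_second(lyra_low_bps):,.0f} bytes/sec")
print(f"compression:    ~{raw_pcm_bps / lyra_low_bps:.0f}x smaller")
# raw speech:     32,000 bytes/sec
# 3.2 kbps codec: 400 bytes/sec
# compression:    ~80x smaller
```

A few hundred bytes per second is the kind of budget that keeps speech usable on terrible connections, which is why quality at tiny bitrates is the whole ballgame for a codec like this.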
Bonus treats
Credits and Shoutouts
Thanks to everyone who's been sending us cool AI shit to include in the newsletter.
See something cool? Email it to us at [email protected] or DM us on Twitter @BotEatBrain
Until next time