Text to 3D is here faster than anyone expected - 🤖😋🧠 #5

Anthony Castrio
October 03, 2022

Good morning. This is Bot Eat Brain, the not-yet-daily newsletter that makes getting harvested for fuel by a robot uprising fun and sexy.

Here's what's on the menu for today:

Text to 3D is here ahead of schedule 🧊
On-demand audio clips from text 👂
Google's newest AI model is better and faster 🎙️
Industry predictions and your next unicorn startup ideas 🦄

Text to 3D is here faster than anyone expected 🧊

Text to 2D is so last week. This week we're stanning text to 3D 🦾

A team from Google Research just announced DreamFusion, a method for turning plain English text prompts into 3D models. It works much like DALL·E or Stable Diffusion, which turn your prompts into 2D images.

The code hasn't been released, but as we've seen with previous projects, it's only a matter of time before open-source catches up and text-to-3D becomes a part of the communal tool chest for AI artists and entrepreneurs.

Munch on this AI-generated collage of 3D models for a hot second and lament the end of human hegemony over art:

Happy to announce DreamFusion, our new method for Text-to-3D!
dreamfusion3d.github.io
We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed!
Joint work w/ the incredible team of @BenMildenhall @ajayj_ @jon_barron
#dreamfusion
— Ben Poole (@poolio)
8:01 PM • Sep 29, 2022

Our predictions 🔮

Video game characters uniquely generated based on your gameplay.
The cost to develop video games drops as the time-intensive task of 3D modeling suddenly gets much easier.
A bunch of boring-ass 3D NFTs launching in the next crypto upswing.

Textually Guided Audio Generation 👂

Another cool-as-sh*t demo is out today, and it lets you generate audio from a short text description.

Input "whistling with wind blowing", get... exactly what you expect.

It's like Stable Diffusion for random sound clips.

Upcoming Disruptions 🌋

Just like there are libraries of stock photos, there are libraries of stock audio clips. And just like stock photos are seeing competition from AI, so will audio libraries.
Remember the end-to-end movie-production pipeline we proposed last week? Well, your sound effects just got way easier.
Podcasts, video games, and YouTube videos are going to become more immersive as everyone gets on-demand environment-building audio.

The narrator of your next audiobook is probably a robot 🎙️

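Google just launched Lyra V2, "a better, faster, and more versatile speech codec". What the heck does that mean? A speech codec squeezes voice audio down so it can be streamed over slow connections, so it means your auto-generated audiobook narrator is about to sound a whole lot better, even at tiny bitrates.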

Compare the results of Lyra vs. Opus, a competing codec.