  Text to 3D is here faster than anyone expected - 🤖😋🧠 #5

This is Bot Eat Brain, the not-yet-daily newsletter that makes getting harvested for fuel by a robot uprising fun and sexy.

Here's what's on the menu for today:

  • Text to 3D is here ahead of schedule 🧊

  • On-demand audio clips from text 👂

  • Google's newest AI model is better and faster 🎙️

  • Industry predictions and your next unicorn startup ideas 🦄

Text to 3D is here faster than anyone expected 🧊

Text to 2D is so last week. This week we're stanning text to 3D 🦾

A team from Google Research just launched DreamFusion, a tool that turns plain-English text prompts into 3D models. It functions much like DALL·E or Stable Diffusion, which turn your prompts into 2D images.

The code hasn't been released, but as we've seen with previous projects, it's only a matter of time before open-source catches up and text-to-3D becomes a part of the communal tool chest for AI artists and entrepreneurs.

Munch on this AI-generated collage of 3D models for a hot second and lament the end of human hegemony over art:

Our predictions 🔮

  • Video game characters uniquely generated based on your gameplay.

  • The cost to develop video games drops as the time-intensive task of 3D modeling suddenly gets much easier.

  • A bunch of boring-ass 3D NFTs launching in the next crypto upswing.

Textually Guided Audio Generation 👂

Another cool-as-sh*t demo dropped today: it lets you generate audio from a short text description.

Input "whistling with wind blowing", get... exactly what you expect.

It's like Stable Diffusion for random sound clips.

Upcoming Disruptions 🌋

  • Just like there are libraries of stock photos, there are libraries of stock audio clips. And just like stock photos are seeing competition from AI, so will audio libraries.

  • Remember the end-to-end movie-production pipeline we proposed last week? Well, your sound effects just got way easier.

  • Podcasts, video games, and YouTube videos are going to become more immersive as everyone gets on-demand environment-building audio.

The narrator of your next audiobook is probably a robot 🎙️

Google just launched Lyra V2, "a better, faster, and more versatile speech codec". What the heck does that mean? It means your auto-generated audiobook narrator is about to sound a whole lot better.

Compare the results of Lyra vs. competitor codec Opus.

If you're not sure how to interpret that graph, we made a helpful companion:

TL;DR Lyra works better and faster. Check out the launch page to listen for yourself.

Bonus treats 🍪

