
Text to 3D is here faster than anyone expected - 🤖😋🧠 #5

Good morning. This is Bot Eat Brain, the not-yet-daily newsletter that makes getting harvested for fuel by a robot uprising fun and sexy.

Here's what's on the menu for today:

  • Text to 3D is here ahead of schedule 🧊

  • On-demand audio clips from text 👂

  • Google's newest AI model is better and faster 🎙️

  • Industry predictions and your next unicorn startup ideas 🦄

Text to 3D is here faster than anyone expected 🧊

Text to 2D is so last week. This week we're stanning text to 3D 🦾

A team from Google Research just launched DreamFusion, a tool that turns plain English text prompts into 3D models. It functions much like DALL·E or Stable Diffusion, which turn your prompts into 2D images.

The code hasn't been released, but as we've seen with previous projects, it's only a matter of time before open-source catches up and text-to-3D becomes a part of the communal tool chest for AI artists and entrepreneurs.

Munch on this AI-generated collage of 3D models for a hot second and lament the end of human hegemony over art:

Our predictions 🔮

  • Video game characters uniquely generated based on your gameplay.

  • The cost to develop video games drops as the time-intensive task of 3D modeling suddenly gets much easier.

  • A bunch of boring-ass 3D NFTs launched in the next crypto upswing.

Textually Guided Audio Generation 👂

Another cool-as-sh*t demo is out today, and it lets you generate audio from a short text description.

Input "whistling with wind blowing" and get... exactly what you expect.

It's like Stable Diffusion for random sound clips.

Upcoming Disruptions 🌋

  • Just like there are libraries of stock photos, there are libraries of stock audio clips. And just like stock photos are seeing competition from AI, so will audio libraries.

  • Remember the end-to-end movie-production pipeline we proposed last week? Well, your sound effects just got way easier.

  • Podcasts, video games, and YouTube videos are going to become more immersive as everyone gets on-demand environment-building audio.

The narrator of your next audiobook is probably a robot 🎙️

Google just launched Lyra V2, "a better, faster, and more versatile speech codec". What the heck does that mean? It means your auto-generated audiobook narrator is about to sound a whole lot better.

Compare the results of Lyra vs. the competing codec Opus.

If you're not sure how to interpret that graph, we made a helpful companion:

TL;DR: Lyra works better and faster. Check out the launch page to listen for yourself.

Bonus treats 🍪

Credits and Shoutouts 🤘

Thanks to everyone who's been sending us cool AI shit to include in the newsletter.

See something cool? Email it to us at [email protected] or DM us on Twitter @BotEatBrain.

Until next time ✌️

🤖😋🧠