Why Stable-Diffusion Today is Like XEROX in 1973 - 🤖😋🧠 #6
Good morning, it's Bot Eat Brain. We're the newsletter your smart toaster wrote after it became sentient.
Here's what we've got today:
- A GUI moment for generative AI ⌨️🖥️
- A search engine for prompts 🔍
Today's vibe: My AI-generated cat hates me.
A GUI moment for generative AI?
Emerging, world-breaking technologies usually start slow, clunky and expensive on their journey to ubiquity.
There’s something unique about moments where emerging technologies grow past early adopters and hit the mass market.
Taming this beast is hard, but insane founders and innovators will try until the very end.
We learn more from each iteration and each failed product. As the corpses of startups and dead-end research pile up, the survivors bring back spoils:
- standard practices
- business models that work
- better tools
And sometimes, something truly special happens. Sometimes, a new way to interact with the technology appears. Something so revolutionary that it enables a whole new swath of business models and applications, effectively exponentiating the original foundation.
Don’t worry, it’s easier if I show you.
In the 50s, this is what a computer looked like.
You’d interact with this beast by writing commands in a terminal in complicated assembly language.
Most of the people who worked with you were highly educated specialists, and the only customers paying for your work were other big businesses and fancy labs. The computer itself was huge and expensive.
Then, two big things happened:
First, there was the Altair, the machine that sparked the Homebrew Computer Club.
The Altair took a whole new road. Instead of selling hugely expensive, specialized machines to large corporations, its makers sold it cheap and targeted prosumers: the OG nerds who built computers in their garages.
Those trailblazers created open-source code and communities that were foundational to progress in computing. Guys like Bill Gates and Steve Jobs came out of that scene.
Second, Xerox came up with a concept that changed the game.
What if instead of writing complicated commands in programming languages, you could use a mouse and keyboard to interact with a graphical interface?
Fast forward a few years later, and now this is what a computer looks like:
It’s still the same computer under the hood, but this key innovation in user experience enabled something entirely new.
You don’t need to be a genius to use this system.
Your kids can use it. MS Paint is a thing now. You can play computer games on it.
Most importantly, you can now create businesses based on selling software to non-programmers: accountants, lawyers, and anyone else who needs to process data.
The GUI paved the way to the modern world by changing the way that users and builders interact with an emerging technology.
The same pattern is emerging in generative AI.
The OG GAN paper (that's "Generative Adversarial Networks") by Ian Goodfellow shocked the world.
Make neural nets compete. Train one model to generate faces, and train its "adversary" to detect fakes. As one model gets stronger, the other has to improve in order to compete.
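If you want to see that tug-of-war in code, here is a minimal sketch. It is not the code from the GAN paper. It assumes a toy setup where "real data" is just numbers clustered around 4, the generator is a straight line (`a * z + b`), and the discriminator is a single logistic unit. All function and parameter names here are made up for illustration.

```python
import math
import random


def sigmoid(t):
    # Numerically stable logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def generator(z, a, b):
    # Turns random noise z into a fake sample.
    return a * z + b


def discriminator(x, w, c):
    # Estimated probability that x is a real sample.
    return sigmoid(w * x + c)


def discriminator_step(x_real, x_fake, w, c, lr):
    # Gradient descent on -log D(real) - log(1 - D(fake)).
    d_real = discriminator(x_real, w, c)
    d_fake = discriminator(x_fake, w, c)
    grad_w = (d_real - 1.0) * x_real + d_fake * x_fake
    grad_c = (d_real - 1.0) + d_fake
    return w - lr * grad_w, c - lr * grad_c


def generator_step(z, a, b, w, c, lr):
    # Gradient descent on -log D(fake): nudge the fake toward "looks real".
    d_fake = discriminator(generator(z, a, b), w, c)
    grad_x = (d_fake - 1.0) * w
    return a - lr * grad_x * z, b - lr * grad_x


def train(steps=500, lr=0.05, real_mean=4.0, seed=0):
    # Alternate one discriminator update and one generator update per step.
    rng = random.Random(seed)
    a, b, w, c = 1.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        x_real = rng.gauss(real_mean, 0.5)  # assumed toy "real" data
        z = rng.gauss(0.0, 1.0)
        w, c = discriminator_step(x_real, generator(z, a, b), w, c, lr)
        a, b = generator_step(z, a, b, w, c, lr)
    return a, b, w, c
```

Calling `train()` returns the final generator and discriminator parameters. Toy setups like this one can oscillate instead of settling down, so don't expect the generator to land exactly on the real data.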
Suddenly we had AI models generating imagery we never thought possible before. The field progressed exponentially, generating more and more realistic content.
As image quality increased with time, so did the editability. New methods emerged, enabling users to modify local details of images, modify backgrounds, change the age and race of a person, or even generate full scenes.
The demos were cool, but running those models was a pain in the ass. It still involved cloning GitHub repos with complicated code. A few people dared to brave the waters by running pre-existing code, but the most cutting-edge approaches were still reserved for the experts.
Then Stable Diffusion changed the game.
You see, what Stable Diffusion nailed was the user interface.
Just like at Xerox back in 1973, a new paradigm has been born. The new models can generate pretty much anything from simple text prompts.
When paired with a rapid iteration environment like Colab or with a managed UI like DreamStudio, this enables users with zero experience in AI to quickly render whatever they want just by typing away.
Want to generate an image of a gorilla holding a sword? All you have to do is type that in the text box. No complicated code and no ML knowledge required.
Just like the Xerox GUI, you don’t need to be a pro to feel like you have superpowers from AI.
And just like the Altair, Stable Diffusion gives you everything you want all at once. You get the full model graph and code, along with the freedom to do as you please with it.
This openness enables users to quickly create new approaches and share them with the community. If you don’t know what you’re doing, someone on Twitter does, and there are hundreds of tutorials out there to help you get the most out of your model.
Do you want to turn off the NSFW filter? Maybe you want to generate an avatar? Or create a T-shirt? Someone has already done it and you can replicate it without thinking too much about it.
Here are some stickers I made for my laptop, using a prompt that I copied from Reddit:
Prompt: Sticker of an emo horse. Colorized, highly detailed, trending in artstation.
Novice users are able to take advantage of the full ecosystem and build things that are useful for their life and industry.
Power users, meanwhile, have access to the full graph and can take advantage of the model's immense representational power.
Finally, Stable Diffusion is cheap to use and retrain. You can run it on a single GPU or with Google Colab, enabling people to deploy Stable Diffusion-derived approaches anywhere, from Figma plugins to Shopify extensions.
This is AI being built by us, the users, and not by a large, soulless corporation that dictates what is Ethical AI and what’s not, where "good ethics" equates to the minimization of PR disasters.
We are seeing the masses interact and shape the future of AI, and god damn am I excited about it.
What to look out for 👀
- E-commerce explosion. Users will use AI to create their own custom products.
- New, more intuitive controls on top of Stable Diffusion. Think of an AI MS Paint.
- AI-generated game, animation and design assets, making it much cheaper to create games and movies.
📸 A search engine for AI prompts
Got excited about creating some AI content? Well, the AI generates only what you tell it to, via prompts.
I really wanted to see LeBron James wearing armor. Don't ask me why. Stop making it weird.
Anyways, someone probably already had that idea, and they spent a good amount of time trying to figure out how to get it to look cool.
Lexica.art is a search engine for prompts. Tell it what you want, and it will show you a vastitude of really cool images and the prompts used to generate them. Ctrl+C, Ctrl+V away!
- The first AI prompt search engine
- Prompts R Us 🐒
- Find cool images and the prompts to generate what you want
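Curious what "a search engine for prompts" could look like under the hood? Here is a toy sketch. To be clear, this is not how Lexica works (we don't know its internals). It assumes the simplest possible approach: count how many words the query shares with each stored prompt. The `PROMPTS` list, `tokenize` and `search` are all hypothetical, and apart from the emo horse prompt above, the entries are made up.

```python
import re

# Hypothetical mini-index. A real service would store far more prompts,
# each linked to the image it produced.
PROMPTS = [
    "Sticker of an emo horse. Colorized, highly detailed, trending in artstation.",
    "Portrait of a basketball player wearing medieval armor, dramatic lighting",
    "A gorilla holding a sword, digital painting",
]


def tokenize(text):
    # Lowercase the text and split it into a set of words.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query, prompts=PROMPTS, top_k=3):
    # Rank prompts by how many words they share with the query.
    query_words = tokenize(query)
    scored = []
    for prompt in prompts:
        overlap = len(query_words & tokenize(prompt))
        if overlap:
            scored.append((overlap, prompt))
    # sort() is stable, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [prompt for _, prompt in scored[:top_k]]
```

Something like `search("gorilla sword")` hands back the matching prompt, ready to copy into your image generator of choice.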
Until next time ✌️
P.S. Want to write for Bot Eat Brain? Know someone who'd be perfect to join the writing team? We're looking for writers and anyone can try out.