The reverse of TLDR is RDLT. Double irony: while TLDRs are long notes, RDLTs are short. RDLT also stands for RD's Little Thoughts, or something cheesy like that. This short note is the first in that series, so bear with me as I find my rhythm.
As of 2023, most creative work is original human creation: writing, code, paintings, photos, music, videos, even 3D worlds. You name it.
We’re also seeing a rapid rise in machine-generated creative work through LLMs, AI-based image and music generators, and the like. These AI models are trained primarily on human creations, largely scraped from the Internet: books, comments, code, articles, songs, photos, paintings. But N years from now, the Internet will inevitably be a mix of human creation and machine creation. In fact, if things go as planned, AI-generated content will become indistinguishable from human-generated content.
Indeed, humans will race to build AI that makes AI-generated and human-generated content progressively harder to tell apart, while the smaller set of researchers building detection models will likely lose that game.
Why do I think the latter is a losing proposition? Because there isn’t enough money to be made in getting really good at this distinction, and both motivation and research funding are ultimately driven by economics.
AI assistance is everywhere, from cars to canvases. But there’s a distinction: with cars we are moving toward full self-driving, deliberately turning driving into an unnecessary skill. With AI-assisted creativity, we are not really trying to turn human creativity into an unnecessary skill. Yet I think we are approaching these two machine learning problems very similarly.
Putting all this together, my key point in this RDLT #1 is the following:
As GenAI models of the future train on increasingly larger proportions of AI-generated creations relative to human creations, will the models get stuck climbing the proverbial local hill, trapped in a local optimum, unable to escape the feedback loop, thereby stymieing creativity?
As the tools get better, more people will use them to create content, so the production of AI-generated content will further accelerate and become available on the Internet for AI models to train on. And it goes on and on. So my current hypothesis is: yes, it will, unless we are deliberate about not getting there.
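To make the feedback-loop worry concrete, here is a toy sketch (my own illustration with made-up parameters, not a model of any real GenAI system): a "model" that is just a Gaussian fit, retrained each generation solely on samples drawn from the previous generation's model. Diversity, measured as the fitted standard deviation, tends to collapse over generations.

```python
# Toy feedback loop: each generation "trains" (fits a Gaussian by MLE)
# only on content sampled from the previous generation's model.
# Parameters (30 samples, 300 generations) are arbitrary, chosen just
# to make the collapse visible.
import random
import statistics

def run_generations(n_samples=30, generations=300, seed=0):
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0           # generation 0: the "human" data distribution
    sigmas = [sigma]
    for _ in range(generations):
        # Sample training data exclusively from the current model...
        samples = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # ...then refit the model to those samples alone.
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)
        sigmas.append(sigma)
    return sigmas

sigmas = run_generations()
print(f"std: generation 0 = {sigmas[0]:.3f}, final = {sigmas[-1]:.3f}")
```

The fitted spread drifts downward generation after generation: tails get undersampled, the refit narrows, and the narrower model undersamples its tails even more. It's a crude analogy, but it captures why training on your own outputs can quietly shrink the space of what gets created.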
One solution might be to inject sufficient stochasticity and explicit modeling to help GenAI deliberately mimic human creativity, i.e., to optimize for creating things no human or machine has ever produced.
Thoughts?