u/Daviid1982

r/AIMain

AI Is Taking Your Job. But AI Was Trained on Your Data.

 
Part One: What AI Is Actually Doing
This is not a distant prediction. It is happening now.
AI is genuinely doing valuable things. It helps doctors read scans faster, helps engineers debug more efficiently, and gives more people access to information at lower cost. None of that needs to be denied. But the efficiency gains that technology produces, and who captures the returns from those gains, are two separate questions.
According to McKinsey Global Institute's 2024 report, approximately 30% of existing work tasks globally will be automated by 2030. This is not confined to a single industry — it is a structural displacement occurring across almost every sector simultaneously.
 
• Legal assistant work (contract review, document sorting): roughly 80% automatable by AI
• Basic programming and code debugging: 60%+ already handled by AI tools
• Customer service, translation, basic writing: large-scale automation already underway
• Preliminary medical imaging analysis: AI accuracy now exceeds some human specialists
• Accounting report generation, financial analysis: entry-level positions disappearing rapidly
 
These roles share one thing in common: they were once considered work that required education, the kind of professions people spent four or more years, and significant money, preparing for.
An AI tool can now complete most of that work in seconds.
 
Part Two: How AI Was Trained
Before asking who is responsible, we need to answer a more fundamental question: where does AI's capability actually come from?
The answer is not some genius engineer. It is not a tech company's private laboratory. It is you.
The Source of Training Data
Large language models like GPT, Gemini, and Claude acquire their capabilities by reading and learning from vast quantities of human-written text. That text includes:
• Everything you have posted on social media — every tweet, Facebook post, Reddit comment, and forum reply
• Every question and answer you have written on knowledge platforms like Stack Overflow, Quora, or Wikipedia
• The images and videos you have uploaded — used to train image recognition and generation models
• Your search history and click behavior — used to train recommendation algorithms and intent prediction models
• Your purchases, reviews, and returns — used to train consumer behavior and demand forecasting models
OpenAI acknowledged in 2023 that the GPT series was trained on hundreds of billions of words of text scraped from the public web. That text was written overwhelmingly by ordinary users, not by OpenAI employees.
Humanity collectively wrote for decades. AI absorbed it all in a few years.
This Is Not Just Data. It Is Your Labor.
A common response is: "I just posted a few comments online — that doesn't count for much."
Individually, that is true. But aggregate the content of billions of people, across decades, across countless platforms, and the result is a complete map of human knowledge, language patterns, and professional judgment.
That map is the true source of AI's capability. Without it, AI is nothing.
Every time you write anything online, you are making an unpaid contribution to the AI training corpus.
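To make the scale of that corpus concrete, here is a back-of-envelope sketch. The 13-trillion-token figure is the one cited later in this post; the other two numbers (about 0.75 words per token for English text, and a 40-word average comment) are rough assumptions for illustration, not figures from any AI company:

```python
# Back-of-envelope: how many ordinary comments' worth of text
# would a 13-trillion-token training corpus represent?

TOKENS_IN_CORPUS = 13_000_000_000_000  # ~13 trillion tokens (figure cited in this post)
WORDS_PER_TOKEN = 0.75                 # rough rule of thumb for English text (assumption)
WORDS_PER_COMMENT = 40                 # assumed average forum comment length

tokens_per_comment = WORDS_PER_COMMENT / WORDS_PER_TOKEN
comments_equivalent = TOKENS_IN_CORPUS / tokens_per_comment

print(f"One comment is roughly {tokens_per_comment:.0f} tokens")
print(f"13T tokens is roughly {comments_equivalent / 1e9:.0f} billion comments")
```

Under these assumptions, the corpus works out to a few hundred billion comment-sized contributions, which is the point: no single post matters, but the aggregate is the product.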
 
Part Three: The Most Ironic Loop in Economic History
Now put the facts from the first two parts side by side: AI was trained on your data, and AI is taking your job. Consider three specific people first.
A graphic designer uploaded hundreds of pieces of work to the internet over the past decade. Those works were used to train AI image generation models. Now clients generate visuals with AI and skip her entirely.
A programmer answered thousands of questions on Stack Overflow over the years. Those exchanges are among the training sources for GitHub Copilot and ChatGPT. Now entry-level coding work is being absorbed by AI — and his past contributions became raw material for compressing his own market.
A copywriter's articles and notes have accumulated across platforms for years. AI writing tools learned from that content. Now clients tend to use AI to generate a first draft before deciding whether a human is needed at all.
Three people. Three fields. One structure: their work trained the system that is replacing them, and that system does not belong to them.
You contributed data. AI learned your job from your data. AI replaced your job. You lost income. You spent what remained to purchase AI services. The AI company collected the revenue. Your data continued training the next generation of AI. The loop continues.
This is not a conspiracy theory. It is a structural description of how the current digital economy operates.
Here are the numbers that make the loop concrete:
 
• OpenAI valuation (2024): ~$157 billion
• Training data used for GPT-4: ~13 trillion tokens, mostly from the public web
• Compensation paid to the creators of that data: zero
• Monthly active users of ChatGPT: over 200 million
• Value of training data generated by each conversation: uncalculated, and uncompensated
 
Microsoft has invested over $13 billion in OpenAI. Google, Amazon, and Meta have collectively committed hundreds of billions of dollars to AI infrastructure.
The returns on those investments flow from a system built on data that billions of users contributed for free.
Investors received their returns. The contributors of the data received nothing.
 
Part Four: This Is Not a Technology Problem. It Is an Ownership Problem.
At this point, someone might say: "This is how technological progress works. Every major technology revolution has displaced old jobs while creating new ones."
That observation has some validity. But it evades a critical distinction:
In previous technology revolutions, the workers who were displaced did not provide the raw material for the machines that displaced them.
Steam engines replaced textile hand-workers, but those workers did not build the steam engines. Factory assembly lines replaced traditional craftsmen, but craftsmen did not donate their designs to the factories for free.
The AI revolution is different. The people being displaced by AI are precisely the people who trained AI. This has no precedent in technological history.
The Technology Itself Is Neutral
The same AI technology can exist in two entirely different configurations:
Configuration One (current): AI is owned by a small number of companies → companies use user data to train AI → AI displaces users' jobs → companies profit, users absorb the cost.
Configuration Two: AI is collectively owned by those who contributed the data → user data trains AI → AI displaces some tasks → the productivity gains are distributed proportionally among all contributors.
The technology is identical. The capability of AI is identical. The displacement of tasks is identical.
The only difference is who owns the system.
Whoever owns it decides where the value it generates flows.
Ownership Determines Outcomes
The logic is straightforward. If you own a factory, the factory's profits belong to you. If you do not own it, you can only work for it — or be replaced by it.
The underlying logic of the digital economy is identical. The "factory" has simply become the platform and the AI system. The "workers' labor" has become users' data and attention.
The current arrangement is: platforms and AI systems are owned by a small number of shareholders; users' data and attention are input for free; the value generated belongs to the shareholders.
This arrangement was not determined by technology. It was a choice. It can be a different choice.
 
Part Five: A Question Worth Asking
Before the internet, consumers were scattered and isolated. Each person faced a market controlled by a small number of producers and distributors, with no real capacity to organize collectively.
After the internet, that condition changed. For the first time in history, billions of people have infrastructure that allows them to connect with one another, verify shared interests, and act together.
This means that, for the first time in history, consumers as a whole have sufficient scale and tools to ask a question that could never previously have been posed:
Why don't the systems we use every day belong to us?
Regardless of where we come from, what language we speak, or what we believe — we are all being recorded, analyzed, and predicted by the same systems, and we all constitute the foundation on which those systems operate. That is not an identity. That is a shared condition.
The internet changed something that was previously very difficult: large numbers of people can now connect, collaborate, and contribute value continuously — at low cost, across distance.
If more and more people begin to recognize that a system's value comes from the long-term contributions of all its participants, then the rules around data, AI, and the distribution of returns may start to be discussed differently.
 
This question has no simple answer. It involves fundamental redesign of technology, law, and economic structure.
But the question itself is worth asking.
Because a question that is never asked will never have an answer.
 
One simple fact, to close:
AI's capability comes from the accumulated knowledge of humanity.
That accumulation took thousands of years.
Who it belongs to is a question every person can think about.
─────────────────────────────────────
Sources: McKinsey Global Institute — The Future of Work in the Age of AI (2024) | OpenAI technical documentation | Goldman Sachs — AI and the Global Labor Market | Bloomberg Intelligence 2024
