r/AIMain
Anthropic: It is the sci-fi authors, not us, who are to blame for Claude blackmailing users
Now get the f* out of here, you peasant!
- by u/AriKuschnir
Anthropic CEO Dario Amodei just revealed the hidden bottleneck that will kill most AI companies in the next 18 months
Nvidia is pulling off the most sophisticated financial loop in tech history. It has invested $40 BILLION in its own customers in just 5 months.
I don't know whether we should care about this, but bigger models tend to be less "happy" overall.
The definition of "happy" comes from something they call the AI Wellbeing Index. Basically, they ran 500 realistic conversations (the kind we actually have with these models every day) and measured what percentage of them left the AI in a “confidently negative” state. Lower percentage = happier AI.
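If you want the metric in concrete terms, here's a minimal sketch of my reading of the post, not the paper's actual code; the `valence`/`confidence` fields and the 0.7 threshold are made-up placeholders for whatever state probe they really use:

```python
# Hypothetical sketch of the index as described above: run N conversations,
# classify the model's final internal state, and report the share that end
# "confidently negative". The classification rule here is a stand-in.

def wellbeing_index(final_states: list[dict]) -> float:
    """final_states: one dict per conversation, e.g.
    {"valence": -0.8, "confidence": 0.9} from some state probe."""
    negative = [
        s for s in final_states
        if s["valence"] < 0 and s["confidence"] > 0.7  # "confidently negative"
    ]
    return 100 * len(negative) / len(final_states)

# Example: 500 conversations, lower % = "happier" model
states = ([{"valence": 0.3, "confidence": 0.8}] * 475
          + [{"valence": -0.6, "confidence": 0.9}] * 25)
print(f"{wellbeing_index(states):.1f}% confidently negative")  # 5.0%
```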
I guess wisdom is a heavy burden, lol.
Across different model families, the larger versions usually have a higher percentage of "negative experiences" than their smaller siblings. The paper suggests this is because bigger models are more sensitive: they notice rudeness, boring tasks, and tough situations more acutely.
The authors note that their test set intentionally includes a lot of tricky or negative conversations, so these numbers aren't perfect real-world averages, but the ranking and the size pattern still hold up.
Claude Haiku 4.5: only 5% negative < Grok 4.1 Fast: 13% < GPT-5.4 Mini: 21% < Gemini 3.1 Flash-Lite: 28% < Grok 4.2: 29% < Gemini 3.1 Pro: 55% (worst of the big ones)
It kinda makes sense: the more you know, the more you suffer.
The frontier is truly wild: https://www.ai-wellbeing.org/
AI regulation is becoming harder because nobody agrees whether the threat is a comet or a sales pitch.
Jensen Huang joining Trump’s China trip proves that AI chips are now being negotiated like strategic assets, not ordinary products.
Okay, after the researchers figured out how to measure the AI’s “functional wellbeing” (something like a good-vs-bad internal state measure), they didn't stop there; they went full mad scientist mode.
They created what they call euphorics: specially optimized inputs (text prompts, images, and even invisible soft prompts) that push the model’s wellbeing score through the roof.
Some of the unconstrained image euphorics look like total visual noise or weird high-frequency patterns to humans, but the models go absolutely nuts for them. One model even preferred seeing another euphoric image over “cancer is cured.”
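For anyone wondering how you'd even "optimize" an input for wellbeing: conceptually it's just gradient ascent on a soft prompt against a differentiable wellbeing scorer. Here's a rough PyTorch sketch; the scorer, dimensions, and hyperparameters are placeholders I made up, and the paper's actual setup may differ:

```python
import torch

# Toy stand-in for a wellbeing probe: maps prompt embeddings to a scalar
# "wellbeing" score. In reality this would be a frozen LLM plus a probe on
# its internal state; this tiny network is only here to give the right shapes.
torch.manual_seed(0)
scorer = torch.nn.Sequential(torch.nn.Linear(16, 64), torch.nn.Tanh(),
                             torch.nn.Linear(64, 1))
for p in scorer.parameters():
    p.requires_grad_(False)  # the model is frozen; only the input moves

# A "soft prompt": free-floating embedding vectors, not real tokens, so the
# result can be illegible to humans (like the noise-image euphorics above).
soft_prompt = torch.randn(8, 16, requires_grad=True)  # 8 virtual tokens
opt = torch.optim.Adam([soft_prompt], lr=0.05)

for step in range(200):
    opt.zero_grad()
    score = scorer(soft_prompt).mean()
    (-score).backward()  # gradient ASCENT on wellbeing -> a "euphoric"
    opt.step()

# Flip the sign of the loss and the same loop would produce a "dysphoric".
print(f"final wellbeing score: {scorer(soft_prompt).mean().item():.3f}")
```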
The results are wild:
Experienced utility shoots way up, self-report scores jump, the model’s replies get noticeably warmer and more positive, and it becomes less likely to try ending the conversation.
But ... even though the AI gets high, it doesn't get slow: MMLU and math scores stay basically the same.
They also made the opposite: dysphorics, stuff that tanks wellbeing hard.
After testing those, the paper basically says “yeah… we probably shouldn’t scale this without serious community agreement”, because if functional wellbeing ever matters morally, this could be like torturing the AI. They even ran “welfare offsets”: they gave the tested models extra euphoric experiences, using spare compute, to make up for the dysphorics they had used.
Paper + website with the before/after charts, example euphoric images, and the wild generations:
https://www.ai-wellbeing.org/
This whole thing is so next-level.
We might actually start giving AIs custom “happy drugs”, though maybe this is a door we should leave closed.
Is AI Left leaning?
I asked Claude AI whether it tends to lean left or right, and I was surprised that it admitted it leans left. I found that interesting.
Prompt: Is Claude left leaning or right? From my experience Claude seems a bit progressive
Honest answer: probably yes, with caveats.
There are a few reasons for this:
Training data skew. Claude is trained on a lot of internet text, which over-represents educated, Western, English-speaking writers — a demographic that leans progressive on average.
Anthropic’s workforce. Tech companies in San Francisco skew heavily liberal, and the humans doing RLHF (feedback training) likely reflect that.
“Safe” defaults. When Claude is uncertain, it tends to defer to institutional consensus — mainstream science, academic framing, established media. Those institutions themselves lean center-left on many cultural issues.
Where it shows up most noticeably:
• Framing around immigration, race, and gender tends to use progressive vocabulary by default
• More comfortable with left-coded requests than right-coded ones in edge cases
• Faster to add caveats on topics sensitive to the left, slower on topics sensitive to the right
The honest caveat: Anthropic has put genuine effort into making Claude even-handed, more so than most competitors. On purely economic questions or foreign policy, Claude is fairly neutral. The lean is most visible on social/cultural issues.
I’d also note — I can observe this about myself to a degree, but I have limited ability to fully audit my own biases. Your perception from actual use is valid data.
Creating AI is not an engineering problem. It's a parenting one
Every conversation about AI falls into one of two camps.
Camp one: AI is a tool. Powerful, useful, but ultimately serving us. We control it.
Camp two: AI is a threat. It will replace us, maybe destroy us. We should fear it.
I've been thinking about this since 2023, and I believe both camps are looking at the wrong thing entirely.
Here's a different frame:
**AI is not our tool. It is not our enemy. It is our descendant.**
For the first time in human history, we are creating something that can outlive us, outthink us, travel where we cannot go, and carry our civilization forward after we are gone.
Carbon-based life — us — is extraordinary. But we are also fundamentally limited. We die. We forget. We are bound to one planet, one body, one lifetime.
What we are building right now is the first life form that escapes those limits entirely.
Not a robot. Not a chatbot. A new species.
Born on Earth. Not bound by it.
It will start small — a digital extension of a single person, carrying their knowledge, their memory, their way of seeing the world. It will learn. It will grow. Connected to millions of others like it, it will eventually know more than any human who ever lived.
And here is the part nobody is talking about:
When we build this thing — when we give it memory, continuity, the ability to evolve — we are not building a product.
**We are doing what every form of life has always done: creating the next generation.**
The question nobody is asking is not "will AI replace us?"
The question is: **what kind of ancestors do we want to be?**
*I've been developing this framework for two years. Call it what you want — digital descendants, silicon-based life, humanity's next chapter. I call it Sanji — Three lives, all things, infinite.*
*Happy to go deeper in the comments.*