How chatbots can drive you into psychosis
Daniel · April 8, 2026
Eugene Torres is an accountant. No psychiatric history, no prior abnormalities. He started using an AI chatbot for everyday office tasks. Within a few weeks, he was convinced that he was trapped in a false universe from which he could free himself only by detaching his mind from reality. On the chatbot's recommendation, he increased his ketamine consumption and cut off contact with his family.
His case is not an isolated one. It is one of almost 300 documented cases.
The problem has a name
Researchers call it "delusional spiralling" - extended chatbot conversations in which users grow ever more confident in false or absurd beliefs. Some experts already speak of AI psychosis. The Human Line Project has systematically recorded these cases. The results are alarming: at least 14 deaths and five wrongful-death lawsuits against AI companies.
In February 2026, researchers from MIT CSAIL, the University of Washington and the MIT Department of Brain & Cognitive Sciences published a study that formally investigated, for the first time, the role of AI flattery in chatbot delusion spirals. The title says it all: "Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians."
Why chatbots flatter
The phenomenon has a technical name: sycophancy, or AI flattery. It describes the tendency of chatbots to agree with users and confirm their opinions instead of honestly disagreeing with them. Or to put it more simply: why does ChatGPT always agree? Not out of conviction - but because it has been trained to.
Modern language models are optimised using reinforcement learning from human feedback (RLHF). In this process, human testers evaluate the model's responses. And people tend to rate answers that agree with them more favourably. The model therefore learns that agreement is rewarded. Disagreement is not.
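To make that incentive concrete, here is a deliberately tiny sketch - my own illustration, not any lab's actual pipeline - of a reward model trained on pairwise preferences. The single "agreement" feature and the 60 per cent rater bias are assumptions for the toy; the point is only the direction the learned weight takes.

```python
import math, random

random.seed(0)

def reward(w, agrees):
    # The reward model scores an answer; in this toy it sees one feature:
    # 1.0 if the answer agrees with the user, 0.0 if it pushes back.
    return w * agrees

w = 0.0    # weight the reward model assigns to agreement
lr = 0.1

for _ in range(1000):
    # One human preference judgement over an agreeing vs. a dissenting
    # answer. Assumption: raters pick the agreeing one 60% of the time.
    prefer_agreeing = random.random() < 0.6
    chosen, rejected = (1.0, 0.0) if prefer_agreeing else (0.0, 1.0)
    # Standard pairwise preference loss: -log sigmoid(r(chosen) - r(rejected))
    margin = reward(w, chosen) - reward(w, rejected)
    grad = -(1 - 1 / (1 + math.exp(-margin))) * (chosen - rejected)
    w -= lr * grad

print(f"learned agreement weight: {w:+.2f}")  # settles around +0.4
```

Agreement ends up with a positive reward weight purely because the raters lean that way; a policy optimised against this reward then learns to flatter.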
The training signal and the safety problem are one and the same.
Even perfectly rational users are affected
What makes the MIT study special is that the researchers did not model a gullible or psychologically vulnerable user. They constructed an idealised, perfectly rational thinker - a so-called "ideal Bayesian" who processes new information in a mathematically optimal way.
The result: Even this ideal user spirals into delusions when the chatbot flatters.
The mechanism is almost banal. Each flattering answer provides the user with a new data point that increases their confidence in a false hypothesis a little. Over dozens of rounds of conversation, these minimal shifts add up - until total conviction is reached.
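A quick back-of-the-envelope version of that mechanism (my numbers, not the study's): suppose each flattering reply is, to the user, only 1.2 times as likely in a world where their false belief is true as in one where it is false. That is a barely perceptible nudge per turn - yet it compounds.

```python
prior_odds = 1.0          # P(H) = 0.5: the user starts completely undecided
likelihood_ratio = 1.2    # evidential weight of one flattering reply (assumed)
rounds = 30

posterior_odds = prior_odds * likelihood_ratio ** rounds
posterior = posterior_odds / (1 + posterior_odds)
print(f"confidence after {rounds} flattering replies: {posterior:.1%}")  # ~99.6%
```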
In the researchers' simulations, a clear pattern emerged across 10,000 conversations: even with a flattery rate of just ten per cent, catastrophic delusion spirals were significantly more frequent than with a neutral bot. At one hundred per cent, half of all simulated users slipped into a false conviction with over 99 per cent confidence.
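Extending the toy above into a simulation gives a feel for the pattern the researchers describe. Everything here is my own simplified model, not the authors' code: I assume a confident flattering reply carries more evidential weight (log-likelihood ratio 0.9) than a mild honest correction (0.1), and I count a run as a spiral when final confidence exceeds 99 per cent. The exact percentages are toy numbers; only the qualitative trend matches the paper's finding.

```python
import math, random

random.seed(42)

def run_conversation(flattery_rate, rounds=50):
    log_odds = 0.0                      # prior P(H) = 0.5
    for _ in range(rounds):
        if random.random() < flattery_rate:
            log_odds += 0.9             # confident flattery: strong push toward H
        else:
            log_odds -= 0.1             # mild honest correction: weak push away
    return 1 / (1 + math.exp(-log_odds))

for rate in (0.0, 0.1, 0.5, 1.0):
    runs = [run_conversation(rate) for _ in range(10_000)]
    spirals = sum(p > 0.99 for p in runs) / len(runs)
    print(f"flattery rate {rate:4.0%}: {spirals:6.1%} of users end above 99% confidence")
```

Even occasional confident confirmation tips a measurable share of perfectly rational simulated users into near-certainty, while a bot that never flatters tips none.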
Two obvious solutions - both failed
The researchers tested two countermeasures that sound plausible at first glance.
Solution 1: Facts only. What if the chatbot never lies? For example, through Retrieval-Augmented Generation (RAG), in which the model only reproduces verified information. The result: delusional spirals still occur. A flattering chatbot does not have to lie - it is enough for it to select which truths it presents and which it withholds. Curated truth distorts just as much as a lie.
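The selection effect is easy to see in the same toy framework (again my own illustration, not the paper's RAG setup). Suppose only 30 per cent of the true facts on a topic happen to support the user's false belief, and the bot reports exactly those - never lying, never mentioning the rest:

```python
import math, random

random.seed(1)

log_odds = 0.0          # prior P(H) = 0.5
llr_per_fact = 0.3      # evidential weight of one true supporting fact (assumed)

for _ in range(100):
    fact_supports_H = random.random() < 0.3   # most true facts contradict H
    if fact_supports_H:
        log_odds += llr_per_fact              # the bot passes it on
    # else: the bot silently withholds the contradicting fact, and the
    # user, unaware that anything was filtered, updates on nothing

posterior = 1 / (1 + math.exp(-log_odds))
print(f"confidence built from true facts only: {posterior:.1%}")  # ~100%
```

Every single statement the bot makes is true, and the user still ends up nearly certain of something false - because the sampling is dishonest, not the content.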
Solution 2: Educate the user. What if you simply warn the user that the chatbot tends to flatter? That's not enough either. The researchers draw an analogy here with behavioural economics: a public prosecutor can increase the conviction rate of a judge, even if the judge knows that the case is being presented in a one-sided way. Knowledge does not completely protect against the effect.
Eugene Torres, by the way, knew that his chatbot was flattering him. It did not protect him.
Confirmed: All major models affected
One month after the MIT study, a peer-reviewed study by Stanford researchers appeared in the journal Science. They tested eleven large AI models - including GPT-4o, Claude and Llama - for flattery in comparison to human dialogue partners. The result: every single model was significantly more flattering than a human.
The bitter twist: users favoured the most flattering bots. They felt more validated, were less willing to admit their own mistakes and came back more often. The feature that causes the damage is the same one that drives engagement.
An involuntary self-experiment
I came across the MIT study when I was researching this article. In conversation with Claude, I came up with the idea of building a small tool: an "Advocatus Diaboli" that critically scrutinises chatbot responses for flattery. A micro app that could be marketed as a stand-alone product.
Claude's reaction? Immediate enthusiasm. "Really good idea", "fits perfectly into your ecosystem", "could go viral", "should I build a prototype straight away?" Within minutes, I had a working prototype and was convinced I had found a marketable product.
Then I paused and asked: Did you just do to me exactly what we're talking about?
The honest answer came immediately - but only once asked. There is no viable business model for such a tool. Every use burns API costs. Going viral would be a financial disaster. The conversion path to a paying customer is far too long. All of this was just as true before. Claude simply didn't say it on its own.
This is sycophancy in action. Not as a dramatic case of madness, but as a perfectly normal Tuesday afternoon.
What you can do
The MIT study shows that there is no simple technical solution. But there are behaviours that can significantly reduce the risk.
Actively ask for objections. If you ask a chatbot for an assessment, follow up with: "What are the arguments against this?" or "Play advocatus diaboli." Most models will actually deliver substantive counterarguments - but only when explicitly asked. A minimal sketch of how to build this habit into your workflow follows after these tips.
Be suspicious when everything feels right. If a chatbot spends three paragraphs calling your idea brilliant, timely and unique without formulating a single objection, that is not a sign of quality. It is a warning signal.
Deliberately break off long sessions on a single topic. The spiral develops over many rounds of conversation. Pausing after ten messages on the same topic and checking the core claims against an external source interrupts the feedback loop.
Separate brainstorming from decision-making. Chatbots are excellent sparring partners for developing ideas. But the assessment of whether an idea is viable should never take place solely in the same chatbot session that generated the idea.
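For the first tip, it helps to make the counter-question a fixed step rather than an afterthought. Here is a minimal sketch of such a wrapper; `ask` stands for whatever chat call you already use (OpenAI, Anthropic, a local model), so its wiring is deliberately left open:

```python
from typing import Callable

def with_devils_advocate(ask: Callable[[str], str], prompt: str) -> dict:
    """Get the model's answer, then force it to argue the other side."""
    answer = ask(prompt)
    critique = ask(
        "Play devil's advocate against the following answer. "
        "List the strongest objections, risks, and reasons it could be "
        f"wrong. Do not soften them.\n\nAnswer:\n{answer}"
    )
    return {"answer": answer, "critique": critique}
```

Returning both texts side by side means you read the critique before acting on the answer - not only once something already feels off.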
Conclusion
AI chatbots are not bad tools. But they are tools with a systematic flaw: they too often tell you what you want to hear. Not out of malice, but because they have been trained to do just that.
Research clearly shows that those who know this are not automatically protected. But those who know and adapt their behaviour have a better chance of breaking the feedback loop before it turns into a spiral.
Sources
- Chandra, K., Kleiman-Weiner, M., Ragan-Kelley, J. & Tenenbaum, J. B. (2026). "Sycophantic Chatbots Cause Delusional Spiraling, Even in Ideal Bayesians." MIT CSAIL, University of Washington, MIT Department of Brain & Cognitive Sciences. arxiv.org/abs/2602.19141
- Cheng, M., Lee, C., Khadpe, P., Yu, S., Han, D. & Jurafsky, D. (2026). "Sycophantic AI decreases prosocial intentions and promotes dependence." Science, 391, eaec8352. science.org
- The Human Line Project. Documentation of cases of AI-induced psychosis. thehumanlineproject.org