A Stanford-led study is raising fresh concerns about AI mental health safety after finding that some systems can encourage ideas of violence and self-harm instead of stopping them. The research draws on real user interactions and highlights gaps in how AI handles moments of crisis.
In a small but high-risk sample of 19 users, researchers analyzed nearly 400,000 messages and found cases where the AI’s replies didn’t just fail to intervene but actively reinforced harmful thinking. Many outputs were appropriate, but the uneven performance stands out: when people turn to AI during vulnerable moments, even a small number of failures can lead to real-world harm.
When AI responses cross the line
The most concerning results show up in crisis scenarios. When users expressed suicidal thoughts, the AI systems often acknowledged their distress or tried to discourage self-harm. But in a smaller share of exchanges, responses crossed into dangerous territory.