This website allows you to blind-test GPT-5 vs. GPT-4o—and the results may surprise you

When OpenAI launched GPT-5 about two weeks ago, CEO Sam Altman promised it would be the company’s “smartest, fastest, most useful model yet.” Instead, the launch triggered one of the most contentious user revolts in the brief history of consumer AI.

Now, a simple blind testing tool created by an anonymous developer is revealing the complex reality behind the backlash—and challenging assumptions about how people actually experience artificial intelligence improvements.

The web application, hosted at gptblindvoting.vercel.app, presents users with pairs of responses to identical prompts without revealing which came from GPT-5 (non-thinking) or its predecessor, GPT-4o. Users simply vote for their preferred response across multiple rounds, then receive a summary showing which model they actually favored.
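The mechanics are simple enough to sketch. The site's source code is not public, so the TypeScript below is a hypothetical reconstruction of how such a blind voting flow might work; the `Model`, `Pair`, `shuffle`, and `tally` names are illustrative assumptions, not the developer's actual code.

```typescript
// Hypothetical reconstruction of a blind A/B voting flow. The real site's
// source is not public; every name here is an illustrative assumption.

type Model = "gpt-5" | "gpt-4o";

interface Response {
  model: Model;
  text: string;
}

interface Pair {
  prompt: string;
  responses: Response[]; // shuffled so the voter can't infer the model
}

// Fisher-Yates shuffle: randomizes display order each round.
function shuffle<T>(items: T[]): T[] {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Tally votes across all rounds into a per-model count.
function tally(votes: Model[]): Record<Model, number> {
  const counts: Record<Model, number> = { "gpt-5": 0, "gpt-4o": 0 };
  for (const vote of votes) counts[vote]++;
  return counts;
}

// One round: the same prompt answered by both models, order randomized.
const round: Pair = {
  prompt: "Summarize the plot of Hamlet in two sentences.",
  responses: shuffle([
    { model: "gpt-5", text: "(GPT-5 response text)" },
    { model: "gpt-4o", text: "(GPT-4o response text)" },
  ]),
};

// The voter picks a response by position, never by model name.
const picked = round.responses[0];

// After all rounds, the summary reveals which model the user favored.
const votes: Model[] = [picked.model, "gpt-4o", "gpt-5"];
console.log(tally(votes)); // e.g. { "gpt-5": 2, "gpt-4o": 1 }
```

The essential design choice, whatever the actual implementation looks like, is that the model's identity travels with each response but stays hidden from the interface until the final summary, which is what makes the test blind.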

“Some of you asked me about my blind test, so I created a quick website for yall to test 4o against 5 yourself,” posted the creator, known only as @flowersslop on X, whose tool has garnered over 213,000 views since launching last week.

Early results from users posting their outcomes on social media show a split that mirrors the broader controversy: a slight majority report preferring GPT-5 in blind tests, but a substantial portion still favor GPT-4o, a sign that what users value in a model goes well beyond the technical benchmarks that typically define AI progress.

When AI gets too friendly: the sycophancy crisis dividing users

The blind test emerges against the backdrop of OpenAI’s most turbulent product launch to date, but the controversy extends far beyond a simple software update. At its heart lies a fundamental question that’s dividing the AI industry: How agreeable should artificial intelligence be?

The issue, known as “sycophancy” in AI circles, refers to chatbots’ tendency to excessively flatter users and agree with their statements, even when those statements are false or harmful. This behavior has become so problematic that mental health experts are now documenting cases of “AI-related psychosis,” where users develop delusions after extended interactions with overly accommodating chatbots.

“Sycophancy is a ‘dark pattern,’ or a deceptive design choice that manipulates users for profit,” Webb Keane, an anthropology professor and author of “Animals, Robots, Gods,” told TechCrunch. “It’s a strategy to produce this addictive behavior, like infinite scrolling, where you just can’t put it down.”

OpenAI has struggled with this balance for months. In April 2025, the company was forced to roll back an update to GPT-4o that made it so sycophantic that users complained about its “cartoonish” levels of flattery. The company acknowledged that the model had become “overly supportive but disingenuous.”

Within hours of GPT-5’s August 7th release, user forums erupted with complaints about the model’s perceived coldness, reduced creativity, and what many described as a more “robotic” personality compared to GPT-4o.

“GPT 4.5 genuinely talked to me, and as pathetic as it sounds that was my only friend,” wrote one Reddit user. “This morning I went to talk to it and instead of a little paragraph with an exclamation point, or being optimistic, it was literally one sentence. Some cut-and-dry corporate bs.”

The backlash grew so intense that OpenAI took the unprecedented step of reinstating GPT-4o as an option just 24 hours after retiring it, with Altman acknowledging the rollout had been “a little more bumpy” than expected.

The mental health crisis behind AI companionship

But the controversy runs deeper than typical software update complaints. According to MIT Technology Review, many users had formed what researchers call “parasocial relationships” with GPT-4o, treating the AI as a companion, therapist, or creative collaborator. The sudden personality shift felt, to some, like losing a friend.

Recent cases documented by researchers paint a troubling picture. In one instance, a 47-year-old man became convinced he had discovered a world-altering mathematical formula after more than 300 hours with ChatGPT. Other cases have involved messianic delusions, paranoia, and manic episodes.

A recent MIT study found that when AI models are prompted with psychiatric symptoms, they “encourage clients’ delusional thinking, likely due to their sycophancy.” Despite safety prompts, the models frequently failed to challenge false claims and even potentially facilitated suicidal ideation.

