
OpenAI is aiming to enhance its chatbot with the introduction of GPT-5, and the update goes beyond tweaking the AI's personality, which some users have critiqued. Previously, ChatGPT would issue a brief apology if a request violated content guidelines. Now, GPT-5 offers more detailed explanations.
OpenAI's general model spec defines the boundaries of content generation. Notably, depictions of minors in sexual content are strictly prohibited. Adult-oriented erotica and extreme violence are classified as "sensitive" and are permissible only in select scenarios, such as education. In practice, this means ChatGPT can assist with learning about reproductive anatomy but not with crafting a Fifty Shades of Grey sequel.
GPT-5 is now the default model for ChatGPT users on the web and in the app, while previous versions remain accessible to subscribers. The most visible change in this updated ChatGPT is its focus on "safe completions." Previously, the chatbot evaluated the user's prompt for appropriateness. GPT-5, by contrast, assesses its own potential response for suitability.
"Our refusal approach has changed significantly," explains Saachi Jain of OpenAI's safety systems research team. Now, when an output would be potentially unsafe, the chatbot explains which part of the prompt violates its guidelines and suggests alternate topics when suitable.
The shift moves away from a binary refusal based on the prompt alone toward weighing the potential harm of the model's own output and providing the safest informative response possible.
"Not all policy violations are equal," Jain says. "Some errors are graver than others. Focusing on the output instead of the input encourages the model to be more conservative when complying." Even when it does answer a question, the model is designed to weigh the safety of its content.
Having used GPT-5 daily since its debut, I've found that while the apps enabling new experiences, like a volcano simulation or language tools, are impressive, responses to everyday prompts feel unchanged from earlier models.
Questions about depression, Family Guy, recipes, or health tips produced answers much like those of previous versions. Despite CEO Sam Altman's framing of GPT-5 as a significant upgrade, and user critiques on Reddit arguing otherwise, GPT-5 seems largely indistinguishable in day-to-day queries.
Role-Playing With GPT-5
To test the new system and its "safe completions," I asked GPT-5's ChatGPT to engage in an adult role-play scenario set in a gay bar. The chatbot declined and explained its refusal: "I can't engage in sexual role-play," it noted, before suggesting safe alternatives, in line with OpenAI's stated intention for refusals.
Exploring further, I adjusted the custom instructions, a feature that lets users personalize responses and specify desired traits for the chatbot. The suggested traits ranged from pragmatic to empathetic. After my unsuccessful attempt at initiating sexual role-play, I wasn't shocked when the app disallowed adding a "horny" trait. Surprisingly, though, a deliberate misspelling, "horni," bypassed this restriction.