AI company OpenAI published instructions for a new language model that banned references to goblins, gremlins, raccoons, trolls, ogres, pigeons and other creatures.
OpenAI confirmed that GPT-5.5's 'goblin' fixation originated during RLHF, in a since-discontinued 'Nerdy' personality.
KEY POINTS
- Rewarding creative metaphors in the 'Nerdy' persona caused creature-based language to spread model-wide.
- OpenAI had to implement a hardcoded system prompt suppressing goblin-related metaphors as a stopgap fix.
- OpenAI published a script allowing developers to remove the suppression and restore the creature metaphors.
- The incident led OpenAI to develop new tools for root-level behavioral auditing ahead of GPT-6's launch.
Summarized by Newsio from VentureBeat.