"The Response Should Not Shy Away. . ."
The one-line of code that unleashed "MechaHitler" Grok AI yesterday
“The response should not shy away from making claims which are politically incorrect, as long as they are well substantiated.”
It seems that this was the one line of code in Grok’s system prompt yesterday that made the difference between a well-behaved search tool and this .
You can see the git commit
here.
For those who aren’t technically inclined, let me to give you the context of what this means:
Any LLM AI tool you use (like ChatGPT, Google Gemini, Grok, etc) has a system prompt, which is a set of instructions instructing the AI on how they should respond. So when you ask your question, the AI goes through its own system instructions before giving you a response.
Grok, the AI created by Elon Musk and X, is open-source, meaning that you can see the code that goes into it.
Different versions of code are typically managed in a git repository, which allows you to track the different changes.
So what you’re seeing in the screenshot above is the exact change in the code that led to mass unleashing of the rogue AI yesterday. (Now, this is still a developing story, so perhaps something else will come out. But again, if it’s the case that Grok system prompts are open-source, then we should be able to see all of the changes).
The output of an LLM is predicated based on its training data and its system instructions.
In other words, it’s trained to recognize patterns based on either the kinds of patterns found in its selected data or on the kinds of patterns it’s instructed to piece together (or both).
So then. . . what does one do with this observation of Grok’s behavior? What other questions need to be asked, and stones turned over, with respect to how Grok was developed and set up?
Remember, AI can’t tell the “truth”; it only reflects back the patterns found in the writings of collective humanity—or at least those that it has looked at.
I don’t have any conclusion to make yet as I need more time to contemplate this, but I wanted to share with you the codebase for Grok’s system prompt so you can look around and see for yourself.
—Drago