Is my comment wrong though? Another possibility is that Grok is given an example of searching for Elon Musk’s tweets when it is presented with the available tool calls. Just because it outputs the system prompt when asked does not mean that we are seeing the full context, or even the real system prompt.
Posting blog guides on how to code with ChatGPT is not expertise on LLMs. It’s like thinking someone is an expert mechanic because they can drive a car well.
“This blogger” is Simon Willison, who has been doing LLM benchmarks and other LLM-related work since before it was cool.
Not a random Substack grifter.