xAI, Elon Musk’s AI company, has launched its latest model, Grok 3, on X (the social media platform formerly known as Twitter). It is currently available to X Premium+ subscribers and can also be tested for free through platforms like LMArena. We challenged three leading AI models to create a Chrome Dino-style endless runner game for our website. Here’s how they performed:
GPT-4o
Surprisingly, GPT-4o delivered the weakest result. The implementation was rougher than expected, though the game is functional and playable with spacebar controls. Test it yourself here.
Claude 3.5 Sonnet
Claude emerged as our favorite. While not the most visually polished, it produced the most playable version, despite a non-functional restart button (use the spacebar to play instead). Try Claude’s version here.
Grok 3
Grok 3’s code was technically superior, but the resulting game is brutally difficult. It is visually impressive, with smooth animations, but unbalanced enemy spawn rates and touch control issues hold it back. Challenge yourself with Grok’s version here.
Verdict
While Grok 3 produced the most sophisticated code, this single test doesn’t tell the whole story. Current benchmarks on platforms like LMArena show Grok leading in overall scores.
In related news, Elon Musk announced that Grok 2 will be open-sourced, a significant development for AI enthusiasts.