Anthropic's Mythos model scored 78 on a coding benchmark where GPT-5.4 scored 58.
Anthropic's Mythos Beat GPT-5.4, Found 27-Year Bug & NVIDIA's $500K Rule | GodModePod EP08
Anthropic's secret Mythos model scored 78 on a coding benchmark where GPT-5.4 scored 58 — and it found a 27-year-old bug that hardened OpenBSD missed for nearly three decades.
God Mode Podcast
This episode
Cast
- NVIDIA CEO cited for the claim that a $250K engineer not spending $500K on AI tokens is underperforming.
- OpenAI CEO mentioned in the context of the TBPN acquisition and OpenAI's media strategy.
- Central company of the episode, discussed for Mythos launch, OAuth cancellation, managed agents, revenue growth, and compute constraints.
- Praised for its diversified AI strategy across Gemma, Gemini, Vids, and image models, and highlighted as potentially the quiet winner of the AI race.
- Discussed for its ad revenue ambitions, TBPN acquisition, ChatGPT distribution strategy, and being overtaken by Anthropic in revenue.
- Discussed for its surging stock (+30% in 30 days) and new chip manufacturing partnerships with Google and xAI/SpaceX/Tesla.
- Silicon Valley tech podcast acquired by OpenAI for approximately $200 million, just one year after its launch.
- Referenced for having an internal leaderboard that rewards employees who spend the most on AI API tokens.
- Referenced through CEO Jensen Huang's claim that engineers should spend at least $500K in AI API token credits per year.
- Named as one of 20 strategic partners given early access to Anthropic's Mythos model under Project Glasswing.
- Referenced as an example of a breakthrough architectural approach (distillation and novel training) that enabled a step-change in model efficiency.
- Anthropic's leaked next-generation model, scoring 78 on a coding benchmark vs GPT-5.4's 58, and used in Project Glasswing to find cybersecurity vulnerabilities.
- Open-source framework allowing users to run Claude-powered bots via Telegram and WhatsApp; effectively disabled when Anthropic cancelled OAuth access.
- Google's open-weight model that runs offline on phones, jumping from 29% to 80% on coding benchmarks and 20% to 89% on math in one generation.
- Discussed as OpenAI's consumer-facing product being used as a distribution play, with free users likely to be monetized through advertising.
- Google's video generation tool (powered by Veo 3) opened for free to all users, directly competing with OpenAI's Sora.
- Chinese open-source model described as the first to genuinely approach Anthropic's Opus in coding capability.
- Hardened open-source operating system in which Mythos discovered a 27-year-old security vulnerability under Project Glasswing.
This episode
Claims & Sources
Factual claims made in this episode, and whether a source was named.
Mythos preview found thousands of high-severity vulnerabilities across every major operating system and web browser, including a 27-year-old bug in OpenBSD.
Anthropic accidentally left a draft blog post about Mythos in an unsecured, publicly searchable data store, which Fortune discovered on March 26.
Anthropic released Mythos preview to 20 strategic partners including AWS, Apple, Broadcom, Cisco, Google, and Microsoft on April 7.
Mythos is approximately 50% more powerful than Anthropic's previous flagship Opus model and scores roughly 25 percentage points higher on benchmarks.
Jensen Huang stated that a $250,000-per-year software engineer who is not spending at least $500,000 in AI API token credits has something seriously wrong with them.
Meta has an internal leaderboard rewarding employees who spend the most on AI LLM tokens.
Anthropic has overtaken OpenAI in revenue, hitting a $30 billion annualized run rate.
OpenAI projected $100 billion in advertising revenue by 2030.
OpenAI acquired TBPN for approximately $200 million, roughly one year after TBPN launched.
Google's Gemma 4 scored 89% on a math benchmark, up from Gemma 3's 20%, and 80% on coding, up from 29%.
Google opened its Vids video generation tool (powered by Veo 3) for free to all users, removing the previous requirement of a Google AI subscription.
Intel's stock rose approximately 30–31% over the previous thirty days amid new AI chip manufacturing partnerships.
Anthropic trains its models approximately four times more efficiently than OpenAI, requiring only a quarter of the compute for equivalent capability.
Anthropic's Opus model costs approximately two to three times more per token than Sonnet.
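Several of the claims above quote benchmark jumps, and it helps to separate the absolute gain (in points) from the relative gain (as a percentage of the old score). The sketch below is illustrative only: the function names are made up for this example, and the scores are simply the figures quoted in the episode, not independently verified results.

```python
def point_gain(old: float, new: float) -> float:
    """Absolute gain, in score points or percentage points."""
    return new - old


def relative_gain(old: float, new: float) -> float:
    """Gain expressed as a percentage of the old score."""
    return (new - old) / old * 100


# (baseline score, new score) as quoted in the episode
claims = {
    "Mythos vs GPT-5.4 (coding)": (58, 78),
    "Gemma 3 -> Gemma 4 (coding)": (29, 80),
    "Gemma 3 -> Gemma 4 (math)": (20, 89),
}

for name, (old, new) in claims.items():
    print(
        f"{name}: +{point_gain(old, new):.0f} points, "
        f"+{relative_gain(old, new):.0f}% relative"
    )
```

On these numbers, the 20-point gap between Mythos and GPT-5.4 works out to roughly a 34% relative gain, so a headline figure can sound quite different depending on which measure is quoted.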
Rik, Ben, and Luca break down the biggest AI story of the week: Anthropic's leaked Mythos model scores 78 on a coding benchmark where GPT-5.4 scores only 58, and has already found a 27-year-old bug in OpenBSD under Project Glasswing. The crew also covers Anthropic's sudden OAuth kill that evicted OpenClaw users overnight, Google quietly outpacing everyone with free Vids and Gemma 4, OpenAI's $200M TBPN media acquisition, and Intel's rising chip partnerships. Key takeaway: AI intelligence is heading toward commoditization, and the real battle now is ecosystem lock-in.
Rik, Ben, and Luca unpack Anthropic's leaked Mythos model (scoring 78 on a coding benchmark vs GPT-5.4's 58), its Project Glasswing cybersecurity rollout that found a 27-year-old OpenBSD bug, Anthropic's sudden OAuth kill that evicted OpenClaw users, the advisor/executor model strategy, Google's Gemma 4 and free Vids launch, OpenAI's TBPN acquisition, Intel's rising partnerships, and ZAI's GLM 5.1 open-source threat.
- OAuth
- Open Authorization — an industry-standard protocol that lets third-party apps access a service on a user's behalf without sharing passwords. Anthropic cancelled it, cutting off OpenClaw users from using their Claude subscription keys externally.
- OpenClaw
- An open-source agentic framework that let users run Claude-powered bots via Telegram, WhatsApp, and Slack using their personal Anthropic subscription keys.
- LLM
- Large Language Model — an AI system trained on vast text data to generate and reason with natural language (e.g., Claude, GPT-5.4, Gemma).
- Token
- The basic unit of text an AI model processes (roughly a word or sub-word). API pricing is based on token consumption; high token usage equals high cost.
- Claude Code
- Anthropic's developer-focused coding agent that runs locally or in the cloud, intended as a replacement workflow for users previously relying on OpenClaw.
- Agentic workflow
- An AI system where models autonomously plan and execute multi-step tasks, often delegating sub-tasks between models of different capability tiers.
- Transformer
- The dominant neural-network architecture underlying most modern LLMs. The hosts speculate Mythos may use a fundamentally different architecture to explain its benchmark leap.
- OpenBSD
- A highly security-focused open-source Unix-like operating system known for rigorous code auditing. Mythos found a 27-year-old vulnerability in it.
- B2B / B2C
- Business-to-Business / Business-to-Consumer — two sales models. The hosts note AI companies unusually price B2B usage higher than B2C, inverting typical market logic.
- Run rate
- An annualized revenue estimate extrapolated from a recent shorter period. Used here to describe Anthropic's reported $30 billion annualized run rate.
- Haiku
- Anthropic's smallest, fastest, cheapest Claude model — best suited for simple, repetitive tasks in a tiered model-routing strategy.
- Sonnet
- Anthropic's mid-tier Claude model, described in the episode as 'the workhorse' — handling the bulk of execution tasks at lower cost than Opus.
- Opus
- Anthropic's largest, most capable Claude model — used as an advisor/delegator in the new tiered strategy rather than for every API call.
- Gemma
- Google's family of lightweight, open-weight models designed to run on-device (phones, laptops) without internet connectivity.
- VPS (Virtual Private Server)
- A rented remote server that runs continuously — mentioned as an ideal host for OpenClaw bots compared to leaving a personal laptop always on.
- Distillation
- A model-compression technique where a smaller model is trained to mimic a larger one. Referenced when discussing how DeepSeek achieved its architectural breakthrough.
- Commoditized
- When a product or service becomes so standardized and widely available that competition shifts entirely to price. Used to describe the predicted future state of AI intelligence.
- Arbitrage
- Exploiting a price gap between two markets for profit. Here: users buying a $200/month Claude subscription and using it to power thousands of dollars of API-equivalent workloads via OpenClaw.
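The run rate, token, and arbitrage entries above all come down to simple arithmetic, which can be sketched as follows. This is a rough illustration, not anyone's actual pricing: the function names, the $15 per million tokens price, the 200 million token workload, and the $2.5 billion monthly revenue are all hypothetical inputs chosen for the example. Only the $200/month subscription and the $30 billion run rate come from the episode.

```python
def annualized_run_rate(period_revenue: float, period_months: float) -> float:
    """Extrapolate revenue from a short period to a full year."""
    return period_revenue * 12 / period_months


def token_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """API cost when pricing is quoted per million tokens."""
    return tokens / 1_000_000 * usd_per_million_tokens


def subscription_gap(subscription_usd: float, api_equivalent_usd: float) -> float:
    """How much more the same workload would cost at API prices."""
    return api_equivalent_usd - subscription_usd


# Hypothetical: one month of $2.5B revenue annualizes to a $30B run rate.
run_rate = annualized_run_rate(2.5e9, period_months=1)

# Hypothetical workload: 200M tokens in a month at $15 per million tokens.
api_bill = token_cost(200_000_000, usd_per_million_tokens=15.0)

# The $200/month subscription is the figure mentioned in the episode.
gap = subscription_gap(200.0, api_bill)

print(f"Run rate: ${run_rate:,.0f}")
print(f"API-equivalent bill: ${api_bill:,.0f}")
print(f"Gap vs subscription: ${gap:,.0f}")
```

Under these assumed inputs, the API-equivalent bill is far larger than the flat subscription fee, which is the price gap the arbitrage entry describes.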