AI ModelsLumiChats Team·April 8, 2026·14 min read

OpenAI's Secret Weapon Has a Codename. It's Called 'Spud.' And It's Coming This Month.

OpenAI's next major model — internally codenamed 'Spud,' likely releasing as GPT-5.5 or GPT-6 — completed pretraining around March 24, 2026. Polymarket assigns 78% probability of release by April 30 and 95%+ by June 30. Sam Altman told employees it is a 'very strong model' that could 'really accelerate the economy.' It's being built as the backbone of a unified super-app. Here's every confirmed detail, how it compares to Claude Opus 4.6 and Gemini 3.1 Ultra, and what it means if you use ChatGPT every day.

1.4K students read·Share:
⚡ Quick Answer: GPT-5.5 (codename 'Spud') completed pretraining around March 24, 2026 and is currently in OpenAI's safety evaluation phase. Whether it will ship as GPT-5.5 or GPT-6 has not been confirmed — OpenAI has not committed to either naming publicly. Polymarket assigns 78% probability of release by April 30, and 95%+ probability by June 30. OpenAI CEO Sam Altman told employees the model is 'a very strong model that could really accelerate the economy.' OpenAI President Greg Brockman described it on the Big Technology podcast as representing 'two years of research' with a 'big model feel — it's not an incremental improvement.' It is being designed as the core engine of a unified ChatGPT super-app that collapses separate products — coding, research, agents, memory — into one platform. Competitive context: Gemini 3.1 Pro currently leads 13 of 16 major benchmarks per Google's own model card data. Claude Opus 4.6 leads on real-world coding evaluations. Spud needs to genuinely deliver to change that standing — and the benchmarks will tell us within days of release.

Why a Potato Codename Is the Most Important Thing in AI Right Now

Google uses desserts. Microsoft uses cities. OpenAI apparently uses root vegetables. The codename 'Spud' is a placeholder — confirmed by multiple sources including The Information, which first reported the pretraining completion on March 24, 2026 — that carries almost none of the strategic weight of what it represents. What it represents is this: for the first time since the original GPT-5 launch, OpenAI believes it has something that can genuinely re-establish benchmark dominance over both Anthropic and Google. That claim, if true, is significant. The past six months have been OpenAI's most competitive pressure period in the company's history. Gemini 3.1 Pro, released by Google in February 2026, leads 13 of 16 major AI benchmarks per Google's own model card. Claude Opus 4.6, released by Anthropic in the same month, consistently outperforms GPT-5.4 on real-world coding tasks in head-to-head developer testing. GPT-5.4, released March 5, 2026 — OpenAI's current flagship — is excellent and leads on knowledge work (83% GDPval) and computer use (75% OSWorld), but did not recapture the top position across all benchmarks that matter to the enterprise market. The internal pressure at OpenAI ahead of what appears to be an imminent IPO process created what sources describe as a 'Code Red' posture inside the company starting in December 2025. Spud is the response to that pressure.

Everything Confirmed About Spud: The Timeline, the Architecture, the Claims

  • Pretraining completed around March 24, 2026 — this is the single most important confirmed data point, first reported by The Information and subsequently confirmed by Sam Altman in a public statement to employees and Greg Brockman in a podcast interview. The time between pretraining completion and public release for recent OpenAI models has been 3-6 weeks for safety evaluation and red-teaming. If that window holds, Spud arrives between April 14 and May 5, 2026. Polymarket assigns 78% probability of release by April 30, and 95%+ by June 30.
  • The naming is unconfirmed: Whether Spud ships as GPT-5.5 or GPT-6 has not been decided publicly. OpenAI has said it will depend on how significant the performance leap is over GPT-5.4. If benchmarks show a generational improvement, expect GPT-6. If it's a strong but incremental advance, expect GPT-5.5. The codename 'Spud' was first reported by The Information on March 24, 2026 — consistent with how OpenAI names internal projects before formal product naming is settled.
  • The executive framing is extraordinary and worth citing precisely: Sam Altman told employees it is 'a very strong model that could really accelerate the economy.' OpenAI President Greg Brockman, in a public interview on the Big Technology podcast, described Spud as representing 'two years of research' with a 'big model feel — it's not an incremental improvement, it's a significant change in the way we think about model development.' These are public statements, not internal leaks, and they are the most aggressive language OpenAI has used about a pre-release model since GPT-5 itself.
  • It is being designed as the backbone of what OpenAI calls its 'unified super-app' — a product strategy that collapses ChatGPT, Codex, deep research, memory, and agent capabilities into a single platform. The rapid Codex CLI updates in early April 2026, including first-class plugins and multi-agent workflows, are widely interpreted as infrastructure preparation for Spud's deployment.
  • On the competitive benchmarks: GPT-5.4 scored 83% on GDPval, which tests real-world knowledge work across 44 professions. Gemini 3.1 Ultra leads 13 of 16 major benchmarks. Claude Opus 4.6 leads on real-world coding. Spud's internal framing suggests a target of recapturing the top position — but until benchmarks publish, that framing is self-reported.

What GPT-5.4 Already Does — And What Spud Will Need to Beat

To understand what Spud is trying to improve, you need to understand what GPT-5.4 already delivers — because the improvements are being built on a genuinely strong foundation. GPT-5.4, released March 5, 2026, introduced the largest context window OpenAI has ever shipped: 1 million tokens, competitive with Claude Opus 4.6's 1M context (in beta) and just below Gemini 3.1 Ultra's 2 million token window. It brought native computer-use capabilities to the API — agents can operate computers and execute complex workflows across applications without constant human intervention. It reduced factual errors in individual claims by 33% compared to GPT-5.2. And it introduced the ability to show an upfront plan while working, so users could redirect mid-response instead of waiting for a complete answer before correcting course.

The limitations that Spud reportedly addresses: GPT-5.4 did not achieve benchmark superiority over Gemini 3.1 Pro or coding dominance over Claude Opus 4.6. In extended agentic tasks — multi-hour autonomous workflows — GPT-5.4 shows reliability gaps that enterprise deployments have surfaced in Q1 2026. And the product is still perceived as fragmented: separate models for coding (GPT-5.3-Codex), reasoning (GPT-5.4 Thinking), and instant responses (GPT-5.3 Instant) create user confusion and context-switching overhead. Spud, as designed, would collapse those into a single model that handles all three use cases at a higher ceiling than any current version.

How Spud Compares to the Current Frontier: Claude Opus 4.6 and Gemini 3.1 Ultra

ModelContext WindowBenchmark Position (Mar 2026)Key StrengthKey Limitation
GPT-5.4 (Current OpenAI Flagship)1M tokens83% GDPval; 3rd on major benchmarksAgentic tool use, computer control, professional task execution across 44 job categoriesDoes not lead benchmarks; fragmented model lineup creates UX confusion; reliability gaps in long-horizon agent tasks
GPT-5.5 / GPT-6 'Spud' (Unreleased — April 2026)Unknown — likely ≥1M tokensUnverified; Altman's internal framing claims benchmark leadershipUnified super-app architecture; reportedly 'two years of research'; designed for deeper agentic integrationAll claims are internal/self-reported until benchmarks publish — 'macro-economic event' framing could be IPO narrative management
Claude Opus 4.6 (Anthropic)1M tokens (beta)Leads on real-world coding evaluations in developer testingBest-in-class coding and document analysis; strongest on reasoning tasks requiring sustained accuracy over long contextsNot leading across all categories; enterprise pricing at $25/user/month for Claude Team vs. ChatGPT Team's $30/user/month
Gemini 3.1 Ultra (Google)2M tokens — the largest availableGemini 3.1 Pro leads 13 of 16 benchmarks per Google's model card (March 2026); Ultra adds extended context and stronger multimodalLargest context window; native multimodal reasoning across text, image, audio, and video simultaneously without transcriptionSlower for everyday tasks compared to Instant-tier models; enterprise deployment via Vertex AI adds complexity for smaller teams

The Super-App Strategy: Why Spud Is More Than a Model Upgrade

OpenAI's internal product strategy heading into 2026 is a bet on platform consolidation. The current ChatGPT experience — multiple models, separate entry points for research versus coding versus chat, a model picker that confusion-tested poorly with non-technical users — is being redesigned around a single experience that routes intelligently under the hood. The File Library launched in Q1 2026 moved ChatGPT toward a persistent workspace where documents, research, and files accumulate across sessions. Apple CarPlay integration brought ChatGPT into 100 million vehicles. Shopping and product comparison tools turned it toward commercial decision-making. Each of these features is more valuable with a more capable underlying model. Spud, as the planned backbone of this platform, is being positioned not just as 'better ChatGPT' but as the core intelligence layer of an ambient computing platform that follows you from car to desktop to phone. That is a fundamentally different ambition than any individual model release.

The IPO narrative cannot be ignored in this context. OpenAI, despite the removal of its nonprofit governance structure and billions in revenue, has not yet gone public. The internal urgency around Spud — 'Code Red,' 'macro-economic event,' 'two years of research' — arrives at a moment when the company has every incentive to release something that re-establishes technological leadership before any public market valuation event. That does not mean Spud is hype. It means that evaluating it honestly will require looking at what the benchmarks actually say, not at what Altman says about what the benchmarks will say. The distinction matters, and the AI community has been burned by pre-release framing before.

What Spud Means for You Depending on How You Use ChatGPT

  • If you use ChatGPT for everyday work tasks (writing, research, email, summaries): Spud will likely arrive as a transparent upgrade — you'll notice improved response quality, fewer factual errors, and better tool integration without changing anything you do. The unified architecture means less model-switching overhead. Expect GPT-5.4 Instant to feel faster and more accurate almost immediately post-release.
  • If you use ChatGPT for coding (Claude Opus 4.6 is currently superior): This is the head-to-head that matters most for developers. Spud is specifically targeting the coding gap. If Altman's internal claims hold in the benchmarks, this could be the first model since GPT-4 where OpenAI's flagship genuinely competes with Anthropic's best on code generation and debugging. If the benchmarks don't deliver on the framing, Claude remains the developer choice.
  • If you use Claude or Gemini and don't use ChatGPT: The competitive pressure from Spud will accelerate responses from Anthropic and Google. Claude Mythos — accidentally surfaced in internal Anthropic materials in early March — is described as 'a step change above the current Opus ceiling.' Google's Gemini 3.2 is in development. In the 30 days following Spud's release, expect announcements from all three companies. The AI user who benefits most from this moment is not the brand loyalist — it's the person who uses multiple models and picks the right one for each task.
  • If you're a business evaluating enterprise AI: The release timeline for Spud's enterprise API access will likely lag the consumer release by 2-4 weeks — consistent with the GPT-5.4 pattern. If your team is currently deciding between ChatGPT Team and Claude Team contracts, waiting until post-Spud benchmarks are public (likely by late April or May) will give you a significantly clearer decision framework than any data available today.
  • If you've never paid for an AI subscription: Spud, consistent with OpenAI's recent model rollout pattern, will likely be available to free ChatGPT users via the Thinking feature in the + menu within weeks of the Plus/Pro launch. OpenAI has repeatedly used free tier access to flagship models as a user acquisition tool. The gap between paid and free access will likely close to 2-4 weeks within the first month.

The Honest Uncertainty: What We Don't Know

The responsible framing here requires acknowledging what cannot be confirmed. Spud's actual capability improvements are not public. The naming — GPT-5.5 or GPT-6 — has not been decided. The 'two years of research' and 'big model feel' language comes from Greg Brockman's public podcast interview and Altman's internal all-hands — these are genuine signals, but every company has every incentive to describe its pre-release model optimistically, especially ahead of an IPO. The Polymarket 78% probability by April 30 is crowd-wisdom, not a guarantee. And 'benchmark leadership' as a goal does not translate directly to 'better for your specific use case' — GPT-5.4 already leads Claude and Gemini on some tasks despite trailing on aggregate benchmarks. The honest bottom line: Spud is almost certainly coming soon, is almost certainly a significant release, and carries extraordinary executive expectations that will be resolved by independent benchmarks within days of its public launch. Watch the benchmarks, not the framing.

📚 Read Next

Want to test ChatGPT and Claude head-to-head right now — before Spud changes everything? LumiChats gives you Claude Sonnet 4.6 and GPT-5.4 side by side, so you can run your own prompts and judge the difference yourself. The best way to understand what Spud will need to beat is to experience what it's competing with.

Found this useful? Share it with a friend 👇

Ready to study smarter?

Try LumiChats for 82¢/day

40+ AI models including Claude, GPT-5.4, and Gemini. Smart Study Mode with source-cited answers. Pay only on days you use it.

Get Started — 82¢/day

Keep reading

More guides for AI-powered students.