OpenAI's Secret Weapon Has a Codename. It's Called 'Spud.' And It's Coming This Month.

OpenAI's next major model — internally codenamed 'Spud,' likely releasing as GPT-5.5 or GPT-6 — completed pretraining around March 24, 2026. Polymarket assigns 78% probability of release by April 30 and 95%+ by June 30. Sam Altman told employees it is a 'very strong model' that could 'really accelerate the economy.' It's being built as the backbone of a unified super-app. Here's every confirmed detail, how it compares to Claude Opus 4.6 and Gemini 3.1 Ultra, and what it means if you use ChatGPT every day.

By Aditya Kumar Jha · April 8, 2026 · 14 min read · AI Models

⚡ Quick Answer: GPT-5.5 (codename 'Spud') completed pretraining around March 24, 2026 and is currently in OpenAI's safety evaluation phase. Whether it will ship as GPT-5.5 or GPT-6 has not been confirmed — OpenAI has not committed to either naming publicly. Polymarket assigns 78% probability of release by April 30, and 95%+ probability by June 30. OpenAI CEO Sam Altman told employees the model is 'a very strong model that could really accelerate the economy.' OpenAI President Greg Brockman described it on the Big Technology podcast as representing 'two years of research' with a 'big model feel — it's not an incremental improvement.' It is being designed as the core engine of a unified ChatGPT super-app that collapses separate products — coding, research, agents, memory — into one platform. Competitive context: Gemini 3.1 Pro currently leads 13 of 16 major benchmarks per Google's own model card data. Claude Opus 4.6 leads on real-world coding evaluations. Spud needs to genuinely deliver to change that standing — and the benchmarks will tell us within days of release.

Why a Potato Codename Is the Most Important Thing in AI Right Now

Google uses desserts. Microsoft uses cities. OpenAI apparently uses root vegetables. The codename 'Spud' is a placeholder — confirmed by multiple sources including The Information, which first reported the pretraining completion on March 24, 2026 — that carries almost none of the strategic weight of what it represents. What it represents is this: for the first time since the original GPT-5 launch, OpenAI believes it has something that can genuinely re-establish benchmark dominance over both Anthropic and Google. That claim, if true, is significant. The past six months have been OpenAI's most competitive pressure period in the company's history. Gemini 3.1 Pro, released by Google in February 2026, leads 13 of 16 major AI benchmarks per Google's own model card. Claude Opus 4.6, released by Anthropic in the same month, consistently outperforms GPT-5.4 on real-world coding tasks in head-to-head developer testing. GPT-5.4, released March 5, 2026 — OpenAI's current flagship — is excellent and leads on knowledge work (83% GDPval) and computer use (75% OSWorld), but did not recapture the top position across all benchmarks that matter to the enterprise market. The internal pressure at OpenAI ahead of what appears to be an imminent IPO process created what sources describe as a 'Code Red' posture inside the company starting in December 2025. Spud is the response to that pressure.

Everything Confirmed About Spud: The Timeline, the Architecture, the Claims

Pretraining completed around March 24, 2026 — this is the single most important confirmed data point, first reported by The Information and subsequently confirmed by Sam Altman in a public statement to employees and Greg Brockman in a podcast interview. The time between pretraining completion and public release for recent OpenAI models has been 3-6 weeks for safety evaluation and red-teaming. If that window holds, Spud arrives between April 14 and May 5, 2026. Polymarket assigns 78% probability of release by April 30, and 95%+ by June 30.
The naming is unconfirmed: Whether Spud ships as GPT-5.5 or GPT-6 has not been decided publicly. OpenAI has said it will depend on how significant the performance leap is over GPT-5.4. If benchmarks show a generational improvement, expect GPT-6. If it's a strong but incremental advance, expect GPT-5.5. The codename 'Spud' was first reported by The Information on March 24, 2026 — consistent with how OpenAI names internal projects before formal product naming is settled.
The executive framing is extraordinary and worth citing precisely: Sam Altman told employees it is 'a very strong model that could really accelerate the economy.' OpenAI President Greg Brockman, in a public interview on the Big Technology podcast, described Spud as representing 'two years of research' with a 'big model feel — it's not an incremental improvement, it's a significant change in the way we think about model development.' These are public statements, not internal leaks, and they are the most aggressive language OpenAI has used about a pre-release model since GPT-5 itself.
It is being designed as the backbone of what OpenAI calls its 'unified super-app' — a product strategy that collapses ChatGPT, Codex, deep research, memory, and agent capabilities into a single platform. The rapid Codex CLI updates in early April 2026, including first-class plugins and multi-agent workflows, are widely interpreted as infrastructure preparation for Spud's deployment.
On the competitive benchmarks: GPT-5.4 scored 83% on GDPval, which tests real-world knowledge work across 44 professions. Gemini 3.1 Ultra leads 13 of 16 major benchmarks. Claude Opus 4.6 leads on real-world coding. Spud's internal framing suggests a target of recapturing the top position — but until benchmarks publish, that framing is self-reported.

What GPT-5.4 Already Does — And What Spud Will Need to Beat

To understand what Spud is trying to improve, you need to understand what GPT-5.4 already delivers — because the improvements are being built on a genuinely strong foundation. GPT-5.4, released March 5, 2026, introduced the largest context window OpenAI has ever shipped: 1 million tokens, competitive with Claude Opus 4.6's 1M context (in beta) and just below Gemini 3.1 Ultra's 2 million token window. It brought native computer-use capabilities to the API — agents can operate computers and execute complex workflows across applications without constant human intervention. It reduced factual errors in individual claims by 33% compared to GPT-5.2. And it introduced the ability to show an upfront plan while working, so users could redirect mid-response instead of waiting for a complete answer before correcting course.

The limitations that Spud reportedly addresses: GPT-5.4 did not achieve benchmark superiority over Gemini 3.1 Pro or coding dominance over Claude Opus 4.6. In extended agentic tasks — multi-hour autonomous workflows — GPT-5.4 shows reliability gaps that enterprise deployments have surfaced in Q1 2026. And the product is still perceived as fragmented: separate models for coding (GPT-5.3-Codex), reasoning (GPT-5.4 Thinking), and instant responses (GPT-5.3 Instant) create user confusion and context-switching overhead. Spud, as designed, would collapse those into a single model that handles all three use cases at a higher ceiling than any current version.

How Spud Compares to the Current Frontier: Claude Opus 4.6 and Gemini 3.1 Ultra

Model	Context Window	Benchmark Position (Mar 2026)	Key Strength	Key Limitation
GPT-5.4 (Current OpenAI Flagship)	1M tokens	83% GDPval; 3rd on major benchmarks	Agentic tool use, computer control, professional task execution across 44 job categories	Does not lead benchmarks; fragmented model lineup creates UX confusion; reliability gaps in long-horizon agent tasks
GPT-5.5 / GPT-6 'Spud' (Unreleased — April 2026)	Unknown — likely ≥1M tokens	Unverified; Altman's internal framing claims benchmark leadership	Unified super-app architecture; reportedly 'two years of research'; designed for deeper agentic integration	All claims are internal/self-reported until benchmarks publish — 'macro-economic event' framing could be IPO narrative management
Claude Opus 4.6 (Anthropic)	1M tokens (beta)	Leads on real-world coding evaluations in developer testing	Best-in-class coding and document analysis; strongest on reasoning tasks requiring sustained accuracy over long contexts	Not leading across all categories; enterprise pricing at $25/user/month for Claude Team vs. ChatGPT Team's $30/user/month
Gemini 3.1 Ultra (Google)	2M tokens — the largest available	Gemini 3.1 Pro leads 13 of 16 benchmarks per Google's model card (March 2026); Ultra adds extended context and stronger multimodal	Largest context window; native multimodal reasoning across text, image, audio, and video simultaneously without transcription	Slower for everyday tasks compared to Instant-tier models; enterprise deployment via Vertex AI adds complexity for smaller teams

The Super-App Strategy: Why Spud Is More Than a Model Upgrade

OpenAI's internal product strategy heading into 2026 is a bet on platform consolidation. The current ChatGPT experience — multiple models, separate entry points for research versus coding versus chat, a model picker that confusion-tested poorly with non-technical users — is being redesigned around a single experience that routes intelligently under the hood. The File Library launched in Q1 2026 moved ChatGPT toward a persistent workspace where documents, research, and files accumulate across sessions. Apple CarPlay integration brought ChatGPT into 100 million vehicles. Shopping and product comparison tools turned it toward commercial decision-making. Each of these features is more valuable with a more capable underlying model. Spud, as the planned backbone of this platform, is being positioned not just as 'better ChatGPT' but as the core intelligence layer of an ambient computing platform that follows you from car to desktop to phone. That is a fundamentally different ambition than any individual model release.

The IPO narrative cannot be ignored in this context. OpenAI, despite the removal of its nonprofit governance structure and billions in revenue, has not yet gone public. The internal urgency around Spud — 'Code Red,' 'macro-economic event,' 'two years of research' — arrives at a moment when the company has every incentive to release something that re-establishes technological leadership before any public market valuation event. That does not mean Spud is hype. It means that evaluating it honestly will require looking at what the benchmarks actually say, not at what Altman says about what the benchmarks will say. The distinction matters, and the AI community has been burned by pre-release framing before.

What Spud Means for You Depending on How You Use ChatGPT

If you use ChatGPT for everyday work tasks (writing, research, email, summaries): Spud will likely arrive as a transparent upgrade — you'll notice improved response quality, fewer factual errors, and better tool integration without changing anything you do. The unified architecture means less model-switching overhead. Expect GPT-5.4 Instant to feel faster and more accurate almost immediately post-release.
If you use ChatGPT for coding (Claude Opus 4.6 is currently superior): This is the head-to-head that matters most for developers. Spud is specifically targeting the coding gap. If Altman's internal claims hold in the benchmarks, this could be the first model since GPT-4 where OpenAI's flagship genuinely competes with Anthropic's best on code generation and debugging. If the benchmarks don't deliver on the framing, Claude remains the developer choice.
If you use Claude or Gemini and don't use ChatGPT: The competitive pressure from Spud will accelerate responses from Anthropic and Google. Claude Mythos — accidentally surfaced in internal Anthropic materials in early March — is described as 'a step change above the current Opus ceiling.' Google's Gemini 3.2 is in development. In the 30 days following Spud's release, expect announcements from all three companies. The AI user who benefits most from this moment is not the brand loyalist — it's the person who uses multiple models and picks the right one for each task.
If you're a business evaluating enterprise AI: The release timeline for Spud's enterprise API access will likely lag the consumer release by 2-4 weeks — consistent with the GPT-5.4 pattern. If your team is currently deciding between ChatGPT Team and Claude Team contracts, waiting until post-Spud benchmarks are public (likely by late April or May) will give you a significantly clearer decision framework than any data available today.
If you've never paid for an AI subscription: Spud, consistent with OpenAI's recent model rollout pattern, will likely be available to free ChatGPT users via the Thinking feature in the + menu within weeks of the Plus/Pro launch. OpenAI has repeatedly used free tier access to flagship models as a user acquisition tool. The gap between paid and free access will likely close to 2-4 weeks within the first month.

The Honest Uncertainty: What We Don't Know

The responsible framing here requires acknowledging what cannot be confirmed. Spud's actual capability improvements are not public. The naming — GPT-5.5 or GPT-6 — has not been decided. The 'two years of research' and 'big model feel' language comes from Greg Brockman's public podcast interview and Altman's internal all-hands — these are genuine signals, but every company has every incentive to describe its pre-release model optimistically, especially ahead of an IPO. The Polymarket 78% probability by April 30 is crowd-wisdom, not a guarantee. And 'benchmark leadership' as a goal does not translate directly to 'better for your specific use case' — GPT-5.4 already leads Claude and Gemini on some tasks despite trailing on aggregate benchmarks. The honest bottom line: Spud is almost certainly coming soon, is almost certainly a significant release, and carries extraordinary executive expectations that will be resolved by independent benchmarks within days of its public launch. Watch the benchmarks, not the framing.

📚 Read next: 'GPT-5.4 vs Claude Opus 4.6 vs DeepSeek V4 vs Gemini 3.1: The March 2026 Comparison' · 'SuperGrok vs ChatGPT Plus vs Claude Pro: Which $20-$30/Month AI Is Worth Paying For?' · 'Claude Sonnet 4.6 vs GPT-5.4: The Honest March 2026 Comparison'. Want to test ChatGPT and Claude head-to-head right now — before Spud changes everything? LumiChats gives you Claude Sonnet 4.6 and GPT-5.4 side by side, so you can run your own prompts and judge the difference yourself. The best way to understand what Spud will need to beat is to experience what it's competing with.

Insight

Why a Potato Codename Is the Most Important Thing in AI Right Now

Everything Confirmed About Spud: The Timeline, the Architecture, the Claims

Pretraining completed around March 24, 2026 — this is the single most important confirmed data point, first reported by The Information and subsequently confirmed by Sam Altman in a public statement to employees and Greg Brockman in a podcast interview. The time between pretraining completion and public release for recent OpenAI models has been 3-6 weeks for safety evaluation and red-teaming. If that window holds, Spud arrives between April 14 and May 5, 2026. Polymarket assigns 78% probability of release by April 30, and 95%+ by June 30.
The naming is unconfirmed: Whether Spud ships as GPT-5.5 or GPT-6 has not been decided publicly. OpenAI has said it will depend on how significant the performance leap is over GPT-5.4. If benchmarks show a generational improvement, expect GPT-6. If it's a strong but incremental advance, expect GPT-5.5. The codename 'Spud' was first reported by The Information on March 24, 2026 — consistent with how OpenAI names internal projects before formal product naming is settled.
The executive framing is extraordinary and worth citing precisely: Sam Altman told employees it is 'a very strong model that could really accelerate the economy.' OpenAI President Greg Brockman, in a public interview on the Big Technology podcast, described Spud as representing 'two years of research' with a 'big model feel — it's not an incremental improvement, it's a significant change in the way we think about model development.' These are public statements, not internal leaks, and they are the most aggressive language OpenAI has used about a pre-release model since GPT-5 itself.
It is being designed as the backbone of what OpenAI calls its 'unified super-app' — a product strategy that collapses ChatGPT, Codex, deep research, memory, and agent capabilities into a single platform. The rapid Codex CLI updates in early April 2026, including first-class plugins and multi-agent workflows, are widely interpreted as infrastructure preparation for Spud's deployment.
On the competitive benchmarks: GPT-5.4 scored 83% on GDPval, which tests real-world knowledge work across 44 professions. Gemini 3.1 Ultra leads 13 of 16 major benchmarks. Claude Opus 4.6 leads on real-world coding. Spud's internal framing suggests a target of recapturing the top position — but until benchmarks publish, that framing is self-reported.

Also on LumiChats

AI Models

GPT-5.5 Is Live. The People Who Just Switched Think They Got Smarter. Some of Them Are Now More Confidently Wrong.

8 min read→

AI Models

GPT-5.5 vs Claude: The Mistake Everyone Will Make This Week

11 min read→

AI Models

MiMo V2 Pro vs Claude Opus 4.7: Best Budget AI Model 2026

16 min read→

What GPT-5.4 Already Does — And What Spud Will Need to Beat

How Spud Compares to the Current Frontier: Claude Opus 4.6 and Gemini 3.1 Ultra

Model	Context Window	Benchmark Position (Mar 2026)	Key Strength	Key Limitation
GPT-5.4 (Current OpenAI Flagship)	1M tokens	83% GDPval; 3rd on major benchmarks	Agentic tool use, computer control, professional task execution across 44 job categories	Does not lead benchmarks; fragmented model lineup creates UX confusion; reliability gaps in long-horizon agent tasks
GPT-5.5 / GPT-6 'Spud' (Unreleased — April 2026)	Unknown — likely ≥1M tokens	Unverified; Altman's internal framing claims benchmark leadership	Unified super-app architecture; reportedly 'two years of research'; designed for deeper agentic integration	All claims are internal/self-reported until benchmarks publish — 'macro-economic event' framing could be IPO narrative management
Claude Opus 4.6 (Anthropic)	1M tokens (beta)	Leads on real-world coding evaluations in developer testing	Best-in-class coding and document analysis; strongest on reasoning tasks requiring sustained accuracy over long contexts	Not leading across all categories; enterprise pricing at $25/user/month for Claude Team vs. ChatGPT Team's $30/user/month
Gemini 3.1 Ultra (Google)	2M tokens — the largest available	Gemini 3.1 Pro leads 13 of 16 benchmarks per Google's model card (March 2026); Ultra adds extended context and stronger multimodal	Largest context window; native multimodal reasoning across text, image, audio, and video simultaneously without transcription	Slower for everyday tasks compared to Instant-tier models; enterprise deployment via Vertex AI adds complexity for smaller teams

The Super-App Strategy: Why Spud Is More Than a Model Upgrade

What Spud Means for You Depending on How You Use ChatGPT

If you use ChatGPT for everyday work tasks (writing, research, email, summaries): Spud will likely arrive as a transparent upgrade — you'll notice improved response quality, fewer factual errors, and better tool integration without changing anything you do. The unified architecture means less model-switching overhead. Expect GPT-5.4 Instant to feel faster and more accurate almost immediately post-release.
If you use ChatGPT for coding (Claude Opus 4.6 is currently superior): This is the head-to-head that matters most for developers. Spud is specifically targeting the coding gap. If Altman's internal claims hold in the benchmarks, this could be the first model since GPT-4 where OpenAI's flagship genuinely competes with Anthropic's best on code generation and debugging. If the benchmarks don't deliver on the framing, Claude remains the developer choice.
If you use Claude or Gemini and don't use ChatGPT: The competitive pressure from Spud will accelerate responses from Anthropic and Google. Claude Mythos — accidentally surfaced in internal Anthropic materials in early March — is described as 'a step change above the current Opus ceiling.' Google's Gemini 3.2 is in development. In the 30 days following Spud's release, expect announcements from all three companies. The AI user who benefits most from this moment is not the brand loyalist — it's the person who uses multiple models and picks the right one for each task.
If you're a business evaluating enterprise AI: The release timeline for Spud's enterprise API access will likely lag the consumer release by 2-4 weeks — consistent with the GPT-5.4 pattern. If your team is currently deciding between ChatGPT Team and Claude Team contracts, waiting until post-Spud benchmarks are public (likely by late April or May) will give you a significantly clearer decision framework than any data available today.
If you've never paid for an AI subscription: Spud, consistent with OpenAI's recent model rollout pattern, will likely be available to free ChatGPT users via the Thinking feature in the + menu within weeks of the Plus/Pro launch. OpenAI has repeatedly used free tier access to flagship models as a user acquisition tool. The gap between paid and free access will likely close to 2-4 weeks within the first month.

OpenAI's Secret Weapon Has a Codename. It's Called 'Spud.' And It's Coming This Month.

Why a Potato Codename Is the Most Important Thing in AI Right Now

Everything Confirmed About Spud: The Timeline, the Architecture, the Claims

What GPT-5.4 Already Does — And What Spud Will Need to Beat

How Spud Compares to the Current Frontier: Claude Opus 4.6 and Gemini 3.1 Ultra

The Super-App Strategy: Why Spud Is More Than a Model Upgrade

What Spud Means for You Depending on How You Use ChatGPT

The Honest Uncertainty: What We Don't Know

OpenAI's Secret Weapon Has a Codename. It's Called 'Spud.' And It's Coming This Month.

Why a Potato Codename Is the Most Important Thing in AI Right Now

Everything Confirmed About Spud: The Timeline, the Architecture, the Claims

What GPT-5.4 Already Does — And What Spud Will Need to Beat

How Spud Compares to the Current Frontier: Claude Opus 4.6 and Gemini 3.1 Ultra

The Super-App Strategy: Why Spud Is More Than a Model Upgrade

What Spud Means for You Depending on How You Use ChatGPT

The Honest Uncertainty: What We Don't Know

Claude, GPT-5.4, Gemini —
all in one place.

Keep reading

Why a Potato Codename Is the Most Important Thing in AI Right Now

Everything Confirmed About Spud: The Timeline, the Architecture, the Claims

What GPT-5.4 Already Does — And What Spud Will Need to Beat

How Spud Compares to the Current Frontier: Claude Opus 4.6 and Gemini 3.1 Ultra

The Super-App Strategy: Why Spud Is More Than a Model Upgrade

What Spud Means for You Depending on How You Use ChatGPT

The Honest Uncertainty: What We Don't Know

Claude, GPT-5.4, Gemini —all in one place.

Keep reading

Claude, GPT-5.4, Gemini —
all in one place.