OpenAI released GPT-5.4 mini and GPT-5.4 nano on March 17, 2026 — two compact models designed to bring most of the power of its flagship GPT-5.4 to faster, cheaper workflows. GPT-5.4 mini is already live in ChatGPT for free users, while nano is available exclusively through the API. The headline claim is credible: mini scores 54.4% on SWE-Bench Pro, just 3.3 points behind the full GPT-5.4's 57.7%, at more than twice the speed of GPT-5 mini. For developers building agentic systems or coding tools, these models are worth serious attention — though the price hikes from the previous generation are steeper than most will expect.
What You Need to Know
GPT-5.4 mini is the right choice for most developers currently using GPT-5 mini. It runs over 2x faster, meaningfully improves on coding and computer use benchmarks, and is now available free in ChatGPT. GPT-5.4 nano is purpose-built for high-volume, repetitive tasks — classification, data extraction, and background steps inside multi-agent pipelines — at a very low cost.
- Switch to GPT-5.4 mini if you are building coding assistants, subagent workflows, or computer use applications where latency matters.
- Use GPT-5.4 nano if you need to process massive volumes of simple, structured tasks and cost is the primary constraint.
- Stay with full GPT-5.4 if your workflow depends on long-context retrieval across 64K–128K tokens, where mini's performance drops sharply.
- Skip nano entirely if your task involves computer use or UI automation — it scores below GPT-5 mini on that benchmark.
What's New
GPT-5.4 mini arrives as a substantial step up from GPT-5 mini across the board. According to OpenAI's release announcement, it improves on coding, reasoning, multimodal understanding, and tool use — while running more than 2x faster. That last point matters for product experience: in a coding assistant, the difference between a 3-second response and a 7-second one is the difference between something that feels usable and something that does not.
The benchmark improvements are real. On SWE-Bench Pro, mini hits 54.4% versus 45.7% for GPT-5 mini and 57.7% for the full GPT-5.4. On OSWorld-Verified, which tests computer use via screenshot interpretation, it reaches 72.1% versus 42.0% for GPT-5 mini. That computer-use jump — from 42% to 72% — is one of the more striking numbers in this release. Tool-calling also improved significantly: on the telecom-focused tau2-bench, mini hits 93.4% versus 74.1% for GPT-5 mini.
GPT-5.4 nano sits lower in the performance stack but still beats the previous-generation mini on coding: it reaches 52.4% on SWE-Bench Pro, ahead of GPT-5 mini's 45.7%. OpenAI recommends it for classification, data extraction, ranking, and coding subagents that handle simpler supporting tasks.
Both models share a 400,000-token context window and support text and image inputs, tool use, function calling, web search, file search, and structured outputs.
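Structured outputs are one of the more practically useful items on that capability list. As a sketch of how a request for schema-constrained output might be assembled, here is a minimal payload builder; note that the model identifier "gpt-5.4-mini" and the payload shape (which follows current Chat Completions conventions) are assumptions, not confirmed by OpenAI documentation.

```python
# Sketch: building a structured-output request payload for the new models.
# "gpt-5.4-mini" is an assumed model id; the payload shape mirrors the
# existing json_schema response_format convention.

def build_extraction_request(text: str) -> dict:
    """Assemble a request asking the model for JSON-schema-constrained output."""
    return {
        "model": "gpt-5.4-mini",  # assumed identifier, not confirmed
        "messages": [
            {"role": "system", "content": "Extract the fields defined by the schema."},
            {"role": "user", "content": text},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "ticket",
                "schema": {
                    "type": "object",
                    "properties": {
                        "category": {"type": "string"},
                        "urgency": {"type": "string"},
                    },
                    "required": ["category", "urgency"],
                },
            },
        },
    }

request = build_extraction_request("Printer on floor 3 is down again.")
```

The same payload would work against either mini or nano by swapping the model id, which is what makes nano attractive for the extraction and classification workloads OpenAI recommends it for.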
How They Compare
| Model | SWE-Bench Pro | OSWorld-Verified | Speed vs GPT-5 mini | Input price (per 1M tokens) |
|---|---|---|---|---|
| GPT-5.4 | 57.7% | 75.0% | — | Higher |
| GPT-5.4 mini | 54.4% | 72.1% | 2x faster | $0.75 |
| GPT-5.4 nano | 52.4% | 39.0% | Faster | $0.20 |
| GPT-5 mini (prev.) | 45.7% | 42.0% | Baseline | $0.25 |
Benchmark data per OpenAI's release announcement, March 17, 2026. Pricing as of March 2026.
The performance ladder is clean: GPT-5.4 leads, mini follows closely, nano sits third but still beats the previous generation mini on most tasks. The one exception is computer use. On OSWorld-Verified, nano scores 39.0%, below the previous-generation GPT-5 mini at 42.0%. Nano was not designed for UI automation, and this benchmark confirms it.
Against direct competitors, the picture is mixed. Gemini 3 Flash scores 78% on SWE-bench Verified at $0.50 per million input tokens, and Claude Haiku 4.5 scores 73.3% on SWE-bench Verified at $1.00 per million input tokens. Comparing these numbers directly is tricky: SWE-bench Verified and SWE-bench Pro are different tests, and Pro is generally considered harder, so the scores are not on a common scale. The honest read: GPT-5.4 mini is genuinely competitive in this tier, but it is not the clear winner on every dimension.
What Changed From the Previous Generation
The performance gains are meaningful. The price increases are not small. GPT-5 mini previously cost $0.25 per million input tokens and $2.00 per million output tokens; GPT-5 nano was $0.05 input and $0.40 output. GPT-5.4 mini now costs $0.75 input and $4.50 output, and GPT-5.4 nano sits at $0.20 input and $1.25 output per million tokens. That is a 3x price increase on mini inputs (2.25x on outputs) and a 4x increase on nano inputs (roughly 3.1x on outputs).
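The arithmetic is easy to check. Using the per-million-token prices quoted above:

```python
# Generation-over-generation price ratios, per 1M tokens, using the
# figures quoted in the article (March 2026 pricing).
PRICES = {
    "gpt-5-mini":   {"input": 0.25, "output": 2.00},
    "gpt-5-nano":   {"input": 0.05, "output": 0.40},
    "gpt-5.4-mini": {"input": 0.75, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.20, "output": 1.25},
}

def increase(new: str, old: str, kind: str) -> float:
    """Ratio of the new price to the old price for one token type."""
    return PRICES[new][kind] / PRICES[old][kind]

mini_in  = increase("gpt-5.4-mini", "gpt-5-mini", "input")    # 3.0x
mini_out = increase("gpt-5.4-mini", "gpt-5-mini", "output")   # 2.25x
nano_in  = increase("gpt-5.4-nano", "gpt-5-nano", "input")    # 4.0x
nano_out = increase("gpt-5.4-nano", "gpt-5-nano", "output")   # ~3.1x
```

For a pipeline with a known input/output token mix, the same table can be used to estimate the actual bill change, which will land somewhere between the input and output ratios.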
OpenAI can point to real benchmark improvements to justify those prices — the coding and tool-use numbers are substantially better. But developers running high-volume pipelines who were counting on mini-tier cost economics will need to recalculate. The "cheap but capable" positioning is not quite as cheap as it was.
One genuinely surprising finding in this release: despite nano being smaller and cheaper than mini, it scores only modestly lower on SWE-Bench Pro (52.4% vs 54.4%). The two models are closer together on coding than their positioning and price gap suggest. For pure coding tasks without computer use, nano delivers most of mini's value at roughly a quarter of the input cost. That is a meaningful tradeoff that OpenAI does not emphasize in its own messaging.
The Subagent Architecture — What It Actually Means
The most forward-looking element of this launch is not the individual benchmark scores — it is the workflow pattern OpenAI is explicitly building toward. A large, highly intelligent model like GPT-5.4 handles the complex planning and final judgment, then delegates narrower, repetitive tasks — like searching a codebase or reviewing a massive document — to the faster, cheaper GPT-5.4 mini or nano.
Notion's AI engineering lead noted that GPT-5.4 mini often matches or beats more expensive models at handling complex formatting, using a fraction of the computing power. This is the practical proof point for the subagent argument: for bounded, well-defined tasks, a smaller model with lower latency and cost is not just "good enough"; it is sometimes better.
In Codex, for example, a larger model like GPT-5.4 handles planning, coordination, and final judgment, while delegating to GPT-5.4 mini subagents that handle narrower subtasks in parallel. Mini uses just 30% of GPT-5.4's Codex quota, meaning developers get roughly 3x the throughput for supporting work.
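The planner/subagent split described above can be sketched in a few lines. This is an illustrative pattern, not OpenAI's implementation: the `call_model` helper is a stub standing in for a real API call, and the model ids are the assumed ones used throughout this article.

```python
# Sketch of the planner/subagent pattern: cheap subagents fan out over
# narrow subtasks in parallel; the large model does the final judgment.
# call_model is a stub; in a real system it would be an API call.
from concurrent.futures import ThreadPoolExecutor

def call_model(model: str, task: str) -> str:
    # Stub standing in for a real model invocation.
    return f"{model} handled: {task}"

def run_with_subagents(goal: str, subtasks: list[str]) -> str:
    # Fast, cheap mini subagents process the narrow subtasks concurrently.
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(
            lambda t: call_model("gpt-5.4-mini", t), subtasks))
    # The large model plans and judges over the collected findings.
    return call_model("gpt-5.4", f"{goal}; evidence: {'; '.join(findings)}")

result = run_with_subagents(
    "fix failing test",
    ["search repo for test file", "summarize stack trace"],
)
```

The economics follow directly: if each subagent call costs 30% of a full-model call, fanning supporting work out to mini buys roughly 3x the throughput for the same quota.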
Who This Actually Affects
Switch to GPT-5.4 mini if you are currently on GPT-5 mini for coding, agentic workflows, or computer use. The performance gains on every one of those tasks are substantial, and the speed improvement alone is worth the upgrade for latency-sensitive applications.
Use GPT-5.4 nano if you run classification, entity extraction, data ranking, or routing logic at scale. The benchmark parity with mini on coding tasks is a bonus — the main value is cost and throughput for high-volume, repetitive work.
Stay with full GPT-5.4 if your workflow needs deep long-context retrieval. On OpenAI MRCR v2 with 8 needles at 64K to 128K context, GPT-5.4 mini lands at 47.7% versus the full model's 86.0%. That is a real and significant gap. Mini is not a drop-in replacement when your task requires tracking many details across very long documents.
Skip nano for anything involving computer use or UI-based automation. Its 39.0% OSWorld score is the one benchmark where it underperforms even the previous generation.
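The four recommendations above reduce to a simple routing rule. The task labels and the 64K long-context cutoff below follow the figures quoted in this article; treat the thresholds as illustrative starting points, not a tuned policy.

```python
# Model-selection rule encoding the article's guidance. Task labels and
# the 64K context cutoff are illustrative assumptions drawn from the
# benchmark figures discussed above.

def choose_model(task: str, context_tokens: int = 0) -> str:
    """Pick a model tier for a task, per the article's recommendations."""
    if task in {"computer_use", "ui_automation"}:
        return "gpt-5.4-mini"   # nano underperforms on OSWorld-Verified
    if context_tokens >= 64_000:
        return "gpt-5.4"        # mini drops sharply on long-context MRCR
    if task in {"classification", "extraction", "ranking", "routing"}:
        return "gpt-5.4-nano"   # high-volume, cost-dominated work
    return "gpt-5.4-mini"       # default for coding and agentic workflows
```

A routing layer like this is also a cheap place to log which tier handles which traffic, so the price changes discussed earlier can be tracked per task type.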
Free ChatGPT users get immediate access to GPT-5.4 mini through the "Thinking" feature — that is a meaningful capability upgrade with no change to their plan.
What to Watch Next
The competitive pressure in the small-model tier is intensifying. Gemini 3.1 Flash-Lite, which launched weeks ago, offers a 1M-token context window versus GPT-5.4 mini's 400K, and scores 86.9% on GPQA Diamond. Context window size will become a differentiator in agentic workflows as pipelines grow more complex. Watch for OpenAI to respond on that front, and for third-party evaluators like Artificial Analysis to publish independent latency and quality comparisons across this new generation of small models.
Conclusion
GPT-5.4 mini is a legitimate upgrade from GPT-5 mini — faster, substantially better on coding and computer use, and now available to free ChatGPT users. GPT-5.4 nano is the right tool for high-volume, simple tasks where cost dominates the decision. Neither model replaces full GPT-5.4 for deep long-context work. The price increases from the previous generation are real and worth planning for. If you are building agentic or coding workflows today, benchmark GPT-5.4 mini against your current setup. The performance-per-latency improvement is likely to justify the switch.



