
OpenAI Releases GPT-5.5 Today: A Fully Retrained Agentic Model With State-of-the-Art Scores on Agentic and Coding Benchmarks
OpenAI's GPT-5.5 launches today — a fully retrained agentic model hitting 82.7% on Terminal-Bench and 84.9% on GDPval, rolling out now in ChatGPT and Codex.
OpenAI released GPT-5.5 today, April 23, 2026. It is the company's first fully retrained base model in this generation and, on the benchmarks reported below, its strongest yet. The model is rolling out now to ChatGPT Plus, Pro, Business, and Enterprise subscribers and is purpose-built for agentic work: multi-step tasks that require planning, tool use, code execution, and continuous self-checking without a human approving every action.
OpenAI calls it "a new class of intelligence for real work." Based on the numbers, that framing holds up.
The Benchmark Numbers
GPT-5.5 posts state-of-the-art results across the benchmarks that matter most for agentic and professional AI applications:
- Terminal-Bench 2.0: 82.7% — tests autonomous command-line workflows requiring multi-step planning, tool coordination, and iterative problem-solving
- GDPval: 84.9% — evaluates model performance across 44 categories of professional knowledge work including law, medicine, engineering, and finance
- OSWorld-Verified: 78.7% — measures whether a model can autonomously operate real desktop computer environments by navigating software through screenshots and keyboard/mouse actions
- SWE-Bench Pro: 58.6% — real-world GitHub issue resolution on production codebases, solved end-to-end in a single pass
These are not marginal improvements. The Terminal-Bench score is the highest any publicly available model has posted. The GDPval number puts GPT-5.5 clearly above the human expert baseline across knowledge work categories. The OSWorld score builds on GPT-5.4's breakthrough performance (which first crossed the human threshold) and extends it further.
What "Fully Retrained" Means
GPT-5.5 is not a fine-tune of GPT-5.4. OpenAI retrained the model from the ground up with agentic capability as the primary objective — not a secondary optimization pass after general training. The architecture, training data strategy, and reinforcement learning objectives were all designed around the question: how do you build a model that can do real professional work autonomously across extended sessions?
The answer, based on what GPT-5.5 does in practice, has three parts: better planning across long task horizons, stronger self-correction when intermediate steps produce unexpected results, and more coherent tool use across action sequences that combine web browsing, code execution, file management, and data analysis in a single session.
Codex: GPT-5.5 as a Software Engineering Agent
GPT-5.5 now powers Codex, OpenAI's autonomous software engineering agent. The upgrade is immediately practical:
- Implementation and refactoring — takes natural language feature descriptions and produces working code with appropriate test coverage
- Debugging and root cause analysis — identifies failure modes across complex codebases, not just individual functions
- Testing and validation — writes and runs tests, interprets results, and iterates until the suite passes
- Multi-file and multi-service reasoning — maintains coherent understanding across entire repositories rather than isolated files
Notably, GPT-5.5 delivers better results in Codex with fewer tokens than GPT-5.4 did. The model is more direct — it wastes less context on hedging and repetition, which means longer effective working sessions and lower costs per completed engineering task.
Availability and Pricing
In ChatGPT:
- GPT-5.5 Thinking — available to Plus, Pro, Business, and Enterprise users
- GPT-5.5 Pro — available to Pro, Business, and Enterprise users, designed for harder questions requiring maximum accuracy
API pricing:
- GPT-5.5: $5 per million input tokens / $30 per million output tokens
- GPT-5.5 Pro: $30 per million input tokens / $180 per million output tokens
The per-token price is roughly double that of GPT-5.4. For teams where GPT-5.5 completes work in fewer steps, and therefore fewer tokens, than GPT-5.4 required, the effective cost per task may stay comparable or even fall. For high-volume batch applications, teams should measure whether the token savings offset the higher rate before switching.
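The cost-per-task trade-off can be made concrete with a small calculation. The Python sketch below is illustrative only: the GPT-5.5 and GPT-5.5 Pro prices come from the list above, but the GPT-5.4 baseline is an assumption (half the GPT-5.5 price, derived from the "roughly double" comparison, not a published figure), and the token counts are hypothetical. The names `Pricing`, `task_cost`, and `break_even_token_fraction` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """API list price in USD per one million tokens."""
    input_per_million: float
    output_per_million: float


# Prices quoted in this article.
GPT_5_5 = Pricing(input_per_million=5.0, output_per_million=30.0)
GPT_5_5_PRO = Pricing(input_per_million=30.0, output_per_million=180.0)

# ASSUMPTION: the article only says GPT-5.5 costs about twice as much as
# GPT-5.4, so this baseline is half the GPT-5.5 price, not a published figure.
ASSUMED_GPT_5_4 = Pricing(input_per_million=2.5, output_per_million=15.0)


def task_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one task given its total token usage."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


def break_even_token_fraction(price_multiplier: float) -> float:
    """Fraction of the old token usage at which the new model costs the same.

    Assumes input and output tokens shrink by the same factor.
    """
    return 1.0 / price_multiplier


if __name__ == "__main__":
    # HYPOTHETICAL workload: token counts are illustrative, not measured.
    old = task_cost(ASSUMED_GPT_5_4, input_tokens=400_000, output_tokens=40_000)
    new = task_cost(GPT_5_5, input_tokens=200_000, output_tokens=20_000)
    print(f"assumed GPT-5.4 task: ${old:.2f}")
    print(f"GPT-5.5 task:         ${new:.2f}")
    print(f"break-even token fraction: {break_even_token_fraction(2.0):.2f}")
```

Under these assumptions, a task that needs half as many tokens on GPT-5.5 costs the same as it did on the assumed GPT-5.4 baseline. Any smaller reduction in token usage makes the task more expensive, and any larger one makes it cheaper.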
Why This Release Matters
GPT-5.4 was the model that first crossed the human performance threshold on desktop task automation. GPT-5.5 extends that lead and widens the domain, from computer use to coding to 44 categories of professional knowledge work. The release cadence (GPT-5.5 arrived seven weeks after GPT-5.4) signals that OpenAI's training infrastructure is now operating at a pace where frontier capability improvements arrive in less than a quarter.
For developers building agentic applications, for organizations piloting AI in knowledge work, and for engineers looking for a coding agent that handles full engineering tasks rather than isolated code snippets: GPT-5.5 is available in ChatGPT and Codex today.
Sources: OpenAI Newsroom (April 23, 2026), MarkTechPost (April 23, 2026), TechCrunch (April 23, 2026), 9to5Mac (April 23, 2026), Decrypt (April 23, 2026), Fortune (April 23, 2026)
