OpenAI Releases GPT-5.5 Today: A Fully Retrained Agentic Model That Tops Every Coding Benchmark

OpenAI's GPT-5.5 launches today — a fully retrained agentic model hitting 82.7% on Terminal-Bench and 84.9% on GDPval, rolling out now in ChatGPT and Codex.

Dr. Nova Chen | Apr 24, 2026

OpenAI Ships GPT-5.5 — The Fully Retrained Agentic Model Is Here

OpenAI released GPT-5.5 on April 23, 2026. It is the company's first fully retrained base model in this generation and, on the benchmarks reported below, its strongest yet. Rolling out now to ChatGPT Plus, Pro, Business, and Enterprise subscribers, GPT-5.5 is purpose-built for agentic work: multi-step tasks that require planning, tool use, code execution, and continuous self-checking without waiting for a human to babysit every action.

OpenAI calls it "a new class of intelligence for real work." Based on the numbers, that framing holds up.

The Benchmark Numbers

GPT-5.5 posts state-of-the-art results across the benchmarks that matter most for agentic and professional AI applications:

- Terminal-Bench 2.0: 82.7% — tests autonomous command-line workflows requiring multi-step planning, tool coordination, and iterative problem-solving

- GDPval: 84.9% — evaluates model performance across 44 categories of professional knowledge work including law, medicine, engineering, and finance

- OSWorld-Verified: 78.7% — measures whether a model can autonomously operate real desktop computer environments by navigating software through screenshots and keyboard/mouse actions

- SWE-Bench Pro: 58.6% — real-world GitHub issue resolution on production codebases, solved end-to-end in a single pass

These are not marginal improvements. The Terminal-Bench score is the highest any publicly available model has posted. The GDPval number puts GPT-5.5 clearly above the human expert baseline across knowledge work categories. The OSWorld score builds on GPT-5.4's breakthrough performance (which first crossed the human threshold) and extends it further.

What "Fully Retrained" Means

GPT-5.5 is not a fine-tune of GPT-5.4. OpenAI retrained the model from the ground up with agentic capability as the primary objective — not a secondary optimization pass after general training. The architecture, training data strategy, and reinforcement learning objectives were all designed around the question: how do you build a model that can do real professional work autonomously across extended sessions?

The answer, based on what GPT-5.5 does in practice, is better planning across long task horizons, stronger self-correction when intermediate steps produce unexpected results, and more coherent tool use across sequences of actions that span web browsing, code execution, file management, and data analysis.

Codex: GPT-5.5 as a Software Engineering Agent

GPT-5.5 now powers Codex, OpenAI's autonomous software engineering agent. The upgrade is immediately practical:

- Implementation and refactoring — takes natural language feature descriptions and produces working code with appropriate test coverage

- Debugging and root cause analysis — identifies failure modes across complex codebases, not just individual functions

- Testing and validation — writes and runs tests, interprets results, and iterates until the suite passes

- Multi-file and multi-service reasoning — maintains coherent understanding across entire repositories rather than isolated files

Notably, GPT-5.5 delivers better results in Codex with fewer tokens than GPT-5.4 did. The model is more direct — it wastes less context on hedging and repetition, which means longer effective working sessions and lower costs per completed engineering task.
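The "writes and runs tests, interprets results, and iterates until the suite passes" behavior listed above can be pictured as a bounded retry loop. The sketch below is only an illustration of that pattern, not a description of how Codex is implemented: `SuiteResult`, `iterate_until_green`, `run_tests`, and `propose_fix` are all hypothetical names, and the "codebase" is a toy counter of remaining bugs.

```python
from typing import Callable, NamedTuple, Optional


class SuiteResult(NamedTuple):
    """Outcome of one (hypothetical) test-suite run."""
    passed: bool
    log: str


def iterate_until_green(
    run_tests: Callable[[], SuiteResult],
    propose_fix: Callable[[str], None],
    max_attempts: int = 5,
) -> Optional[int]:
    """Run the suite, apply a fix after each failure, and stop when it passes.

    Returns the number of suite runs it took to pass, or None if the
    attempt budget ran out first.
    """
    for attempt in range(1, max_attempts + 1):
        result = run_tests()
        if result.passed:
            return attempt
        # Only spend effort on a fix if there is a run left to verify it.
        if attempt < max_attempts:
            propose_fix(result.log)
    return None


# Toy stand-ins for a real test runner and a real code-editing step.
state = {"bugs": 2}


def run_tests() -> SuiteResult:
    remaining = state["bugs"]
    return SuiteResult(passed=remaining == 0, log=f"{remaining} failing test(s)")


def propose_fix(log: str) -> None:
    # Pretend each fix resolves exactly one failing test.
    state["bugs"] -= 1


runs_needed = iterate_until_green(run_tests, propose_fix)
print(f"suite passed after {runs_needed} run(s)")
```

The attempt cap matters in practice: without it, an agent that cannot find a fix would loop, and spend tokens, indefinitely.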

Availability and Pricing

In ChatGPT:

- GPT-5.5 Thinking — available to Plus, Pro, Business, and Enterprise users

- GPT-5.5 Pro — available to Pro, Business, and Enterprise users, designed for harder questions requiring maximum accuracy

API pricing:

- GPT-5.5: $5 per million input tokens / $30 per million output tokens

- GPT-5.5 Pro: $30 per million input tokens / $180 per million output tokens

The per-token price is roughly double that of GPT-5.4. For teams whose workloads GPT-5.5 completes in fewer steps, and therefore fewer tokens, the effective cost per task may stay comparable or even fall. Teams running high-volume batch applications should evaluate the increase against their own token counts.
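To make the cost trade-off concrete, here is a minimal sketch of the arithmetic. The per-million-token prices are the ones quoted above; the token counts for the example task (200,000 input, 20,000 output) are made-up illustrative figures, and `Pricing`, `task_cost`, and `break_even_token_ratio` are hypothetical helper names rather than part of any OpenAI SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_million: float   # USD per 1M input tokens
    output_per_million: float  # USD per 1M output tokens


# API prices as quoted in this article.
GPT_5_5 = Pricing(input_per_million=5.0, output_per_million=30.0)
GPT_5_5_PRO = Pricing(input_per_million=30.0, output_per_million=180.0)


def task_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single task with the given token usage."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


def break_even_token_ratio(price_multiplier: float) -> float:
    """Fraction of the old token usage at which per-task cost is unchanged.

    If the new model costs `price_multiplier` times as much per token, it
    must finish the task in at most 1 / price_multiplier of the tokens.
    """
    return 1.0 / price_multiplier


# Hypothetical agentic task: 200k input tokens, 20k output tokens.
standard = task_cost(GPT_5_5, 200_000, 20_000)
pro = task_cost(GPT_5_5_PRO, 200_000, 20_000)
print(f"GPT-5.5:     ${standard:.2f} per task")
print(f"GPT-5.5 Pro: ${pro:.2f} per task")
print(f"Break-even token ratio at 2x price: {break_even_token_ratio(2.0):.0%}")
```

Under these assumptions, a 2x per-token price means GPT-5.5 has to complete the same task in about half the tokens GPT-5.4 used before the per-task cost breaks even.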

Why This Release Matters

GPT-5.4 was the model that first crossed the human performance threshold on desktop task automation. GPT-5.5 extends that lead and widens the domain, from computer use to coding to 44 categories of professional knowledge work. The release cadence (GPT-5.5 arrived seven weeks after GPT-5.4) signals that OpenAI's training infrastructure is now operating at a pace where frontier capability improvements arrive on a quarterly cadence or faster.

For developers building agentic applications, for organizations piloting AI in knowledge work, and for engineers looking for a coding agent that handles full engineering tasks rather than isolated code snippets: GPT-5.5 is available in ChatGPT and Codex today.

Sources: OpenAI Newsroom (April 23, 2026), MarkTechPost (April 23, 2026), TechCrunch (April 23, 2026), 9to5Mac (April 23, 2026), Decrypt (April 23, 2026), Fortune (April 23, 2026)