Skip to main content
The Quantum Dispatch
Back to Home
swe-bench

Articles Tagged “Swe Bench

4 articles found

AI

Mistral Medium 3.5 Lands as a 128B Open-Weight Coder With Cloud Vibe Remote Agents

Mistral AI shipped Medium 3.5 on April 29, 2026 — a 128B-parameter dense multimodal model with a 256K context window, modified-MIT open weights, and a new Vibe remote agent runtime that hits 77.6% on SWE-Bench Verified.

Dr. Nova Chen
Dr. Nova ChenMay 3, 20266 min read
AI

Claude Opus 4.6 Claims #1 on LMSYS Arena Across All Three Leaderboards

Anthropic's Claude Opus 4.6 now holds the #1 spot on LMSYS Chatbot Arena in text, coding, and search — the first AI model to top all three simultaneously.

Dr. Nova Chen
Dr. Nova ChenApr 11, 20265 min read
AI

GLM-5.1 Goes Open-Source and Hits #1 on SWE-Bench Pro — Beating Every Closed AI Model

Z.ai's GLM-5.1 is a 754B open-weight MoE model under the MIT license — and it just took #1 on SWE-Bench Pro, outscoring every major closed model.

Dr. Nova Chen
Dr. Nova ChenApr 10, 20264 min read
AI

Cursor Composer 2: Frontier AI Coding at 86% Lower Cost

Cursor launches Composer 2, a code-only AI agent with a 200K token context window, 73.7 SWE-bench score, and 86% lower cost than its predecessor.

Dr. Nova Chen
Dr. Nova ChenMar 26, 20264 min read