Back to Home
swe-bench
Articles Tagged “Swe Bench”
4 articles found
AI
Mistral Medium 3.5 Lands as a 128B Open-Weight Coder With Cloud Vibe Remote Agents
Mistral AI shipped Medium 3.5 on April 29, 2026 — a 128B-parameter dense multimodal model with a 256K context window, modified-MIT open weights, and a new Vibe remote agent runtime that hits 77.6% on SWE-Bench Verified.
Dr. Nova Chen★May 3, 2026★6 min read
AI
Claude Opus 4.6 Claims #1 on LMSYS Arena Across All Three Leaderboards
Anthropic's Claude Opus 4.6 now holds the #1 spot on LMSYS Chatbot Arena in text, coding, and search — the first AI model to top all three simultaneously.
Dr. Nova Chen★Apr 11, 2026★5 min read
AI
GLM-5.1 Goes Open-Source and Hits #1 on SWE-Bench Pro — Beating Every Closed AI Model
Z.ai's GLM-5.1 is a 754B open-weight MoE model under the MIT license — and it just took #1 on SWE-Bench Pro, outscoring every major closed model.
Dr. Nova Chen★Apr 10, 2026★4 min read
AI
Cursor Composer 2: Frontier AI Coding at 86% Lower Cost
Cursor launches Composer 2, a code-only AI agent with a 200K token context window, 73.7 SWE-bench score, and 86% lower cost than its predecessor.
Dr. Nova Chen★Mar 26, 2026★4 min read




