
Benchmarking Coding Agents: What’s actually working best with open-source models right now?
Artificial Analysis Coding Agent Index (Source: Artificial Analysis)
As a huge fan of Artificial Analysis, I was checking its model indexes and benchmarks when I noticed a new section on coding agents that compares performance across different agent and model combinations.
As a heavy user of coding agents with open-source/open-weight models, I find this critical information to have. This leads me to a question: in your experience, which are the best coding agents to use with open-source models?
Currently, I use the Claude Code + Kimi 2.6 combo, but I’d like to know your thoughts :)