u/AdVirtual2648

alibaba just dropped qwen 3.6-plus and the benchmarks are kind of ridiculous.

it's scoring 61.6 on terminal-bench and 57.1 on swe-bench verified. for context, that puts it ahead of claude 4.5 opus, kimi k2.5, and gemini 3 pro on most of the agentic coding tests.

the crazy part is it's less than half the size of kimi k2.5 and glm-5. way smaller model but matching or beating the big ones.

it also has a native 1M context window, which is huge if you're working on large codebases or big document tasks. and they built it specifically for agentic workflows, so it's not just "generate code and hope for the best"... it actually handles multi-step tasks.

it's already free on openrouter too. open-source versions are apparently coming soon.

link's in the comments.

Alibaba's Qwen3.6-Plus is beating Claude Opus in coding!!