u/omarous

▲ 24 r/LargeLanguageModels+9 crossposts

These are the top open and closed models: Opus 4.7, GPT-5.5 Pro, DeepSeek V4, GLM-5.1 and Gemini 3.1 Pro. Both groups showed similar performance in my testing.

Open models: The only open models whose quality matches the top closed models are DeepSeek and GLM.

Cost:

GPT-5.5 Pro: So expensive it makes no sense (cost is around $2).
Gemini/Opus: $0.2/$0.1. Opus is cheaper because it consumed fewer tokens.
DeepSeek/GLM: $0.019/$0.021. Roughly 10 times cheaper than Gemini and 5 times cheaper than Opus.
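
For anyone who wants to check the ratios, here is a quick sketch of the arithmetic. It assumes the figures above are the cost of one test run per model, and the `costs` dictionary and `cost_ratio` helper are just names made up for this illustration.

```python
# Approximate costs in USD as reported above (assumed to be per test run).
costs = {
    "GPT-5.5 Pro": 2.0,
    "Gemini 3.1 Pro": 0.2,
    "Opus 4.7": 0.1,
    "DeepSeek V4": 0.019,
    "GLM-5.1": 0.021,
}


def cost_ratio(expensive: str, cheap: str) -> float:
    """Return how many times more `expensive` costs than `cheap`."""
    return costs[expensive] / costs[cheap]


# Compare each closed model against each open model.
for closed_model in ("Gemini 3.1 Pro", "Opus 4.7"):
    for open_model in ("DeepSeek V4", "GLM-5.1"):
        ratio = cost_ratio(closed_model, open_model)
        print(f"{closed_model} vs {open_model}: {ratio:.1f}x")
```

With these numbers, Gemini comes out at about 10x the cost of the open models and Opus at about 5x, which is where the "10 and 5 times cheaper" estimate comes from.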

codeinput.com
u/omarous — 14 days ago
▲ 6 r/LLM+1 crossposts

Some of the larger models (like Llama) weren't available on OpenRouter, so I had to work with what was there.

  • Best small model: Gemma 4 26B. For its size, I think it had the best output. You can see it even picked blue eyes for the husky.
  • Definitely useless: Llama 4 Maverick, gpt-oss-120b. gpt-oss gets a point for at least painting something that resembles a dog.
  • Mid-tier: MiniMax M2.7, Qwen3.6 Max, Kimi K2.6. Lots of detail, but the dog isn't well-positioned.
  • Top-tier: GLM 5.1, DeepSeek V4 Pro. Pretty darn close to usable.
codeinput.com
u/omarous — 14 days ago

For some reason the other MiniMax models would not generate a response, though that might be an issue with the inference provider.

codeinput.com
u/omarous — 15 days ago