u/DavidG117

Kimi 2.6 thinks for a very long time.

Does anyone else notice this? Kimi 2.6 sometimes writes a literal dissertation in its thinking step, whereas other models get to editing sooner.

u/DavidG117 — 4 days ago

Simple ollama local model edit-prediction setup.

Setting up an ollama local model for FIM (fill-in-the-middle) predictions. The model you use needs to be FIM-compatible, like qwen2.5-coder:3b:

"edit_predictions": {
    "mode": "eager",
    "provider": "ollama",
    "ollama": {
      "api_url": "http://localhost:11434",
      "model": "qwen2.5-coder:3b",
      "prompt_format": "infer",
      "max_output_tokens": 64,
    },
  },
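For anyone curious what the editor's FIM requests roughly look like on the wire, here's a small sketch that builds a request body for ollama's /api/generate endpoint. The "suffix" field is what makes it fill-in-the-middle: the model completes the gap between the prompt and the suffix. The function name and the sample prefix/suffix are just illustrative, not anything Zed actually calls.

```python
import json

def build_fim_request(model: str, prefix: str, suffix: str,
                      max_tokens: int = 64) -> str:
    """Build a JSON body for ollama's /api/generate endpoint.

    FIM-capable models (e.g. qwen2.5-coder) accept a "suffix" field;
    the model generates the text that belongs between prefix and suffix.
    """
    return json.dumps({
        "model": model,
        "prompt": prefix,    # text before the cursor
        "suffix": suffix,    # text after the cursor
        "stream": False,
        "options": {"num_predict": max_tokens},  # cap completion length
    })

# Example: ask the model to fill in the body of a function
body = build_fim_request("qwen2.5-coder:3b",
                         "def add(a, b):\n    return ",
                         "\n")
```

You'd POST that body to http://localhost:11434/api/generate; "max_output_tokens": 64 in the Zed config corresponds to capping the completion length the same way "num_predict" does here.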
u/DavidG117 — 5 days ago