Your agent passes benchmarks. Then a tool returns bad JSON and everything falls apart. I built an open source harness to test that locally. Ollama supported!
Posted by u/Busy_Weather_7064 in r/AI_developers, 1 day ago (▲ 19, +7 crossposts)