
I built a coding agent in Go that puts a secret-scanning firewall between your code and the LLM (works with Ollama too)
Every AI coding agent I've used treats security as a permission prompt: "allow this bash command? y/N". That's fine for catching an `rm -rf /` mid-session. It does nothing about the prompt that just got built from your repo and is about to ship a .env value, a private key, or a customer ID to api.anthropic.com.
So I wrote gnoma, a coding agent in Go where security isn't a permission UI — it's a layer the rest of the code can't bypass.
Architecture, top to bottom:
- Outbound firewall on the provider boundary. Every provider (Anthropic, OpenAI, Gemini, Mistral, Ollama, llama.cpp) is wrapped in a `SafeProvider`. There is one code path from gnoma's internals to any LLM endpoint, and it goes through a scanner that runs regex patterns (AWS keys, GCP service accounts, Stripe, GitHub PATs, private-key PEMs, etc.) plus a Shannon-entropy detector on the outgoing message and system prompt. Hits are redacted, blocked, or warned per config, before the network call.
- Tool-result redaction on the way back. A `git diff` that surfaces a private key, a `cat .env`, a curl response: all scanned before the LLM ever sees them. Same scanner, opposite direction.
- TOFU plugin pinning. Plugins (which can ship hooks and MCP servers, i.e. arbitrary binaries running as you) get their `plugin.json` SHA-256-pinned on first load. Manifest changes on disk = plugin refuses to load. SSH host-key discipline, applied to LLM tooling. No opt-out.
- TOCTOU-safe path canonicalization. The classic sandbox escape ("leaf doesn't exist, so `EvalSymlinks` errors, so the caller skips the symlink check, so the write proceeds through a symlinked parent and lands outside the workspace") gets defeated by walking back to an existing ancestor, resolving it, then rejoining the tail.
- Permission modes with deny rules that are bypass-immune. Six modes (`default`, `accept_edits`, `bypass`, `plan`, `deny`, `auto`). Deny rules fire before any mode check, including `bypass`. Compound commands like `echo ok && rm -rf /` are split with a proper POSIX shell parser, so an `rm -rf` deny isn't smuggled past in a `&&` chain.
- Incognito. `Ctrl+X` toggles a mode where the session isn't persisted, the router doesn't learn from the turn, and there's no on-disk trace of the conversation.
What it actually is, beyond the security layer:
A provider-agnostic coding agent. Multi-armed bandit router across whatever providers you have configured — cloud or local. A tiny SLM (≤1B, on Ollama / llama.cpp / llamafile) classifies every prompt and handles the trivial ones itself so the heavy model only runs on real work. MCP servers, skills, hooks, plugins. One static Go binary, CGO_ENABLED=0, no Node/Python runtime.
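For readers who haven't met bandit routing before, a minimal epsilon-greedy sketch follows. It is one of the simplest bandit strategies and is used here only as an illustration: nothing above says which algorithm gnoma uses or how it scores a turn, so the `Router` type, the reward scale, and the epsilon value are all assumptions.

```go
package main

import (
	"fmt"
	"math/rand"
)

// arm tracks the running mean reward observed for one provider.
type arm struct {
	name  string
	pulls int
	mean  float64
}

// Router is a hypothetical epsilon-greedy selector over providers.
type Router struct {
	arms    []arm
	epsilon float64 // probability of exploring a random provider
	rng     *rand.Rand
}

func NewRouter(names []string, epsilon float64, seed int64) *Router {
	r := &Router{epsilon: epsilon, rng: rand.New(rand.NewSource(seed))}
	for _, n := range names {
		r.arms = append(r.arms, arm{name: n})
	}
	return r
}

// Pick returns the index of the provider to try next: a random one with
// probability epsilon, otherwise the one with the best mean so far
// (ties go to the lowest index).
func (r *Router) Pick() int {
	if r.rng.Float64() < r.epsilon {
		return r.rng.Intn(len(r.arms))
	}
	best := 0
	for i := range r.arms {
		if r.arms[i].mean > r.arms[best].mean {
			best = i
		}
	}
	return best
}

// Update folds an observed reward into the chosen arm's running mean.
func (r *Router) Update(i int, reward float64) {
	a := &r.arms[i]
	a.pulls++
	a.mean += (reward - a.mean) / float64(a.pulls)
}

func main() {
	r := NewRouter([]string{"cloud", "local"}, 0.1, 42)
	for turn := 0; turn < 5; turn++ {
		i := r.Pick()
		// Assumed reward: 1 if the turn went well, 0 otherwise.
		// Here "local" always succeeds, purely for demonstration.
		reward := 0.0
		if r.arms[i].name == "local" {
			reward = 1.0
		}
		r.Update(i, reward)
		fmt.Println(turn, r.arms[i].name, reward)
	}
}
```

An incognito turn, in this framing, would simply skip the `Update` call so the router learns nothing from it.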
What it doesn't do:
- Not a full network sandbox. The scanner is on the LLM provider boundary; if a tool you allowed shells out to `curl`, that's still on you.
- The plugin pin covers `plugin.json`, not the binaries it references. Treat the plugin directory itself as a filesystem-permissions trust boundary.
- No published benchmark numbers. The value prop is the architecture, not a score.
Install:
# pre-built binary (linux / macos / windows × amd64 / arm64)
# grab the archive for your platform:
https://github.com/VikingOwl91/gnoma/releases
# go install
go install somegit.dev/Owlibou/gnoma/cmd/gnoma@latest
# docker (multi-arch)
docker pull ghcr.io/vikingowl91/gnoma:latest
docker run --rm -it -v "$PWD:/workspace" ghcr.io/vikingowl91/gnoma:latest
# from source
git clone https://github.com/VikingOwl91/gnoma && cd gnoma && make build
Point at any OpenAI-compatible endpoint:
gnoma
gnoma --provider ollama --model qwen2.5-coder:3b
gnoma --provider llamacpp # uses whatever your llama-server reports
Apache-2.0. Source: https://github.com/VikingOwl91/gnoma
Happy to go deep on the firewall design, the TOFU threat model, or the path canonicalization edge cases.