
I’ve been using AI coding agents a lot recently, and one thing keeps bothering me:
once an agent has access to tools, the real risk isn’t the prompt, it’s the actions the agent actually takes.
For example, a coding agent can potentially:
- read .env or local credentials
- run shell commands
- call external APIs
- push code
- modify infrastructure files
- interact with kubectl / terraform / cloud CLIs
For local experiments that may be fine, but in a work/devops environment it feels risky to rely on nothing more than a “please don’t do dangerous things” line in the prompt.
I’m curious how others are handling this.
Are you doing any of these?
- running agents only in containers (rough sketch after this list)
- blocking network access
- using read-only workspaces
- approval-gating risky commands
- restricting which files can be read
- using separate credentials for agents
- logging/auditing agent actions
- avoiding shell access completely
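
For what it’s worth, here’s roughly the container setup I’ve been trying, as a sketch rather than a hardened recipe: a small Python wrapper that launches the agent in a locked-down Docker container. The image name and the `agent` CLI are placeholders I made up; the docker flags are the actual point.

```python
import os
import subprocess

def run_agent_sandboxed(repo_path: str, task: str) -> int:
    """Launch a (hypothetical) agent CLI inside a locked-down container."""
    cmd = [
        "docker", "run", "--rm",
        "--network", "none",    # no outbound network at all
        "--read-only",          # read-only root filesystem
        "--tmpfs", "/tmp",      # the only writable scratch space
        "--cap-drop", "ALL",    # drop every Linux capability
        "--user", f"{os.getuid()}:{os.getgid()}",  # non-root, run as the host user
        "-v", f"{os.path.abspath(repo_path)}:/workspace:ro",  # repo mounted read-only
        "-w", "/workspace",
        "my-agent-image",                 # placeholder image
        "agent", "run", "--task", task,   # placeholder agent command
    ]
    return subprocess.run(cmd).returncode

if __name__ == "__main__":
    run_agent_sandboxed(".", "summarize the failing tests")
```

The obvious trade-off: `--network none` also blocks package installs and any API the agent legitimately needs, so I’m curious whether people allowlist specific hosts instead.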
I’ve been experimenting with the idea of an execution boundary: a layer that decides whether each agent action should be allowed, denied, or escalated for human approval before it runs (rough sketch below).
https://github.com/safe-agentic-world/nomos
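
To make the “execution boundary” idea concrete, here’s a minimal sketch of the general shape (this is not nomos’s actual API, and the policy rules are toy examples): every command the agent wants to run goes through a single check that returns allow, deny, or ask-a-human.

```python
import shlex
import subprocess
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVE = "approve"  # pause and ask a human

# Toy rules; a real boundary would match on parsed commands, paths, creds, etc.
DENY_SUBSTRINGS = ["rm -rf", ".env", "terraform apply", "kubectl delete"]
APPROVE_PREFIXES = ["git push", "curl", "kubectl", "terraform"]

def decide(command: str) -> Decision:
    """Classify a shell command the agent wants to run."""
    if any(s in command for s in DENY_SUBSTRINGS):
        return Decision.DENY
    if any(command.startswith(p) for p in APPROVE_PREFIXES):
        return Decision.APPROVE
    return Decision.ALLOW

def execute(command: str) -> None:
    """The only path through which the agent is allowed to run commands."""
    decision = decide(command)
    if decision is Decision.DENY:
        print(f"[boundary] denied: {command}")
        return
    if decision is Decision.APPROVE:
        if input(f"[boundary] approve `{command}`? [y/N] ").strip().lower() != "y":
            print("[boundary] not approved, skipping")
            return
    subprocess.run(shlex.split(command))

if __name__ == "__main__":
    execute("ls -la")           # allowed
    execute("git push origin")  # requires approval
    execute("cat .env")         # denied
```

The hard part isn’t the policy, it’s enforcement: this only helps if the boundary is the agent’s *only* path to the shell (a wrapper process, a proxy, or middleware in the tool server), otherwise the agent can just route around it.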
How are you making AI agents safe enough to use around real repos or infrastructure?