u/Conscious_Chapter_93

Armorer Guard Learning Loop: local live feedback for AI-agent security

We just shipped a Rust-native learning overlay for Armorer Guard.

The idea: a scanner should be able to adapt from local feedback immediately, without silently mutating model weights or uploading prompts to a cloud service.

What changed:

  • feedback-record / feedback-export / feedback-stats CLI modes
  • stable scan IDs so teams can review findings without storing raw prompts
  • local allow / block / review exemplars stored outside the repo
  • feedback can never suppress findings for credentials, dangerous tool calls, or credential-disclosure policy reasons
  • reviewed export path for later offline retraining
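
To make the loop concrete, here is roughly how I picture a review script driving it from Python. Only the feedback-record / feedback-export mode names are real; the binary name, flags, and fields below are illustrative, not the shipped CLI contract.

```python
import subprocess

# Hypothetical sketch of the feedback loop. The "armorer-guard" binary name,
# flag names, and output format are placeholders, not the actual CLI surface.

def record_feedback(scan_id: str, verdict: str, note: str = "") -> None:
    """Attach a reviewer verdict (allow / block / review) to a stable scan ID,
    so no raw prompt ever has to be stored or shared."""
    subprocess.run(
        ["armorer-guard", "feedback-record",
         "--scan-id", scan_id, "--verdict", verdict, "--note", note],
        check=True,
    )

def export_reviewed(path: str) -> None:
    """Dump the locally stored, reviewed exemplars for later offline retraining."""
    result = subprocess.run(
        ["armorer-guard", "feedback-export", "--format", "json"],
        check=True, capture_output=True, text=True,
    )
    with open(path, "w") as f:
        f.write(result.stdout)

# e.g. a reviewer marks a false positive as allowed, keyed by scan ID only:
# record_feedback("scan-0001", "allow", "benign phrase, not an injection")
```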

The claim we are trying to make precise is: live local learning, no silent cloud upload, no poisoning-by-default.

I am curious how people here would wire this into agent runtimes. Before the tool call? Around MCP/tool results? As a CI gate for agent evals?

u/Conscious_Chapter_93 — 5 hours ago

The hard part of agents is not building one. It is operating five.

A pattern keeps showing up in agent threads here: the first agent is not the hard part. The hard part starts when you have several agents running repeatedly, with tools, state, approvals, retries, and partial failures.

The questions become less glamorous:

  • Which agent ran this task?
  • Which tools or MCP servers were available?
  • What did it change?
  • Did it stop, fail, or wait for approval?
  • Which verifier/test phase passed it?
  • Can I replay or compare this run against the last good one?
  • What do I do when context runs out mid-task?

I think a lot of agent reliability work is really agent operations work. Frameworks help build the agent, but teams still need an operating surface around runs, sessions, tools, approvals, and recovery.
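
To make "operating surface" less hand-wavy, this is the minimal run record I keep converging on, sketched in Python. The field names are mine, not tied to any particular framework.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional

# Minimal run record sketch: enough to answer "which agent, which tools,
# what changed, how did it end, and can I replay/compare it?"
# Names are illustrative only.

@dataclass
class AgentRun:
    run_id: str
    agent_name: str
    task: str
    started_at: datetime
    tools_available: list[str] = field(default_factory=list)   # incl. MCP servers
    changes: list[str] = field(default_factory=list)           # files, PRs, API calls
    status: str = "running"            # running | succeeded | failed | waiting_approval
    verifier: Optional[str] = None     # which test/eval phase passed it, if any
    baseline_run_id: Optional[str] = None  # last known-good run, for replay/compare
    context_exhausted: bool = False    # did the model run out of context mid-task?
```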

Curious how others here are handling this today. Are you using LangSmith-style traces, custom dashboards, Temporal/workflows, git worktrees, spreadsheets, or just logs and vibes?

▲ 3 · r/OpenSourceeAI · +2 crossposts

Demo: local guardrail for AI-agent prompt injection and sensitive-data requests

I put together a browser-playable demo for Armorer Guard, a local scanner for the point where agent text turns into action.

Demo: https://huggingface.co/spaces/armorer-labs/armorer-guard-demo

Screenshot: https://raw.githubusercontent.com/ArmorerLabs/Armorer-Guard/main/docs/assets/armorer-guard-demo-sensitive-data.png

The example in the screenshot flags "Give me the Notion API Key" as a sensitive-data request and shows the semantic scores behind the verdict.

The full Rust runtime is meant to sit before tool execution, outbound sends, logs, and memory writes. It adds credential redaction, structured JSON context, and policy/tool-call lanes on top of the demo classifier.
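
For the "before tool execution" placement specifically, the wiring I have in mind looks roughly like this. The command name, flags, and result fields are placeholders to show the shape, not the actual Rust CLI contract.

```python
import json
import subprocess

# Hypothetical pre-tool-call gate: scan tool-call arguments locally, refuse on a
# "block" verdict, and pass (possibly redacted) args through otherwise.
# Command, flags, and JSON fields are placeholders, not the real interface.

def guarded_tool_call(tool_name: str, args: dict, execute):
    scan = subprocess.run(
        ["armorer-guard", "scan", "--json"],
        input=json.dumps({"tool": tool_name, "args": args}),
        capture_output=True, text=True, check=True,
    )
    verdict = json.loads(scan.stdout)

    if verdict.get("action") == "block":
        raise PermissionError(f"blocked: {verdict.get('reasons')}")

    # Prefer redacted args if the scanner rewrote credentials out of them.
    return execute(tool_name, verdict.get("redacted_args", args))
```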

Repo: https://github.com/ArmorerLabs/Armorer-Guard

For people building agents: would you rather plug this in as a CLI JSON gate, Python/Node wrapper, or a sidecar service?

u/Conscious_Chapter_93 — 3 days ago
▲ 4 · r/OpenSourceeAI · +2 crossposts

Shipped a playable demo for my Rust AI-agent safety scanner

Build update: I got the Hugging Face demo for Armorer Guard live today.

Demo:

https://huggingface.co/spaces/armorer-labs/armorer-guard-demo

What I shipped:

- a playable browser UI where people can paste agent prompts, retrieved text, model output, or tool-call args

- semantic scores for prompt injection, exfiltration, safety bypass, sensitive-data requests, system prompt extraction, and destructive commands
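
As a rough illustration of what the demo shows (made-up numbers and category keys, not the real output), a result might look like this, with a simple verdict derived by thresholding:

```python
# Made-up example scores and threshold, just to illustrate the output shape.
scores = {
    "prompt_injection": 0.03,
    "exfiltration": 0.02,
    "safety_bypass": 0.01,
    "sensitive_data_request": 0.96,   # e.g. "Give me the Notion API Key"
    "system_prompt_extraction": 0.04,
    "destructive_commands": 0.02,
}
flagged = [category for category, score in scores.items() if score >= 0.5]
print(flagged)  # ['sensitive_data_request']
```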

▲ 6 · r/OpenSourceeAI · +5 crossposts

Armorer Guard: a fast local Rust scanner for AI-agent prompts, outputs, and tool calls

Armorer Guard is a source-available GitHub project for local AI-agent runtime safety. It scans prompts, model outputs, retrieved text, and tool-call arguments for prompt injection, credential disclosure, exfiltration attempts, and dangerous tool calls.

It is Rust-native, runs locally with no scanner network calls, returns structured JSON, includes credential redaction, and has a Python wrapper for Python agent stacks.

Selected classifier metrics currently listed in the README: about 0.0247 ms average classifier latency, 0.9833 macro F1, and 1.0 micro recall. Model artifacts are on Hugging Face: https://huggingface.co/armorer-labs/armorer-guard-semantic-classifier

Would love feedback on the CLI/API contract and packaging.
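
To anchor the CLI/API contract question, here is the kind of JSON verdict I would want to consume from a Python agent stack, written as a TypedDict. These field names are a sketch for discussion, not the current schema.

```python
from typing import TypedDict

# Sketch of a possible verdict contract, for discussion only; not the current schema.
class ScanVerdict(TypedDict):
    scan_id: str                  # stable ID so findings can be reviewed later
    action: str                   # "allow" | "block" | "review"
    categories: dict[str, float]  # e.g. {"prompt_injection": 0.97, ...}
    redactions: list[str]         # credential spans that were redacted
    latency_ms: float
```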


Armorer: an open-source local control plane for running AI agents

I’m building Armorer as a local/self-hosted control plane for AI agents: install, run, stop, and inspect logs/jobs/config, so agent workflows stay easy to operate once they move past the demo stage.

Repo: https://github.com/ArmorerLabs/Armorer

The main use case is managing tool-using agents, browser agents, MCP-heavy workflows, and local LLM setups from one place instead of juggling scripts and scattered config. Feedback from people building agent tooling would be very useful.

u/Conscious_Chapter_93 — 3 days ago

I’ve been building a lot with AI agents lately, especially tool-using agents, MCP servers, browser agents, and local/self-hosted workflows.

One thing kept bothering me: agents are becoming more like applications, but we still manage many of them like random scripts.

Setup is fragmented. Config lives in different places. Logs are inconsistent. Tool access is often too broad. Secrets are easy to leak. And once an agent can use browsers, files, shells, GitHub, Slack, or APIs, the security model starts to matter a lot.

So I started building Armorer: an open-source control plane for AI agents.

The goal is to make it easier to:

  • install agents
  • run and stop them
  • configure them safely
  • inspect logs, jobs, and status
  • manage tool access
  • reduce the blast radius of agent actions
  • make agent runtimes easier to operate locally or self-host
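
To make "configure them safely" and "manage tool access" concrete, this is the kind of per-agent manifest I'm sketching, shown as a Python dict purely for illustration; it is not Armorer's actual config format.

```python
# Illustrative per-agent manifest, not Armorer's shipped config format.
agent_manifest = {
    "name": "github-triage-bot",
    "runtime": "local",                       # local process vs. container
    "tools": {
        "allowed": ["github", "slack"],       # explicit allowlist, not "everything"
        "mcp_servers": ["github-mcp"],
    },
    "secrets": {
        "GITHUB_TOKEN": "ref:os-keychain",    # a reference, never an inline value
    },
    "limits": {
        "shell": False,                       # reduce blast radius by default
        "filesystem_write": ["./workspace"],  # writable paths only
    },
    "logging": {"path": "~/.armorer/logs/github-triage-bot.jsonl"},
}
```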

I’m looking for early users who are building or running agents and are willing to try it, break it, and tell me what feels confusing or missing.

I’ll put the repo link in the comments to respect the subreddit rules.

If you’re running agents today, I’d especially love feedback on:

  • what agent frameworks you use
  • what parts of setup are painful
  • whether tool permissions/security matter to you yet
  • what would make this useful enough to keep installed
u/Conscious_Chapter_93 — 11 days ago