r/aisecurity

Built my first 3 microlearning lectures - feedback welcome
▲ 9 r/aisecurity+3 crossposts

Built my first 3 microlearning lectures - feedback welcome

I built three first microlearning lectures about Claude Code Basics. My target audience would be technical people which want to learn the basics before they use it

Getting started with MCPs

https://app.scibly.com/student/worksheets/cmowyggwd00000ajonr4zzb4p/editor?v=cmox03jr600000al9lzxl3w0w

Claude Code permission modes

https://app.scibly.com/en/student/worksheets/cmowi3e0400000ajlfo5ohpe8/editor?v=cmowi3e0s00010ajl9jd51apr

Claude Code sub-agents

https://app.scibly.com/en/student/worksheets/cmowha9ps00000ai82nkqn2sv/editor?v=cmowha9ql00010ai8gue3q4x9

I would appreciate all feedback and critique to improve them in the future so that learners can effectively use them 

u/chefkoch-24 — 4 days ago

How should AI coding agents be contained before tool calls execute?

AI coding agents are starting to do more than suggest code: they can run shell commands, read local files, call tools/MCP servers, and modify config using the user’s permissions.

From a security point of view, I’m trying to think through where containment should happen. The risky part seems to be unsafe action before the human notices, not just bad advice.

For people working with coding agents:

What actions would you block by default?

Examples I’m thinking about:

  • destructive shell commands
  • access to secrets or SSH keys
  • modifying security-sensitive config
  • network calls to unknown destinations
  • installing packages or running downloaded scripts
  • MCP/tool calls with broad permissions

Also curious:

What false positives would make this unusable?

Is local pre-execution enforcement the right layer, or should this be handled by sandboxing, identity/permissions, audit logs, rollback/snapshots, or something else?

reddit.com
u/Gary_AIAGENTLENS — 3 days ago
▲ 8 r/aisecurity+1 crossposts

Hey everyone,

I’ve been working on a project to solve a major problem in AI security: Traditional SAST tools (Snyk, SonarQube, etc.) are blind to "Agentic Logic" bugs. They look for bad strings, but they don't understand how user data can hijack an LLM’s instructions.

I built a deterministic engine called RepoInspect that merges AST-aware taint tracking with autonomous AI agents. To test it, I ran it against LangChain, and it flagged 10 high-severity vulnerabilities that had been missed by standard tools.

The most common issue: Instruction Hijacking (LLM01) In several built-in chains (like the LLMMathChain), user input is interpolated directly into a prompt template that tells the model to generate executable Python code (for numexpr).

The Attack Vector: Because the user {input} isn't delimited (no XML tags, no isolation), an attacker can simply "ask" the model to generate malicious system commands instead of a math expression. Since the chain executes that code immediately, it’s a direct path to code execution via a prompt.

Key Findings in the Audit:

  • Prompt Injection: 10+ cases in agents (Self-Ask, JSON Chat) and chains.
  • Excessive Agency: Critical risks in utility wrappers exposing API keys.
  • Insecure Deserialization: Risks in how some vector store adapters handle metadata.

Why I’m sharing this: I’ve open-sourced the engine and the full forensic reports for LangChain, OpenAI, and Dify. I want to help developers move beyond "hope-based security" for their RAG and Agentic pipelines.

I'm curious to hear from other researchers—besides XML delimiters and system message isolation, what "hard" defenses are you using to protect your agents from hijacking?Adding github repo in the comments.

reddit.com
u/WinterSpecial7970 — 11 days ago