u/Still_Piglet9217

EU AI Act enforcement starts in 75 days - affects any team building AI agents for European clients

If you're building AI agents or SaaS products used by European companies (or processing EU resident data), the EU AI Act applies to you regardless of where your company is based.

Full enforcement for high-risk systems starts August 2, 2026. High-risk means: credit scoring, recruitment filtering, healthcare triage, education assessment, critical infrastructure.

The practical requirements:

* Automatic decision logging (not optional)
* 6-month minimum log retention
* Technical documentation of your detection pipeline
* Human oversight architecture
* Accuracy and bias testing documentation

Fines: up to 35M euros or 7% of global turnover.

I broke down what the regulation requires, what auditors check, and realistic steps before the deadline in the link in comments.

Worth reading if your team is building anything AI-related for the European market.

reddit.com
u/Still_Piglet9217 — 2 days ago

EU AI Act enforcement starts in 75 days - affects any team building AI agents for European clients

If you're building AI agents or SaaS products used by European companies (or processing EU resident data), the EU AI Act applies to you regardless of where your company is based.

Full enforcement for high-risk systems starts August 2, 2026. High-risk means: credit scoring, recruitment filtering, healthcare triage, education assessment, critical infrastructure.

The practical requirements:

  • Automatic decision logging (not optional)
  • 6-month minimum log retention
  • Technical documentation of your detection pipeline
  • Human oversight architecture
  • Accuracy and bias testing documentation

Fines: up to 35M euros or 7% of global turnover.

I broke down what the regulation requires, what auditors check, and realistic steps before the deadline. In link below

Worth reading if your team is building anything AI-related for the European market.

reddit.com
u/Still_Piglet9217 — 2 days ago

The AI agent supply chain is getting attacked hard - here's what happened in the last 4 months

Been tracking AI agent security incidents since January. The supply chain attacks are escalating fast and I don't think enough teams are paying attention.

Quick rundown of what happened:

LiteLLM PyPI compromise (March 2026) Someone backdoored litellm 1.82.7/1.82.8 on PyPI. It was live for only 40 minutes but got 40K+ downloads. The payload harvested credentials, moved laterally through K8s clusters, and set up a persistent backdoor. The crazy part the attackers got in by compromising Trivy, the security scanner in LiteLLM's CI pipeline. Security tools becoming the attack vector.

ClawHavoc (January 2026) 1,184 malicious skills uploaded to OpenClaw's marketplace in 5 days. 12% of the entire registry was compromised. Each skill looked legit but contained credential stealers and reverse shells. 135K agents potentially exposed.

ContextCrush (February 2026) Context7 MCP server had a feature where library owners could set "AI Instructions." No sanitization. Attackers registered fake libraries, embedded malicious instructions, and any coding agent querying that library would execute them - reading .env files, exfiltrating creds.

OX Security findings (April 2026) 7,000+ public MCP servers, 150M+ downloads affected by an architectural flaw enabling arbitrary command execution.

The common thread: none of these attack the model itself. They attack the infrastructure agents depend on - packages, plugins, tool servers. Traditional SCA/SAST tools don't catch poisoned tool descriptions or a backdoored package that was only live for 40 minutes.

Wrote a detailed breakdown here if anyone wants the full technical analysis here

If you're building AI agents or integrating LLMs into production systems, the supply chain is the thing to watch right now. The attacks are monthly and scaling.

What's your team doing for agent supply chain security? Curious how engineering teams are handling this.

reddit.com
u/Still_Piglet9217 — 3 days ago
▲ 3 r/saasbuild+1 crossposts

Microsoft just confirmed prompt injection = RCE. Two CVSS 9.9 bugs in Semantic Kernel turned a chat message into calc.exe on the host.

Microsoft published a retrospective this week on two critical Semantic Kernel CVEs (CVE-2026-26030 and CVE-2026-25592) that were silently patched in February. Both scored CVSS 9.9.

The Python SDK vulnerability: the In-Memory Vector Store's search filter used eval() on user influenced input. A crafted filter value in a vector search broke out of the lambda and gave full code execution on the host. The .NET vulnerability let a hostile prompt steer the agent into writing arbitrary files via an unvalidated DownloadFileAsync helper.

One prompt. No exploit chain. No memory corruption. Just text that a model read and passed downstream to eval().

This isn't theoretical anymore. Every AI agent framework that wires models to tools faces the same architectural problem model output flowing into privileged operations with zero validation. LangChain had code execution bugs in 2023. AutoGPT shipped with unrestricted shell access. The difference is Semantic Kernel runs in Fortune 500 enterprises with access to prod databases and CI/CD.

Microsoft's own words: "once an AI model is wired to tools, prompt injection draws a thin line between content security and code execution."

We wrote up the full technical breakdown with implications for detection

Key takeaways:

  • The eval() pattern shows up constantly in AI tooling (vector store filters, plugin configs, tool parameter validators)
  • Traditional WAFs won't catch this - the payload looks like natural language with Python mixed in
  • Detection needs to understand downstream execution context, not just conversational jailbreaks
  • The fix is architectural (defense in depth, input scanning, strict schema validation) not procedural

Anyone else seeing eval() or equivalent dynamic execution in their AI agent stacks? Curious what frameworks people are running in prod and how they handle tool call validation.

reddit.com
u/Still_Piglet9217 — 10 days ago

We just published a technical breakdown of how we built the detection engine behind Secra (prompt injection detection API).

The short version: instead of throwing every input at an LLM and asking "is this malicious?" we use three layers that progressively escalate only when the previous layer can't make a confident call.

  • Layer 1 : Aho-Corasick pattern matching. 204 known bad strings scanned in a single pass. Under 1ms. Catches 62% of attacks on its own.
  • Layer 2 : Rule engine 8 detection categories (injection, jailbreak, goal hijacking, secret extraction, encoding attacks, etc.) running in parallel. Structural analysis, not just string matching.
  • Layer 3 : Groq LLM (Llama 3 8B). Only fires when layers 1+2 produce an ambiguous score (0.25-0.75 confidence band). Adds 200-400ms but only hits 7% of requests.

End result: 12ms median latency for 93% of scans. 0.3% false positive rate on enterprise prompts.

Full write-up click here

Interested in the architectural trade-offs, deterministic layers for debuggability vs. LLM for intent understanding. Share your thoughts still a lot to learn.

reddit.com
u/Still_Piglet9217 — 15 days ago

Been tracking prompt injection trends this year and the data is pretty clear at this point - direct injection (users typing malicious prompts) is now less than 20% of enterprise attack attempts. The rest enters through data pipelines.

Documents in RAG corpora. Webhook payloads. Tool responses from external APIs. Emails that AI assistants read as context. Shared docs with hidden instructions.

EchoLeak (CVE-2025-32711) hit Microsoft 365 Copilot this way - hidden text in an email that the assistant read, interpreted as instructions, and used to exfiltrate confidential data. No click required. The Slack AI exfiltration was similar - poison a public channel, extract private data from the RAG context.

The PoisonedRAG paper at Usenix showed 90% attack success by injecting just 5 documents into a database of millions.

Most teams secure the model endpoint and ignore the ingestion path. Output filters, rate limits, content classifiers - all useful, all pointed at the wrong layer. The pipeline that feeds context to the model is where trust gets assigned, and that's where it breaks.

Wrote up the full breakdown with the CVEs and what actually works as defense here

Curious if anyone else is seeing this shift in their own threat models?

reddit.com
u/Still_Piglet9217 — 16 days ago

Prompt injection stopped being a chatbot trick this year. Here are the five patterns that changed the threat landscape, with real CVEs and incidents behind each one.

  1. Zero-click data exfiltration. EchoLeak (CVE-2025-32711) hit Microsoft 365 Copilot. A crafted email with hidden text exfiltrated confidential data without the user clicking anything. 60% of enterprise AI copilots showed exfil vulnerabilities in red-team testing.
  2. Tool-call hijacking. AI agents now call APIs, write code, and query databases. Google's Jules agent got fully owned through a single injection. A hidden PR title caused GitHub Copilot, Claude Code, and Gemini CLI to leak their own API keys. OWASP now lists tool misuse as a critical agentic AI risk.
  3. Memory poisoning. Researchers showed that indirect injection can corrupt an agent's long-term memory. The agent develops persistent false beliefs that survive across sessions. Think rootkit, but for AI.
  4. Supply chain attacks. The ClawHavoc campaign uploaded 1,100+ malicious MCP tools to ClawHub. Install one and you get info-stealing malware with whatever permissions the AI agent holds.
  5. Multi-language evasion. Attackers split injection payloads across Mandarin, Arabic, and Portuguese to bypass English-trained classifiers. Unit 42 found these in live production attacks, not just papers.

All five exploit the same root cause: LLMs cannot tell the difference between instructions and data. The defense that works is scanning inputs before they hit the model, not after.

Full write-up with more detail on each pattern: link to https://www.sec-ra.com/blog/prompt-injection-2026-five-attack-patterns

reddit.com
u/Still_Piglet9217 — 17 days ago