u/Bootes-sphere

What Are You Building?
▲ 9 r/micro_saas+1 crossposts

Show us what you’ve been working on this week 👇🏽 Let’s support each other!

Name: OpenSourceAIHub.ai

What it does: We provide an AI Firewall that stops company data from leaking into LLM prompts, and a Smart LLM Router that significantly cuts LLM costs. It’s a drop-in, OpenAI-SDK-compatible proxy that adds real-time multi-modal DLP (PII redaction in text and images via OCR), blocks prompt injections, and automatically routes to the cheapest/fastest model (Llama, Groq, Claude, Grok, etc.). 1M free credits, no card required.

Why use it:

  • 💸 Stop AI data leaks + cut LLM costs by 30% with one API
  • 🛡️ Security: Flexible DLP that automatically redacts PII, PCI, and other sensitive data (emails, API keys, SSNs, and 28+ entity types in total) in both text and images via OCR.
  • 💸 Cost Control: Smart-route requests across Groq, Together AI, DeepInfra, Mistral AI, Anthropic (Claude), OpenAI, Google Gemini, and xAI (Grok) to save up to 90%.
  • 📊 Governance: Enforce per-project budgets and export audit-ready CSV logs.
  • ⚡ Ease: 100% OpenAI SDK compatible. Just change your baseURL and you're protected.
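To make the “just change your baseURL” point concrete, here is a minimal sketch. The proxy endpoint URL and env-var name below are illustrative assumptions, not documented values:

```python
import os

def openai_client_kwargs(use_proxy: bool) -> dict:
    """Kwargs for openai.OpenAI(); enabling the proxy only changes base_url."""
    kwargs = {"api_key": os.environ.get("PROXY_API_KEY", "sk-placeholder")}
    if use_proxy:
        # Hypothetical proxy endpoint; everything else in your code stays the same.
        kwargs["base_url"] = "https://api.opensourceaihub.ai/v1"
    return kwargs

# With the OpenAI SDK it would then be:
#   client = openai.OpenAI(**openai_client_kwargs(use_proxy=True))
#   client.chat.completions.create(model="gpt-4o-mini", messages=[...])
```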

Latest Update: Just launched our multi-modal OCR scan: we now catch PII in screenshots before they reach the model provider.

Discount: 1M free credits on signup; wallet top-ups and a Pro BYOK tier are also available.

Drop your project below, let’s support each other 👇🏽

u/Bootes-sphere — 5 hours ago
Architecture Review: Preventing "Shadow AI" data leaks with a stateless PII firewall
▲ 7 r/cybersecurity+1 crossposts

Most "AI Gateways" are just loggers. I’ve been working on a design for an active firewall that redacts sensitive data (PII, PCI, Secrets) before it reaches the LLM provider.

The Security Posture:

  1. Stateless Sovereignty: Prompts processed in volatile memory only. No content persistence.
  2. Fail-Closed Logic: If the scanner fails, the request is rejected with a 500. Zero unscanned data leakage.
  3. IP Guard: Custom regex-based detection for internal project names and proprietary terminology.
  4. Multi-Modal: OCR-scan of images to catch PII in screenshots.
  5. Audit Trail: Metadata logging only (Violation type + timestamp).
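The fail-closed rule (point 2) is the load-bearing one, so here is a toy sketch of it. The `scan` function is a stand-in for any DLP scanner, with a single illustrative email pattern:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")

def scan(prompt: str) -> str:
    """Redact emails; any unexpected input raises."""
    return EMAIL.sub("[REDACTED_EMAIL]", prompt)

def firewall(prompt):
    """Fail closed: if the scanner itself errors, reject rather than forward."""
    try:
        clean = scan(prompt)
    except Exception:
        # No unscanned content ever leaves the process.
        return 500, None
    return 200, clean
```

The key design choice is that the `except` branch returns an error instead of falling back to forwarding the raw prompt.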

I’m looking for feedback from security pros: If you were auditing a vendor like this, what is your #1 concern? Does "Metadata-only logging" satisfy your audit requirements for SOC2/HIPAA?
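For context on that last question, here is the shape a “metadata-only” audit record (point 5) might take in a design like this; the field names are illustrative:

```python
import json
import time

def audit_record(violation_type: str, project: str) -> str:
    """Log what was caught and when, never the prompt content itself."""
    return json.dumps({
        "ts": int(time.time()),
        "project": project,
        "violation": violation_type,  # e.g. "email", "api_key"
        # deliberately no prompt text, no redacted value, no user content
    })
```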

I’ve documented the architecture here: https://opensourceaihub.ai/security

Would love to hear where the "weak links" are in this proxy model.

u/Bootes-sphere — 5 hours ago
Prompt-level data leakage in LLM apps — are we underestimating this?

Something we ran into while working on LLM infra: Most applications treat prompts as “just input”, but in practice users paste all kinds of sensitive data into them. We analyzed prompt patterns across internal testing and early users and found:

- Frequent inclusion of PII (emails, names, phone numbers)

- Accidental exposure of secrets (API keys, tokens)

- Debug logs containing internal system data

This raises a few concerns:

  1. Prompt data is sent to third-party models (OpenAI, Anthropic, etc.)

  2. Many apps don’t have any filtering or auditing layer

  3. Users are not trained to treat prompts as sensitive

We built a lightweight detection layer (regex + entity detection) to flag:

- PII

- credentials

- financial identifiers
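A toy version of that regex + entity-detection layer might look like the following. The patterns are illustrative, not the actual entity set:

```python
import re

PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),  # OpenAI-style secret
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),  # naive PAN match
}

def flag_entities(prompt: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs found in a prompt."""
    hits = []
    for name, pat in PATTERNS.items():
        hits += [(name, m.group()) for m in pat.finditer(prompt)]
    return hits
```

Regexes like these over-match on edge cases (e.g. the PAN pattern), which is why real systems layer entity detection and validation on top.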

Not perfect, but surprisingly effective for common leakage patterns.

Quick demo here:

https://opensourceaihub.ai/ai-leak-checker

Curious how others here are thinking about this:

- Are you filtering prompts before sending?

- Or relying on provider-side policies?

- Any research or tools tackling this systematically?

u/Bootes-sphere — 5 hours ago
Two things that kept breaking for me: LLM costs and prompt leaks

Been hacking on something recently and wanted to get a reality check.

While working with LLM APIs, I noticed two things pretty quickly:

costs can get unpredictable depending on the model, and people paste way more sensitive stuff into prompts than you’d expect.

The second one was kind of surprising. Stuff like API keys, emails, logs… just regular debugging-type usage. And it all just gets sent straight out to whatever model you’re using.

I didn’t have anything in place either, so I added a thin layer in front to:

- catch obvious sensitive data  

- and route to cheaper/faster models when possible  

It’s pretty simple, but it actually helped more than I expected. Not sure if others are seeing this too or if I’m just over-indexing on my own use case.
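The routing half of that thin layer can be naive and still help: send short, simple prompts to a cheap fast model and escalate long or complex ones. Model names, keywords, and the token threshold here are illustrative assumptions:

```python
CHEAP_MODEL = "llama-3.1-8b-instant"  # e.g. served by Groq
STRONG_MODEL = "claude-sonnet-4"      # pricier, for harder prompts

def pick_model(prompt: str, max_cheap_tokens: int = 500) -> str:
    approx_tokens = len(prompt) // 4  # rough 4-chars-per-token heuristic
    needs_reasoning = any(
        k in prompt.lower() for k in ("prove", "step by step", "refactor")
    )
    if approx_tokens > max_cheap_tokens or needs_reasoning:
        return STRONG_MODEL
    return CHEAP_MODEL
```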

u/Bootes-sphere — 6 hours ago
LLM costs and prompt leaks turned out to be bigger problems than I expected
▲ 3 r/learnmachinelearning+1 crossposts

Been working on something recently and wanted a sanity check from people here.
While building with LLM APIs, I kept running into two things:

- costs getting kind of unpredictable depending on which model/provider was used  

- people pasting sensitive stuff into prompts without really thinking about it

So I started putting a thin layer in front of the requests to catch obvious sensitive data before it leaves, and to route requests to cheaper/faster models when possible.

Nothing too fancy, just trying to solve the same issues I kept hitting. https://opensourceaihub.ai/

u/Bootes-sphere — 6 hours ago
LLM costs and prompt leaks turned out to be bigger problems than I expected
▲ 8 r/micro_saas+2 crossposts

I have been working on something recently and wanted a sanity check from folks here.

While building with LLM APIs, I kept running into two issues:

- costs getting unpredictable depending on model/provider

- users occasionally pasting sensitive stuff into prompts without realizing it

So I started putting a thin layer in front of model calls to catch obvious sensitive data before it gets sent out, and to route requests to cheaper/faster models when possible.

Nothing too complex, just trying to solve a couple of practical problems I kept seeing.

Curious what you guys think about this tool.

u/Bootes-sphere — 6 hours ago
▲ 2 r/SaaS

What do you guys think about this idea?

I am building a tool to cut costs and protect prompt data, which I think will help devs and startup founders who rely on LLM services and wholesale providers such as Groq or Together AI. Nothing too complex, just trying to solve a couple of practical problems I kept seeing.
While building with LLM APIs (say, Anthropic Claude), I kept running into two issues:

- costs getting unpredictable depending on model/provider

- users occasionally pasting sensitive stuff into prompts without realizing it

I’ve been working on this recently and wanted a sanity check from folks here.

u/Bootes-sphere — 7 hours ago

How are we handling sensitive details and internal logs ending up in LLM prompts? (I will not promote)

Hey r/startups,

I am a software architect shipping LLM features into apps, and every day I see sensitive content getting pushed into LLMs, of course not intentionally. Either way, that is a serious risk.

I mean, as devs, this is just normal workflow. But every one of those prompts was heading straight to whichever model we had connected, with zero filtering or redaction. It made me realize how exposed most LLM integrations probably are right now.

So I built a leak detector and proxy that I think will help startup founders and devs who use LLMs for everyday tasks and integration workflows. Happy to share more details or architecture thoughts, and I can share the link if anyone is interested. I would love to hear your feedback and suggestions.

u/Bootes-sphere — 8 hours ago