u/Any_Artichoke7750

Better options than vendor-managed Docker security images?

a vendor handles the scanning part of our docker security stack. every week their own components show new CVEs in the scanner image.

we open tickets, they either get marked low priority or sit without response. last real reply was weeks ago.

compliance doesn’t care where the CVE comes from. scan fails, audit flags it, and it lands on us.

we tried pushing contract clauses around secure delivery and patch timelines, but once it’s upstream OSS inside their image, everything slows down.

right now we’re logging formal risk acceptances with compensating controls just to stay audit compliant. documented, signed, reviewed.

starting to feel like the bigger issue is relying on vendor-bundled images we don’t control.
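one thing that made the weekly risk-acceptance paperwork survivable for us was diffing successive scan reports, so each week only the genuinely new CVEs need fresh triage and the carried-over ones map to existing acceptances. a rough sketch, assuming Trivy-style JSON output (the report shape here is an assumption, adjust for whatever scanner the vendor ships):

```python
def cve_ids(report: dict) -> set:
    """Collect CVE IDs from a Trivy-style JSON report (shape assumed)."""
    ids = set()
    for result in report.get("Results", []):
        for vuln in result.get("Vulnerabilities") or []:
            ids.add(vuln["VulnerabilityID"])
    return ids

def weekly_diff(prev: dict, curr: dict) -> dict:
    """Split the current report into new / carried-over / fixed CVEs."""
    prev_ids, curr_ids = cve_ids(prev), cve_ids(curr)
    return {
        "new": sorted(curr_ids - prev_ids),      # needs fresh triage or a ticket
        "carried": sorted(curr_ids & prev_ids),  # should map to an existing risk acceptance
        "fixed": sorted(prev_ids - curr_ids),    # close these acceptances out
    }
```

the "carried" bucket is also useful evidence for auditors that a finding was already formally accepted, not ignored.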

has anyone managed to get vendors to move on this, or did you reduce dependency on their images?

reddit.com
u/Any_Artichoke7750 — 2 days ago

I cannot even process what happened today. We built this whole system around an anti-bot browser agent using stealth web scraping techniques for MFA browser automation. Thought we were so smart using a fancy AI agent browser tool that relies on fixed CSS selectors to interact with client websites. Our demos even featured our human-like web automation.

This morning the main client site does a tiny UI refresh. They change one button class from 'submit btn primary' to 'btn primary submit'. That's it. Our entire automation pipeline explodes. Every single task fails because the selectors no longer match. Hundreds of pending jobs across 15 client accounts just halt. Production scraping stops dead. Users see errors everywhere. Support lines blow up.

I spent the whole day in emergency mode manually clicking through browsers while our team scrambles to update selectors. Turns out this has happened four times in the last year with different sites. We are stuck in this constant maintenance hell because the tool depends on these fragile fixed structures. Clients are yelling about SLAs and we look like complete idiots.

Need advice on moving to something like computer-vision-driven browser automation that adapts instead of breaking on every markup change. Has anyone else had their browser automation tool nuke production from a minor UI tweak?
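For what it's worth, the specific failure you describe (a class reorder) is avoidable even before reaching for CV: a class attribute is an unordered set of tokens, so matching the set instead of the exact attribute string survives reorders. A minimal illustration in plain Python (helper names are mine, not from any tool):

```python
def class_set(class_attr: str) -> frozenset:
    """Treat a class attribute as the unordered token set it actually is."""
    return frozenset(class_attr.split())

def matches(element_classes: str, required: str) -> bool:
    """True if the element carries every required class, in any order."""
    return class_set(required) <= class_set(element_classes)
```

In CSS terms this is the difference between `[class="submit btn primary"]` (exact string, breaks on reorder) and `.submit.btn.primary` (per-class match, order-independent). If your tool only lets you supply raw selectors, switching to the chained-class form may buy you a lot of resilience without replatforming.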

u/Any_Artichoke7750 — 8 days ago

Recurring orphan account audit findings every quarter? How to fix unmanaged in-house apps with Okta & SailPoint

Third quarter in a row our access review flagged orphan accounts in the same three apps. We close them, document it, move on. Next quarter, same apps, same finding.

~700 people. Okta for SSO, SailPoint for governance. These apps were built in-house years ago and never really got onboarded into anything central. Every joiner/mover/leaver is handled manually if someone remembers. Most of the time they don't.

Auditors called it a process gap. But the process isn't the issue.

The apps aren't part of any real governance workflow — no IdP connection, no IGA coverage, no automated provisioning or deprovisioning. Every fix is manual and temporary because the visibility underneath doesn't exist.

We're fixing symptoms every quarter because nothing structural changed.
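One stopgap that has broken this loop elsewhere, short of full IGA onboarding: a scheduled reconciliation job that diffs each app's local user table against the active roster from Okta/HR, so orphans surface days after a leaver instead of at the next quarterly review. A rough sketch (the input shapes are assumptions about your data, not any SailPoint API):

```python
def find_orphans(app_accounts: set[str],
                 active_identities: set[str],
                 known_service_accounts: set[str]) -> set[str]:
    """Accounts that exist in the app but map to no active identity
    and are not a known, owned service account."""
    return app_accounts - active_identities - known_service_accounts
```

It doesn't replace governance, but running it weekly per app at least turns "same finding, same apps" into a continuously-clean state you can show auditors while the structural onboarding gets funded.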

Has anyone broken this cycle, or does it just keep looping until something worse forces it?

u/Any_Artichoke7750 — 8 days ago

8 months of sending weekly security reports. Engineering triages maybe 30% of each one. The rest ships.

I don't think either side is wrong. Reports run 200-plus findings, half of which are false positives or packages that exist in the build layer but not at runtime. Nobody has 3 hours a week to go through all of it, so people skim, pick what looks bad, and move on. Real criticals get buried in the same list as everything else.
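A 30% triage rate may actually be rational given that signal-to-noise. One thing that has helped elsewhere is pre-filtering the report before engineering ever sees it: drop build-layer-only and suspected false positives, and sort the remainder by severity so criticals physically can't get buried. A toy sketch (the finding fields are assumptions about your report format):

```python
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def triage_queue(findings: list[dict]) -> list[dict]:
    """Keep only runtime-present, non-false-positive findings,
    sorted so the worst severity comes first."""
    actionable = [
        f for f in findings
        if f.get("in_runtime") and not f.get("false_positive")
    ]
    return sorted(actionable, key=lambda f: SEVERITY_RANK[f["severity"]])
```

If the filtered queue routinely comes back under a dozen items, the ask to engineering changes from "read 200 findings" to "clear this short list", which is a much easier norm to enforce than a sign-off gate a VP can override.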

Tried requiring security sign-off before deploys. VP overrode it 6 weeks in during a release crunch. After that everyone knew it was soft and the process never recovered.

At this point I genuinely don't know if this is a tooling problem or just a people and incentives problem. 

Has anyone gotten engineering to consistently engage with security findings, or does it always end up like this when release pressure is high?

u/Any_Artichoke7750 — 17 days ago

We’re a mid-sized org, around 650 people, running Okta as the main IdP and SailPoint for access reviews. The problem is not the apps already connected to Okta. It’s everything that never made it there.

Custom internal tools with local user tables. Older admin portals still using basic auth. Vendor apps someone set up before we had a real IAM process. A few apps support SAML but were never federated. Some have service accounts nobody owns anymore.

That is the part our current stack does not really answer. Okta shows what is onboarded. SailPoint governs what was connected. CASB catches some SaaS usage. None of them give us a clean view of the full application estate or which apps sit outside central identity.

I’ve been looking at a few options:

  • Orchid Security seems focused on finding unmanaged apps and apps sitting outside normal identity controls, including things missing from Okta/Entra/IGA. Not sure how well it handles custom internal apps and local auth.
  • SailPoint is useful for governance, but depends on the app being known and connected first.
  • Saviynt is good for governance and compliance, less clear to me on unknown app discovery.
  • Microsoft Entra ID Governance seems strongest once the app is already part of the identity process.
  • Lumos looks interesting for SaaS inventory, not sure how deep it goes into internal or custom apps.

Questions I’m trying to answer:

Can any of these discover apps that are not federated through the IdP? Do they identify local user stores and orphaned accounts, or do they mostly show inventory?

How are people mapping app owners when the original team is gone?

Not trying to replace IGA. Trying to find what exists outside the identity inventory before auditors do.
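Whatever tool ends up doing the discovery, the underlying question is a set difference: everything observed in use (CASB, proxy/DNS logs, expense data) minus everything the identity stack knows about (Okta app catalog, SailPoint sources). A toy sketch of that gap report, just to make the shape of the problem concrete (all inputs are assumptions, not any vendor API):

```python
def identity_gap(observed_apps: set[str],
                 okta_apps: set[str],
                 iga_apps: set[str]) -> dict[str, set[str]]:
    """Apps seen in use but absent from the IdP and/or governance."""
    governed = okta_apps | iga_apps
    return {
        "ungoverned": observed_apps - governed,          # discovery candidates
        "federated_not_governed": okta_apps - iga_apps,  # in Okta, no IGA coverage
    }
```

The hard part the tools differ on is populating `observed_apps` for internal/custom apps with local auth, which never show up in SaaS-oriented feeds; that's worth pressing each vendor on specifically.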

u/Any_Artichoke7750 — 18 days ago

started with 3 sites, all in the same region. visibility was fine, everything fed into one dashboard, team could see what was happening.

added 8 more sites over 18 months, spread across the US and Europe. that is where it fell apart.

not the connectivity. connectivity held up. problem was that the security visibility tools we had were built around the assumption that traffic stays regional. once we had sites in multiple regions, log aggregation started lagging, alerts were firing with 20 to 40 minute delays, and correlation across sites was basically manual.

found out about a policy violation in the EU 2 days after it happened. Not because the tool missed it, it logged it fine. But nobody was watching that feed and the alert routing was never set up for that region properly.

the monitoring that worked at 4 sites does not scale the same way to 11. I do not think that is controversial. But what I did not expect was how fast it got unmanageable and how much of it was configuration we never updated as we grew.
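part of this sounds like plain config drift: alert routing written for the original region and never extended as sites were added. a cheap guardrail is a check (in CI or a nightly job) that every active site has an alert route with a receiver, so a new site can't silently go live with nobody watching its feed. a sketch, with a made-up config shape:

```python
def routing_gaps(sites: list[str], routes: dict[str, dict]) -> list[str]:
    """Sites with no alert route, or a route with no receiver —
    i.e. the feeds nobody is actually watching."""
    gaps = []
    for site in sites:
        route = routes.get(site)
        if route is None or not route.get("receiver"):
            gaps.append(site)
    return gaps
```

failing the pipeline when this list is non-empty turns "we forgot to update routing for the new region" from a 2-day-late discovery into a deploy-time error.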

trying to figure out if this is a tooling problem or just operational gaps we need to close. Anyone dealt with visibility breaking down as the environment scaled globally? What actually helped?

u/Any_Artichoke7750 — 21 days ago