how do you scope an inventory from zero?
Our org is a mid-size financial services company, hybrid environment, mix of on-prem file servers (NetApp NAS), SharePoint Online, and a handful of AWS S3 buckets that different teams have spun up over the years. We're heading into a PCI DSS audit in about 4 months and the auditors want, evidence of a formal sensitive data inventory, not just a network diagram and a promise.
The problem we ran into: we don't actually know where all the cardholder data is. We assumed it was contained to three known systems. Turns out, after a spot check, there are Excel files with PANs sitting in SharePoint libraries that, haven't been touched since 2021, and at least two S3 buckets where nobody's sure what's in them anymore. Classic sprawl situation.
We tried to scope this manually first. Two people, three weeks, partial coverage of maybe 30% of the file shares. Not sustainable and still left the cloud storage completely unaddressed.
We ended up running Netwrix Data Discovery & Classification across the environment, which handled the hybrid scope really well, it covered the NAS and M365 in, the same pass rather than needing separate tools, and the incremental indexing meant we weren't hammering the file servers every time we needed a fresh scan. Took about two weeks to get a full picture, and it surfaced PAN data in locations we hadn't expected, including some Teams channel files. The fact that it ties discovery directly into risk reduction and audit evidence made it a, lot easier to build the case internally for doing this properly rather than just winging it.
Here's the specific question: once you have a classification run complete and you've identified, where the regulated data actually sits, what's your process for deciding what to remediate vs. what to just document and accept? We're debating whether to delete/move the stale SharePoint files outright or just apply tighter access controls and log it as a finding with compensating controls. The auditors haven't given clear guidance on which approach satisfies the intent of requirement 3.2 in this context. Has anyone navigated this with a QSA and gotten a definitive answer on what's acceptable?