u/19khushboo

Hi everyone,
I am working on an experiment where I want to analyze raw network traffic (PCAP files from Wireshark) and then ask natural‑language questions on top of that data using an LLM via an MCP (Model Context Protocol) server.

Goal (high level):

  • Capture traffic using Wireshark / PCAP
  • Analyze raw packet‑level data (not just summaries)
  • Expose this data to an MCP server
  • Ask natural‑language questions (NLQ), e.g.:
    • “Is there any suspicious traffic spike?”
    • “Which IP is generating abnormal packets?”
    • “What protocols dominated during the outage?”
  • I want to keep the system low‑cost, serverless, and focused on deep raw‑data analysis, not just summaries.
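For the parse step, a classic‑format PCAP file can actually be walked with nothing but the Python standard library; the field layout below follows the libpcap file format (24‑byte global header, then a 16‑byte header per packet record). This is only a minimal sketch — in practice scapy, pyshark, or tshark would do the full protocol decoding — and the function name is my own:

```python
import struct
from typing import Iterator, Tuple

# libpcap classic format, little-endian:
PCAP_GLOBAL_HDR = struct.Struct("<IHHiIII")  # magic, major, minor, thiszone, sigfigs, snaplen, linktype
PCAP_RECORD_HDR = struct.Struct("<IIII")     # ts_sec, ts_usec, incl_len, orig_len

def iter_pcap_records(data: bytes) -> Iterator[Tuple[float, bytes]]:
    """Yield (timestamp_seconds, raw_packet_bytes) from classic pcap bytes."""
    magic = struct.unpack_from("<I", data, 0)[0]
    if magic != 0xA1B2C3D4:
        raise ValueError("not a little-endian classic pcap file")
    offset = PCAP_GLOBAL_HDR.size  # skip the 24-byte global header
    while offset + PCAP_RECORD_HDR.size <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = PCAP_RECORD_HDR.unpack_from(data, offset)
        offset += PCAP_RECORD_HDR.size
        yield ts_sec + ts_usec / 1e6, data[offset:offset + incl_len]
        offset += incl_len
```

From here each raw packet would still need its Ethernet/IP/TCP headers decoded into structured rows before an MCP server could usefully query them.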

Any guidance, examples, or design suggestions would be greatly appreciated.
Thanks in advance!

u/19khushboo — 15 days ago · r/AZURE

Hi everyone,

I’m exploring a solution around network traffic analysis using Wireshark (PCAP) data and would really appreciate guidance from people who have built something similar.

Use case

I have Wireshark PCAP files containing network traffic data. My goal is to enable Natural Language Queries (NLQ) such as:

  • “Why were HTTPS connections failing yesterday?”
  • “Which IP generated the most TCP resets?”
  • “Is this traffic spike abnormal compared to baseline?”
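Questions like the TCP‑reset one reduce to plain aggregations once the packets are in structured form — which is part of why the parse‑first approach matters. A sketch, assuming each packet has already been flattened into a dict (the field names here are illustrative, not from any specific tool):

```python
from collections import Counter
from typing import Iterable, List, Mapping, Tuple

def top_tcp_resetters(packets: Iterable[Mapping], n: int = 3) -> List[Tuple[str, int]]:
    """Count TCP packets with the RST flag set, grouped by source IP."""
    counts = Counter(
        p["src_ip"] for p in packets
        if p.get("protocol") == "TCP" and "RST" in p.get("flags", ())
    )
    return counts.most_common(n)
```

The LLM's job is then only to pick this query and phrase the result, not to reason over raw bytes.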

I want the system to:

  • Reason over the packet data (not just keyword search)
  • Provide human‑readable explanations, not raw logs
  • Be usable by people who are not networking experts

From my research so far, it seems like:

  • Raw PCAP files need to be parsed and converted into structured data
  • Classical ML might help with anomaly detection or baselining
  • Generative AI + tool‑based reasoning (e.g., using LLMs) is required for NLQ and explanation
  • MCP‑style or tool‑augmented approaches seem promising for controlled access to data
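On the classical‑ML point, baselining does not have to start heavy: a simple z‑score over per‑interval packet counts already answers "is this spike abnormal compared to baseline?" and gives the LLM a numeric flag to explain. A stdlib‑only sketch (the function name and the 3‑sigma threshold are my own choices, not an established method from any tool):

```python
import statistics
from typing import Sequence

def is_abnormal_spike(baseline_counts: Sequence[float],
                      current_count: float,
                      threshold: float = 3.0) -> bool:
    """Flag current_count if it sits more than `threshold` standard
    deviations above the mean of historical per-interval packet counts."""
    mean = statistics.fmean(baseline_counts)
    stdev = statistics.pstdev(baseline_counts)
    if stdev == 0:
        return current_count > mean
    return (current_count - mean) / stdev > threshold
```

Heavier models (e.g., isolation forests) could replace this later without changing the interface the LLM sees: a boolean plus the numbers behind it.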

I’m specifically looking for advice on the following:

  1. Architecture
    • What would a practical, production‑ready architecture look like for NLQ over network telemetry?
    • Any proven design patterns for combining structured packet data + LLM reasoning?
  2. Machine Learning
    • Where does classical ML realistically fit here (if at all)?
    • Is ML useful only for anomaly flags, or can it contribute more meaningfully?
  3. Cost
    • How expensive does this get in practice (LLMs, storage, query engines)?
    • Any ways to keep costs predictable (e.g., summarization layers, caching, batching)?
  4. Ease of use
    • Are there approaches/tools that minimize heavy ML engineering?
    • Any open‑source stacks that people have successfully used?
  5. Cloud vs self‑hosted
    • Has anyone compared Azure OpenAI / OpenAI‑based approaches vs self‑hosted LLMs for this kind of workload?

Outcome I’m hoping for

A system where:

  • Users ask plain‑English questions
  • The system queries structured network data
  • It applies domain knowledge
  • It returns clear explanations
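The loop above is essentially tool calling: the LLM maps the plain‑English question onto one of a small set of vetted query functions, and the structured result goes back to the model to phrase the explanation. A toy dispatch sketch — the tool names and registry here are hypothetical; in an MCP setup these would be the tools the server registers and exposes:

```python
from collections import Counter

# Hypothetical tool registry; with MCP, these would be server-registered tools.
def top_talkers(packets, n=3):
    """Return the n source IPs that sent the most packets."""
    return Counter(p["src_ip"] for p in packets).most_common(n)

def protocol_breakdown(packets):
    """Return packet counts per protocol."""
    return dict(Counter(p["protocol"] for p in packets))

TOOLS = {"top_talkers": top_talkers, "protocol_breakdown": protocol_breakdown}

def run_tool_call(name, packets, **kwargs):
    """Execute the tool the LLM selected and return its structured result,
    which the model then turns into a human-readable explanation."""
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](packets, **kwargs)
```

Restricting the model to a fixed tool set like this is also what keeps access to the data controlled and the cost per question bounded.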

If you’ve built, evaluated, or even considered something similar, I’d love to hear:

  • What worked
  • What didn’t
  • What you’d do differently

Thanks in advance!

u/19khushboo — 17 days ago