u/Ok_Performer1647


Built a PE Malware Analysis Pipeline to Learn Why Most Detection Tools Suck at Correlation

I've been doing reverse engineering and malware analysis for some time now, and I noticed something frustrating: every detection tool flags isolated signals separately. One tool screams "entropy is high!" Another yells "found injection APIs!" A third matches a YARA rule. But nobody tells you whether these signals actually mean your binary is malicious or just legitimate software doing normal things.

So I built Binary Atlas—a static PE analysis engine that runs 14 detectors but scores confidence instead of just screaming alerts.

Why This Matters:

Most tools have insane false positive rates on legitimate Windows utilities

Single signals (high entropy, API imports, YARA matches) are meaningless in isolation

Correlation > Isolation

How It Works (5 Steps):

Check if Windows trusts it (valid Authenticode signature) → LOW risk

Parse PE headers, sections, imports, strings, hashes

Run 14 detectors (packing, anti-analysis, persistence, shellcode, etc.)

Unified classifier deduplicates findings and weights signals

Score confidence (HIGH/MEDIUM/LOW) + generate detailed reports
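To make the flow concrete, here's a minimal sketch of how a pipeline like this might be wired up in Python with pefile. This is illustrative only, not the actual Binary Atlas code: the two toy detectors, the weights, and the thresholds are invented, and step 1 (Authenticode verification) is stubbed out since doing it properly means WinVerifyTrust on Windows or a library like signify.

```python
# Illustrative sketch only, not the real Binary Atlas code. Detector logic,
# weights, and thresholds are invented. Requires: pip install pefile
import math
import sys

import pefile

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 = constant, 8.0 = random."""
    if not data:
        return 0.0
    counts = [0] * 256
    for b in data:
        counts[b] += 1
    return -sum((c / len(data)) * math.log2(c / len(data)) for c in counts if c)

def packing_detector(pe) -> list:
    """Toy detector: near-random section entropy often means packed/encrypted."""
    return [{"signal": "high_entropy_section", "weight": 0.3}
            for s in pe.sections if shannon_entropy(s.get_data()) > 7.2]

def injection_detector(pe) -> list:
    """Toy detector: classic process-injection imports."""
    wanted = {b"CreateRemoteThread", b"WriteProcessMemory", b"VirtualAllocEx"}
    hits = []
    for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", []):
        for imp in entry.imports:
            if imp.name in wanted:
                hits.append({"signal": imp.name.decode(), "weight": 0.25})
    return hits

DETECTORS = [packing_detector, injection_detector]  # the real tool runs 14

def analyze(path: str) -> dict:
    # Step 1 (Authenticode verification) is skipped in this sketch; see note above.
    pe = pefile.PE(path)                      # step 2: parse headers/sections/imports
    findings = []
    for detector in DETECTORS:                # step 3: run every detector
        findings.extend(detector(pe))
    unique = {f["signal"]: f for f in findings}.values()  # step 4: deduplicate...
    score = sum(f["weight"] for f in unique)              # ...and weight signals
    # Step 5: map the combined score to a confidence bucket.
    verdict = "HIGH" if score >= 0.7 else "MEDIUM" if score >= 0.3 else "LOW"
    return {"confidence": verdict, "score": round(score, 2),
            "signals": sorted(f["signal"] for f in unique)}

if __name__ == "__main__":
    print(analyze(sys.argv[1]))
```

On a packed sample that imports WriteProcessMemory, this prints something like {'confidence': 'MEDIUM', 'score': 0.55, ...}.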

What Makes It Different:

Instead of: "Found CreateRemoteThread—FLAGGED!"

Binary Atlas does:

CreateRemoteThread detected ✓ (confidence: MEDIUM—debuggers use this)

WriteProcessMemory detected ✓ (confidence: MEDIUM—could be legitimate)

Registry persistence APIs detected ✓ (confidence: MEDIUM)

Anti-debug checks in strings ✓ (confidence: MEDIUM)

Unified result: "All 4 signals pointing toward injection + persistence = HIGH confidence malware"
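Roughly, the unification step behaves like this sketch (the signal-to-technique mapping and the escalation rule are my shorthand for the idea, not the real classifier):

```python
# Hypothetical illustration of the unified-classifier idea, not the actual
# implementation: several MEDIUM signals that corroborate each other escalate to HIGH.
SIGNAL_TO_TECHNIQUE = {
    "CreateRemoteThread": "injection",
    "WriteProcessMemory": "injection",
    "RegSetValueExW":     "persistence",
    "IsDebuggerPresent":  "anti-debug",
}

def unify(signals: list) -> str:
    techniques = {SIGNAL_TO_TECHNIQUE[s] for s in signals if s in SIGNAL_TO_TECHNIQUE}
    # One signal alone stays MEDIUM; multiple signals spanning multiple
    # technique categories are treated as corroborating evidence.
    if len(signals) >= 3 and len(techniques) >= 2:
        return "HIGH"
    return "MEDIUM" if signals else "LOW"

# The four findings from the example above, taken together:
print(unify(["CreateRemoteThread", "WriteProcessMemory",
             "RegSetValueExW", "IsDebuggerPresent"]))  # -> HIGH
```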

The 14 Detectors:

Packing analysis | Anti-analysis detection | Persistence mechanisms | DLL/COM hijacking | Shellcode patterns | Import anomalies | Resource analysis | Mutex signatures | Overlay detection | String entropy | YARA scanning | Compiler identification | Threat classification | Security headers
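For a taste of what a single module looks like, here's a toy version of the YARA-scanning detector using yara-python; the rule is a throwaway example, not one that ships with the tool:

```python
# Toy sketch of one detector module using yara-python (pip install yara-python).
# The rule below is a throwaway example, not a rule shipped with Binary Atlas.
import yara

RULES = yara.compile(source=r"""
rule anti_debug_strings
{
    strings:
        $a = "IsDebuggerPresent" ascii
        $b = "CheckRemoteDebuggerPresent" ascii
    condition:
        any of them
}
""")

def yara_detector(path: str) -> list:
    """Return one weighted finding per matching rule, same shape as the other detectors."""
    return [{"signal": "yara:" + m.rule, "weight": 0.3} for m in RULES.match(path)]
```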

Current Limitations:

Static analysis only (to be honest, sandboxing the file is what confirms everything)

High false positives on some legitimate software

Looking for feedback on:

How to reduce false positives further?

Which detection modules would be most useful?

Any malware researchers want to contribute better YARA rules?

Check out the GitHub repo: https://github.com/bilal0x0002-sketch/Binary-Atlas/

u/Ok_Performer1647 — 10 days ago

I’ve been working on a PE static analysis pipeline and ran into a consistent issue: individual signals (imports, entropy, strings, section layout) were too weak on their own to make reliable decisions.

For example:

  • high entropy alone didn’t reliably indicate packing
  • suspicious imports often appeared in legitimate software
  • string-based indicators were easy to evade or noise-heavy

What worked better was correlating multiple signals together instead of treating them independently:

  • imports + entropy
  • section anomalies + packing heuristics
  • API patterns + string artifacts
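As a minimal sketch of the first pairing (all weights and thresholds invented for illustration):

```python
# Invented numbers, just to show the shape of the idea: each signal alone is
# worth little, but the co-occurrence earns a bonus on top of the parts.
def correlated_score(max_section_entropy: float, imports: set) -> float:
    injection_apis = {"CreateRemoteThread", "WriteProcessMemory", "VirtualAllocEx"}
    packed = max_section_entropy > 7.2
    injecting = bool(imports & injection_apis)
    score = 0.2 * packed + 0.2 * injecting
    if packed and injecting:
        score += 0.4  # the correlation bonus
    return score

print(correlated_score(7.6, {"WriteProcessMemory", "LoadLibraryA"}))  # 0.8
```

Neither signal clears a reporting threshold on its own, but the pair does.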

This improved triage, but it didn’t solve the core problem: static signals alone aren’t fully reliable, and I still end up using sandboxing for confirmation.

I built a small tool around this idea (14 static analysis modules + scoring system), but false positives are still a major challenge.

Curious how others approach this in practice:

  • rule-based scoring systems?
  • weighted heuristics?
  • ML-based approaches?
  • hybrid static + dynamic pipelines?