r/NeoTiler

RTX 5090 vs M5 Ultra: Analyzing the "2.7x Faster" claim and what Nvidia didn't show you.

Hey everyone,

I recently came across Nvidia's official graphic comparing the RTX 5090 to the M3 Ultra, claiming a "2.7x speed advantage" in local AI tasks (LLMs). As a developer, it felt a bit like comparing apples to oranges, especially with the M5 Ultra just around the corner at WWDC '26.

I did a deep dive into the architecture, memory bandwidth, and what happens when you try to run a 70B+ model on a 32GB VRAM card vs. Apple's Unified Memory.

A few key takeaways from my analysis:

  • The "2.7x" gap is mostly a memory bandwidth story, and the M5 Ultra is expected to narrow it significantly (rumored 1.1-1.2 TB/s).
  • The RTX 5090 hits a wall with 70B+ models (32GB of VRAM forces offloading to system RAM), while the M5 Ultra can hold them entirely in unified memory.
  • The power efficiency gap is still insane (575W vs ~100W).
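To make the bandwidth and capacity points concrete, here's the back-of-envelope math I used. Assumptions are mine, not Nvidia's or Apple's: Q4 quantization at roughly 0.5 bytes per parameter with ~10% overhead for the KV cache and runtime, and decode speed treated as purely bandwidth-bound (each generated token reads all weights once). The 1.1 TB/s figure for the M5 Ultra is the rumored number, not a spec.

```python
# Back-of-envelope: why bandwidth and capacity dominate local LLM decode speed.
# All constants below are assumptions/rumors, not confirmed specs.

def model_size_gb(params_b: float, bytes_per_param: float = 0.5,
                  overhead: float = 1.1) -> float:
    """Approximate weight footprint in GB for a quantized model."""
    return params_b * 1e9 * bytes_per_param * overhead / 1e9

def decode_tokens_per_s(bandwidth_tb_s: float, size_gb: float) -> float:
    """Bandwidth-bound estimate: every decoded token streams all weights once."""
    return bandwidth_tb_s * 1e12 / (size_gb * 1e9)

size_70b = model_size_gb(70)  # ~38.5 GB at Q4: already over a 32 GB card
print(f"70B @ Q4: {size_70b:.1f} GB")
# RTX 5090 (~1.79 TB/s), assuming the model fit in VRAM, which it doesn't:
print(f"RTX 5090 ceiling: {decode_tokens_per_s(1.79, size_70b):.0f} tok/s")
# M5 Ultra at the rumored 1.1 TB/s, model fully resident in unified memory:
print(f"M5 Ultra estimate: {decode_tokens_per_s(1.10, size_70b):.0f} tok/s")
```

The point isn't the exact numbers, it's that once the model spills out of VRAM, the 5090's theoretical ceiling is irrelevant and PCIe transfer speed becomes the bottleneck, while the unified-memory machine keeps running at its full (lower) bandwidth.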

I wrote a full breakdown of the specs, the "Single Die" rumor for the M5, and why Nvidia chose specific small models for their marketing.

Full article here
