
High-Speed Data Transfer on ZynqMP: Moving PL Data to NVMe at ~12 Gbps
Hey everyone,
This week, I tackled a data transfer challenge on a Zynq UltraScale+ MPSoC paired with a Gen-2 NVMe SSD. The goal was to stream image data continuously from a reserved PS DDR region (populated by PL) to persistent storage at 8+ Gbps.
After experimenting with several approaches (generic-uio vs. udmabuf, O_DIRECT EFAULT headaches, and Linux CMA panics), I finally reached nearly 12 Gbps in my pipeline! For context, my raw fio benchmarks showed only a slightly higher ceiling, so this real-world implementation runs very close to the hardware limit.
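For anyone who has not used O_DIRECT before, here is a minimal sketch of the basic constraints it imposes. This is not the code from the Gist: the names `dio_ok`, `open_direct`, `write_all`, and `DIO_ALIGN` are made up for illustration, and the 4096-byte alignment is an assumption (the real requirement depends on the device's logical block size). It only covers the alignment side, where a misaligned buffer, length, or offset usually fails with EINVAL. It does not reproduce the EFAULT case mentioned above.

```c
#define _GNU_SOURCE /* asks glibc to expose O_DIRECT */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef O_DIRECT
#define O_DIRECT 0 /* not exposed here: the sketch degrades to buffered I/O */
#endif

/* Assumption: 4096 covers the logical block size of the target device. */
#define DIO_ALIGN 4096u

/* Returns 1 if buffer address, length, and file offset all satisfy DIO_ALIGN. */
int dio_ok(const void *buf, size_t len, off_t off)
{
    return len > 0 && off >= 0 &&
           (uintptr_t)buf % DIO_ALIGN == 0 &&
           len % DIO_ALIGN == 0 &&
           (uint64_t)off % DIO_ALIGN == 0;
}

/* Opens path for direct writes; retries buffered if O_DIRECT is rejected. */
int open_direct(const char *path)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    return fd;
}

/* Writes all len bytes, retrying on EINTR and short writes.
 * Returns 0 on success, -1 with errno set on failure. */
int write_all(int fd, const void *buf, size_t len)
{
    const uint8_t *p = buf;

    assert(fd >= 0);
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted before any data moved: retry */
            return -1;
        }
        if (n == 0) {       /* no progress: avoid spinning forever */
            errno = EIO;
            return -1;
        }
        p += (size_t)n;
        len -= (size_t)n;
    }
    return 0;
}
```

A caller would get a buffer from `posix_memalign(&buf, DIO_ALIGN, len)` or from an mmap'd region, check it with `dio_ok(buf, len, 0)`, and then pass it to `write_all()` on the descriptor returned by `open_direct()`.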
I've compiled my benchmarks, the pitfalls I encountered, and the final working architecture into a short Gist. I hope it saves some debugging time for anyone building high-throughput pipelines on embedded Linux:
https://gist.github.com/CaglayanDokme/9646e12533fe9ba84ef7f79906940956
I'd be glad to hear your feedback or learn how you folks handle similar zero-copy pipelines. Have a great weekend out there!
Special thanks to Ichiro Kawazome, the author of the udmabuf driver. Without his driver, this work would have been much more cumbersome for me.