Running two identical MPI jobs at once slows both down drastically on Intel Alder Lake but not on Threadripper. Is this normal?
Hi everyone,
I regularly run multiple parallel MPI jobs simultaneously on my workstations. I have two systems:
- Intel i7-12700 (12 cores: 8 P-cores + 4 E-cores), OS: Ubuntu 20.04
- AMD Threadripper 3960X (24 cores, 48 threads), OS: Ubuntu 18.04
I wrote a simple C++ MPI test program that runs with mpirun -np 2. On both machines, a single instance finishes in about 12 seconds.
The problem appears when I run two instances at the same time (both mpirun -np 2):
- Threadripper: Both finish in ~12 seconds (no slowdown)
- Intel: Both take ~30 seconds (significant slowdown)
I tried pinning processes to specific cores with taskset and with mpirun's --cpu-set option. The processes do land on the intended cores (I verified with ps), but the slowdown persists.
Is this expected behavior for Alder Lake? Could the hybrid P-core/E-core architecture be causing memory bandwidth contention? Or am I missing something else?
I'm trying to figure out if my Intel system is performing normally or if I should be hunting for a configuration issue.
Additional notes:
- My code shows reasonable, normal speed-up as the number of cores increases on both systems
- The Intel PC has only one memory stick
- The AMD PC has multiple memory sticks
- My test code is not memory intensive (mostly CPU math)
I can provide more details if needed. I'm not super knowledgeable about CPU architectures, so apologies in advance.
Thanks for any insights!