u/valalalalala

Radeon R9700 fails under Ubuntu 24

Computer Type: Desktop

GPU: AMD Radeon AI PRO R9700 32GB (XFX)

CPU: AMD Ryzen 5 5500

Motherboard: Consumer AM4 desktop board (no ACPI TMR exposed)

BIOS Version: July 2025 VBIOS P02 (07/25/2025)

RAM: N/A

PSU: N/A

Case: N/A

Operating System & Version: Ubuntu 24.04

GPU Drivers: amdgpu-dkms 6.16.13 (ROCm 7.2.3)

Kernel Version: 6.17.0-23-generic

Chipset Drivers: N/A

Background Applications: N/A

Description of Original Problem:
amdgpu fails to initialize the AMD Radeon AI PRO R9700 (gfx1201 / Navi48) when using the July 2025 VBIOS (P02). The card never fully initializes for compute:

  • no /dev/accel/accel0
  • no KFD
  • rocminfo shows CPU only
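
For anyone wanting to compare symptoms, here is a minimal sketch that checks for the device nodes mentioned above. The `/dev/kfd` path is the usual KFD node. The `/dev/dri/renderD*` check is an extra assumption that is not part of the original report, and `compute_nodes` is a hypothetical helper name.

```python
from pathlib import Path


def compute_nodes(dev_root="/dev"):
    """Report which compute-related device nodes exist under dev_root."""
    root = Path(dev_root)

    def names(subdir, pattern):
        # Return matching entry names, or an empty list if the directory is absent.
        directory = root / subdir
        if not directory.is_dir():
            return []
        return sorted(p.name for p in directory.glob(pattern))

    return {
        "kfd": (root / "kfd").exists(),          # KFD compute node
        "accel": names("accel", "accel*"),       # nodes such as /dev/accel/accel0
        "render": names("dri", "renderD*"),      # assumption: DRM render nodes
    }


if __name__ == "__main__":
    for key, value in compute_nodes().items():
        print(f"{key}: {value}")
```

On a system showing the failure described here, `kfd` would be expected to come back `False` and `accel` empty.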

Diagnostic patch output:

amdgpu: ip discovery actual sig: 0xce854377 expected: 0x28211407

Investigation suggests the VBIOS now writes a PSP-encrypted IP discovery binary into VRAM with signature 0xce854377 instead of the plaintext signature 0x28211407 that the driver expects.

The payload appears encrypted/ciphertext beyond the first few bytes:

00000000: 77 43 85 ce ...
00000010: 8b e3 d7 42 52 fb ...

amdgpu_discovery_check_binary_valid() rejects the blob immediately, preventing DRM accel/KFD initialization.
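
To show how the dump lines up with the logged value, here is a small sketch that reads the first four bytes of the dump as a little-endian 32-bit integer and compares them with the two signatures quoted above. The function names and the "new-format" label are made up for illustration. This only mirrors the signature comparison, not the rest of the driver's validation.

```python
import struct

EXPECTED_SIGNATURE = 0x28211407  # plaintext signature the driver expects (from the post)
OBSERVED_SIGNATURE = 0xCE854377  # signature reported by the diagnostic patch


def read_signature(blob: bytes) -> int:
    """Return the first four bytes of blob as a little-endian u32."""
    if len(blob) < 4:
        raise ValueError("need at least 4 bytes to read a signature")
    (sig,) = struct.unpack_from("<I", blob, 0)
    return sig


def classify(blob: bytes) -> str:
    """Label a discovery blob by its leading signature (labels are illustrative)."""
    sig = read_signature(blob)
    if sig == EXPECTED_SIGNATURE:
        return "plaintext"
    if sig == OBSERVED_SIGNATURE:
        return "new-format"
    return "unknown"


# First four bytes of the VRAM dump shown above: 77 43 85 ce
dump_head = bytes.fromhex("774385ce")
print(hex(read_signature(dump_head)), classify(dump_head))  # 0xce854377 new-format
```

The byte order explains why the dump starts with `77 43 85 ce` while the log prints `0xce854377`.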

Additional findings:

  • IP_DISCOVERY_V4 exists in amdgpu_discovery.c
  • but there is no actual V4 implementation
  • sysmem/ACPI fallback returns -ENOENT
  • most consumer Ryzen desktop boards do not expose the required ACPI TMR table
  • the encrypted blob may exceed DISCOVERY_TMR_SIZE (10240 bytes, i.e. 10 KiB)

This appears to affect Navi48/gfx1201 cards shipping with July 2025+ VBIOS revisions.

Troubleshooting:

  • Tested with the amdgpu module parameter exp_hw_support=1
  • Tested multiple kernels including 6.10.5, 6.16.13, and 6.17
  • Confirmed failure occurs in amdgpu_discovery_check_binary_valid()
  • Verified actual VRAM discovery signature is 0xce854377
  • Confirmed sysmem fallback path returns -ENOENT
  • Dumped VRAM discovery region and confirmed payload appears PSP-encrypted
  • Observed that the only reference to device 0x7551 in the current source is in kicker_device_list in amdgpu_discovery.c

Potential fixes:

  1. Implement PSP/TMR decryption support for IP_DISCOVERY_V4
  2. Provide a compatible non-encrypted VBIOS
  3. Increase DISCOVERY_TMR_SIZE

Curious if anyone else with newer Navi48/R9700 boards is seeing this behavior.

reddit.com
u/valalalalala — 6 days ago
r/ROCm

Help: R9700 fails under Ubuntu 24

I came across claims that there is an issue with newer VBIOS versions, and I'm hoping someone has found a path forward, or that there are at least some fellow sufferers.

I was working with Claude trying to diagnose the problem, hence the "Root Cause" analysis below, but at this point I'm just not sure how to move forward.

Hardware

  • GPU: AMD Radeon AI PRO R9700 (XFX)
  • Device ID: 0x7551
  • ASIC: gfx1201 / Navi48
  • VBIOS: P02 dated 07/25/2025
  • Kernel: 6.17.0-23
  • Driver: amdgpu-dkms 6.16.13 (ROCm 7.2.3)
  • Ubuntu 24.04

Symptoms

amdgpu fails during IP discovery initialization:

  • no /dev/accel/accel0
  • no KFD
  • rocminfo shows CPU only

Diagnostic patch output:

amdgpu: ip discovery actual sig: 0xce854377 expected: 0x28211407
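
If you want to pull the two values out of a kernel log, a sketch like the following would work. Note that this message comes from my diagnostic patch, not from the stock driver, so the regex only matches that exact wording. `parse_sig_line` is a hypothetical helper name.

```python
import re

# Matches the line printed by the diagnostic patch quoted above.
LINE_RE = re.compile(
    r"ip discovery actual sig:\s*(0x[0-9a-fA-F]+)\s+expected:\s*(0x[0-9a-fA-F]+)"
)


def parse_sig_line(line: str):
    """Return (actual, expected) as integers, or None if the line does not match."""
    match = LINE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


log_line = "amdgpu: ip discovery actual sig: 0xce854377 expected: 0x28211407"
actual, expected = parse_sig_line(log_line)
print(f"actual={actual:#010x} expected={expected:#010x} match={actual == expected}")
```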

Root Cause

The July 2025 VBIOS writes a PSP-encrypted IP discovery binary into VRAM.

Current drivers expect the old plaintext signature: 0x28211407

But the new VBIOS emits: 0xce854377

The payload appears encrypted/ciphertext beyond the first few bytes:

00000000: 77 43 85 ce ...
00000010: 8b e3 d7 42 52 fb ...

amdgpu_discovery_check_binary_valid() rejects it immediately, so the GPU never fully initializes.

Interesting Findings

  • IP_DISCOVERY_V4 is defined in amdgpu_discovery.c
  • But there is no actual V4 implementation
  • This strongly suggests AMD planned support for this encrypted format but has not yet shipped the decryption path

Also:

  • the ACPI/sysmem fallback path returns -ENOENT
  • consumer Ryzen desktop boards generally do not expose the required ACPI TMR table
  • so desktop users have no fallback path

Additional Problem

The encrypted blob appears larger than:

DISCOVERY_TMR_SIZE = 10240

So even after decryption support lands, the buffer size may also need increasing.
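
As a rough illustration of the size concern, this sketch computes how many bytes of a blob would not fit in a buffer of the size quoted above. The 12288-byte length in the example is a made-up number, not a measurement from my card, and `tmr_overflow` is a hypothetical helper.

```python
DISCOVERY_TMR_SIZE = 10240  # value quoted in the post (10 KiB)


def tmr_overflow(blob_len: int, limit: int = DISCOVERY_TMR_SIZE) -> int:
    """Return how many bytes of a blob_len-byte blob exceed the buffer limit."""
    if blob_len < 0:
        raise ValueError("blob length cannot be negative")
    return max(0, blob_len - limit)


# Hypothetical blob length, for illustration only.
example_len = 12288
print(f"{example_len} bytes overflows the buffer by {tmr_overflow(example_len)} bytes")
```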

Likely Fixes

  1. Implement PSP/TMR decryption support for the new 0xce854377 discovery format
  2. Or provide a non-encrypted VBIOS compatible with existing drivers
  3. Possibly increase DISCOVERY_TMR_SIZE

Impact

This may affect all Navi48 / gfx1201 / R9700 cards shipping with July 2025+ VBIOS revisions on Linux.
