u/DumBDiego

Antminer S9 - Chain 0 NaN chips and ASIC RT errors - Boot loop

Hi, I need help with an Antminer S9 that is currently stuck in a boot loop.

The symptoms:

Hashrate starts at 800 GH/s to 1 TH/s, then drops to about 100 GH/s before the miner restarts.

The Kernel Log shows ASIC RT errors on chips 29 and 57.

Chain 0 reports NaN status on chips 34, 50, and 62.

The log shows "read failed" for the temperature sensors on chip 62.

The firmware attempts a "Special fix" for the middle temperature, but that doesn't break the loop.

What I've tried:

Re-ordered the hashboards, but the errors persist on the same chips.

Checked the watchdog: it reports 0 hashrate across all 3 chains when the loop happens.

I'm about to start testing the hashboards individually to isolate the faulty one.

Question:

What could be causing these specific chips (34, 50, 62) to show NaN status? Is it a failed BM1387 chip "poisoning" the bus, or should I be looking at the LDOs/voltage domains?
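To make the voltage-domain part of the question concrete, here is a rough sketch that maps the chip numbers from the log onto voltage domains. It assumes the usual S9 hashboard layout of 63 BM1387 chips in 21 series domains of 3 chips each, and it assumes the log counts chips from 0. Both are assumptions, and the domain numbers are just indices, not the silkscreen labels on the board. The function names are made up for illustration.

```python
# Assumption: S9 hashboard with 63 BM1387 chips.
CHIPS_PER_BOARD = 63
# Assumption: 21 series voltage domains, 3 chips per domain.
CHIPS_PER_DOMAIN = 3


def voltage_domain(chip_index: int, zero_based: bool = True) -> int:
    """Return the 0-based voltage domain index for a chip number from the log."""
    # Normalise to a 0-based position along the chain.
    pos = chip_index if zero_based else chip_index - 1
    if not 0 <= pos < CHIPS_PER_BOARD:
        raise ValueError(f"chip index {chip_index} is outside the board")
    return pos // CHIPS_PER_DOMAIN


def group_by_domain(chips, zero_based: bool = True) -> dict:
    """Group chip numbers by the voltage domain they fall into."""
    groups = {}
    for chip in chips:
        groups.setdefault(voltage_domain(chip, zero_based), []).append(chip)
    return groups


nan_chips = [34, 50, 62]   # NaN status on chain 0
rt_chips = [29, 57]        # ASIC RT errors

print(group_by_domain(nan_chips + rt_chips))
```

If two or more of the flagged chips landed in the same group, that domain's LDO and supply would be the first thing to measure. Under the mapping assumed here, none of the five chips share a domain.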

u/DumBDiego — 2 days ago