Locate failed PCI device and disable InfiniBand interface port on Linux

Posted on Wed 03 September 2025 by Pavlo Khmel

Server "server-a" PowerEdge XE9680 is hanging on boot with the message:

"UEFI0031: PCIe downtrain is detected on PCIe Device Slot 36, (Bus:0x5B Dev:0x02 F:0x00). Expected link width: x16 and actual link width : x4"

It is possible to press F1 to continue booting.
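
For reference, the downtrained link can also be confirmed from a running system with lspci. A quick check, assuming the Bus/Dev/Func values from the UEFI message translate to PCI address 5b:02.0 (LnkCap reports the maximum link width, LnkSta the currently negotiated one):

# lspci -vv -s 5b:02.0 | grep -E 'LnkCap:|LnkSta:'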

The issue itself will be resolved with Dell support, but in the meantime it causes a problem with GPUDirect RDMA. The server has 8 GPUs and 10 InfiniBand cards.

nccl-test benchmark build guide: https://pavlokhmel.com/enable-gpudirect-rdma-and-benchmark-with-perftest-nccl-test-nvidia-hpcg-pytorch-resnet50-osu.html

nccl-test is slow and gives inconsistent results from run to run (GPUDirect enabled):

$ mpirun -n 16 --host server-a:8,server-b:8 ./nccl-tests-2.16.7/build/all_reduce_perf -b 8 -e 8G -f 2 -g 1
# Avg bus bandwidth    : 4.22649 
# Avg bus bandwidth    : 8.40644
# Avg bus bandwidth    : 29.2344

Find Bus Address for "Slot 36":

# dmidecode -t slot | grep -B 2 -A 12 "Slot 36"
Handle 0x0904, DMI type 9, 24 bytes
System Slot Information
    Designation: PCIe Slot 36
    Type: PCI Express 5
    Data Bus Width: x16
    Current Usage: In Use
    Length: Short
    ID: 36
    Characteristics:
        3.3 V is provided
        PME signal is supported
    Bus Address: 0000:5e:00.0
    Data Bus Width (Base): 0
    Peer Devices: 0
    Height: Not applicable
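
If needed, the Bus Address can be extracted in one step. A small awk sketch, assuming the "Designation" and "Bus Address" labels match the dmidecode output above:

# dmidecode -t slot | awk '/Designation: PCIe Slot 36/ {found=1} found && /Bus Address:/ {print $3; exit}'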

Find the device with bus address 0000:5e:00.0:

# lspci -D | grep '0000:5e:00.0'
0000:5e:00.0 Infiniband controller: Mellanox Technologies MT28908 Family [ConnectX-6]

Find InfiniBand card Linux name:

# grep '0000:5e:00.0' /sys/class/net/*/device/uevent
/sys/class/net/ibp94s0/device/uevent:PCI_SLOT_NAME=0000:5e:00.0
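
The same mapping is also visible directly in sysfs, since the kernel exposes a network device's interface name under its PCI device node; this should list ibp94s0 here:

# ls /sys/bus/pci/devices/0000:5e:00.0/net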

Find the InfiniBand GUID (the equivalent of a MAC address) of ibp94s0:

# ip link show ibp94s0
10: ibp94s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 2044 qdisc mq state UP mode DEFAULT group default qlen 1000
    link/infiniband 00:00:10:27:fe:80:00:00:00:00:00:00:a0:88:c2:03:00:ae:21:58 brd 00:ff:ff:ff:ff:12:40:1b:ff:ff:00:00:00:00:00:00:ff:ff:ff:ff
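
The 20-byte link-layer address decodes as a 4-byte QPN field followed by the 16-byte port GID, which is the 8-byte subnet prefix (fe80::) plus the 8-byte port GUID. The last 8 bytes, a0:88:c2:03:00:ae:21:58, are therefore the GUID to search for. A small sketch that extracts it from sysfs, assuming the usual colon-separated address format:

# awk -F: '{guid="0x"; for (i=NF-7; i<=NF; i++) guid=guid $i; print guid}' /sys/class/net/ibp94s0/address
0xa088c20300ae2158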

Find the InfiniBand device name with the same GUID:

# ibstatus | grep -B 1 -A 6 2158
Infiniband device 'mlx5_4' port 1 status:
    default gid:     fe80:0000:0000:0000:a088:c203:00ae:2158
    base lid:    0x152
    sm lid:      0x3
    state:       4: ACTIVE
    phys state:  5: LinkUp
    rate:        100 Gb/sec (2X HDR)
    link_layer:  InfiniBand
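
If the NVIDIA/Mellanox OFED stack is installed, the ibdev2netdev helper script shortcuts this GUID matching by printing the RDMA-device-to-netdev mapping directly (this assumes MLNX_OFED or DOCA-OFED, which ship the script):

# ibdev2netdev | grep ibp94s0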

Find the mlx5_4 base LID and port number:

# ibstat mlx5_4
CA 'mlx5_4'
    CA type: MT4123
    Number of ports: 1
    Firmware version: 20.43.1014
    Hardware version: 0
    Node GUID: 0xa088c20300ae2158
    System image GUID: 0xa088c20300ae2158
    Port 1:
        State: Active
        Physical state: LinkUp
        Rate: 100
        Base lid: 338
        LMC: 0
        SM lid: 3
        Capability mask: 0xa651e848
        Port GUID: 0xa088c20300ae2158
        Link layer: InfiniBand
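
Note that ibstatus printed the base LID in hex while ibstat prints it in decimal; a quick sanity check confirms 0x152 and 338 are the same port:

# printf '%d\n' 0x152
338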

Disable mlx5_4 by specifying its base LID (338) and port number (1):

# ibportstate 338 1 disable
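
Before re-running the benchmark, it is worth confirming the port actually went down; after the disable, State should read Down and Physical state Disabled instead of Active/LinkUp:

# ibstat mlx5_4 | grep -i state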

Run the nccl-test benchmark again:

$ mpirun -n 16 --host server-a:8,server-b:8 ./nccl-tests-2.16.7/build/all_reduce_perf -b 8 -e 8G -f 2 -g 1
. . .
# Avg bus bandwidth    : 62.4008 

Performance improved and is now consistent.

Use this command to re-enable mlx5_4:

# mlxfwreset -d mlx5_4 -l 3 reset
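
Once the port is disabled it is no longer reachable on the fabric, so ibportstate cannot simply enable it again by LID; resetting the adapter brings it back instead. mlxfwreset is part of the NVIDIA MFT tools, and reset level 3 corresponds to a driver restart plus PCI reset in current MFT versions (the levels supported by the firmware can be listed with a query). Afterwards, confirm the port is back to Active/LinkUp:

# mlxfwreset -d mlx5_4 query
# ibstat mlx5_4 | grep -i state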