Information
Article ID: 55
Created On: 10/21/2009
Modified: 10/21/2009
Handling PCI Express Errors

Problem

Data errors may be introduced when data is passed across the PCI Express bus.

Solution

PCI Express detects data errors and, in many situations, can correct them. These errors can also be detected by running fio-pci-check.

If such errors have been detected on your system, the following steps are recommended:

1. Reseat all risers and the ioDrives in the system.

2. Transfer a significant amount of data to and from the ioDrive in question, then check for errors again.

3. If the errors continue, the most common culprit is a riser card, so try swapping that out.

4. If errors persist, try swapping out the motherboard, and finally the ioDrive.
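
The "check, load, check again" part of the steps above can be scripted. The following is a minimal sketch, not a supported tool. It assumes fio-pci-check is on the PATH and can be run without arguments, and it treats the output as opaque text because no report format is given here. The names run_check and new_lines are hypothetical helpers introduced for illustration.

```python
import subprocess
from collections import Counter


def run_check(command=("fio-pci-check",)):
    """Run the check command and return its combined text output.

    Assumption: the utility is on PATH and can be run without arguments.
    """
    result = subprocess.run(
        list(command),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,  # a non-zero exit status is kept as data, not raised
    )
    return result.stdout


def new_lines(baseline, current):
    """Return lines that occur more often in `current` than in `baseline`.

    The reports are compared as plain text; no report format is assumed.
    """
    remaining = Counter(
        line.strip() for line in baseline.splitlines() if line.strip()
    )
    added = []
    for line in current.splitlines():
        line = line.strip()
        if not line:
            continue
        if remaining[line] > 0:
            remaining[line] -= 1  # already present in the baseline report
        else:
            added.append(line)
    return added
```

One way to use it: call run_check() once after reseating the hardware, run the data transfer, call run_check() again, and pass both outputs to new_lines(). A line that is identical in both reports is not flagged, so the full second report should still be read.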

The ioDrive tends to expose PCI Express errors sooner than other devices do. Because of its high performance, it places heavy demands on the PCI Express bus and can reveal errors in the system that other devices might not uncover.

If multiple cards are installed in the system, the fio-status and fio-beacon utilities can be used to determine which of the drives is failing. fio-status returns the serial number for the device, and fio-beacon turns on the beacon LED for identification.

Caution: Some PCI Express chips do not report PCI Express errors properly and may report errors when none exist. In most cases this has been found to happen on a bridge chip. Such false reports typically appear under the following conditions:

  • fio-pci-check has been run several times in rapid succession.
  • No data has been passing over the bus that is reporting errors.
  • All drivers for attached peripherals are unloaded.
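
The conditions above can be written down as a small checklist. The sketch below is illustrative only: the function name and its parameters are hypothetical, and treating a report as suspect only when all three conditions hold is an assumption rather than a rule stated in this article.

```python
def possibly_spurious(rapid_repeated_checks: bool,
                      bus_idle: bool,
                      drivers_unloaded: bool) -> bool:
    """Return True when every listed condition holds (assumed rule).

    rapid_repeated_checks: fio-pci-check was run several times in quick succession
    bus_idle:              no data was passing over the bus that reports errors
    drivers_unloaded:      all drivers for attached peripherals were unloaded
    """
    return rapid_repeated_checks and bus_idle and drivers_unloaded
```

A True result only suggests that the reporting chip deserves a closer look; it does not rule out a real fault.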

Below is an example of PCI Express errors captured on a system with an ioDrive.

Figure 14 – PCI Express errors