Published On: July 14ᵗʰ, 2021 08:10

System Management Configuration Guide, Cisco IOS XE Amsterdam 17.3.x (Catalyst 9500 Switches)


Information About Configuring Online Diagnostics

With online diagnostics, you can test and verify the hardware functionality of a device while the device is connected to a live network. Online diagnostics contains packet-switching tests that check different hardware components and verify the data path and control signals.

Online diagnostics detects problems in these areas:

  • Hardware components

  • Interfaces (Ethernet ports and so forth)

  • Solder joints

Online diagnostics are categorized as on-demand, scheduled, or health-monitoring diagnostics. On-demand diagnostics run from the CLI; scheduled diagnostics run at user-designated intervals or at specified times when the device is connected to a live network; and health-monitoring runs in the background with user-defined intervals. The health-monitoring test runs every 90, 100, or 150 seconds based on the test.

After you configure online diagnostics, you can manually start diagnostic tests or display the test results. You can also see which tests are configured for the device and the diagnostic tests that have already run.

Generic Online Diagnostics (GOLD) Tests


Note
  • Before you enable online diagnostics tests, enable console logging to see all the warning messages.

  • While tests are running, all the ports are shut down because a stress test is being performed with looping ports internally, and external traffic might affect the test results. Reboot the switch to bring it to normal operation. When you run the command to reload a switch, the system will ask you if the configuration should be saved. Do not save the configuration.

  • If you run tests on other modules, you must reset the module after the test has been initiated and has completed.


The following sections provide information about GOLD tests.

Cisco Catalyst 9500 Series Switches

DiagGoldPktTest

This GOLD packet loopback test verifies the MAC-level loopback functionality. In this test, a GOLD packet is sent, for which Unified Access Data Plane (UADP) ASIC provides support in the hardware. The packet loops back at the MAC-level and is matched against the stored packet.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Run this on-demand test as per requirement.

Default

Off.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

Hardware support

Supervisors and linecards.

DiagThermalTest

This test verifies the temperature reading from a device sensor.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive

Recommendation

Do not disable. Run this as an on-demand test, and as a health-monitoring test if the administrator is down.

Default

On.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

Hardware support

Supervisors and linecards.

DiagFanTest

This test verifies if all the fan modules that have been inserted are working properly on the board.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive

Recommendation

Run this as a health-monitoring test if you experience a problem with the fan module.

Default

On.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

Hardware support

Supervisors.

DiagPhyLoopbackTest

This PHY loopback test verifies the PHY-level loopback functionality. In this test, a packet is sent, loops back at the PHY level, and is matched against the stored packet. It cannot be run as a health-monitoring test.


Note

In certain cases, when this test is run on demand, ports are moved to the error-disabled state. In such cases, use the shutdown and no shutdown commands in interface configuration mode to reenable these ports.


Attribute

Description

Disruptive or Nondisruptive

Disruptive.

Recommendation

If the link to the external connector is down, run this on-demand test to check the health of the link.

Default

Off.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

Hardware support

Supervisors and linecards.

DiagScratchRegisterTest

This Scratch Register test monitors the health of ASICs by writing values into registers and reading back the values from these registers.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. Run this test if the task of writing values to the registers fails. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

Hardware support

Supervisors and linecards.
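
The write-then-read-back idea behind the Scratch Register test can be illustrated with a small simulation. This is only a sketch: FakeRegister is a hypothetical stand-in for an ASIC scratch register, the bit patterns are arbitrary choices, and none of it reflects how the switch software actually implements the test.

```python
# Illustrative simulation only. FakeRegister is a hypothetical stand-in
# for an ASIC scratch register; it is not a Cisco API.
class FakeRegister:
    def __init__(self, stuck_bits=0):
        self._value = 0
        self._stuck_bits = stuck_bits  # bits that always read back as 1

    def write(self, value):
        self._value = value & 0xFFFFFFFF

    def read(self):
        return self._value | self._stuck_bits


def scratch_register_ok(register,
                        patterns=(0x00000000, 0xFFFFFFFF, 0xAAAAAAAA, 0x55555555)):
    """Write each pattern and confirm that the same value reads back."""
    for pattern in patterns:
        register.write(pattern)
        if register.read() != pattern:
            return False
    return True


healthy = scratch_register_ok(FakeRegister())
faulty = scratch_register_ok(FakeRegister(stuck_bits=0x1))
print(healthy, faulty)
```

A register with a stuck bit fails the comparison on the first pattern that expects that bit to be clear, which is the kind of fault a read-back check is meant to expose.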

DiagPoETest

This test checks the PoE controller functionality. Do not perform this test during normal switch operation.

Attribute

Description

Disruptive or Nondisruptive

Disruptive.

Recommendation

Run this test if you experience PoE controller issues with a port. This can be run only as an on-demand test.

Default

Off.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

Hardware support

Linecards.

DiagStackCableTest

This test verifies the stack-ring loopback functionality in the stacking environment. It cannot be run as a health-monitoring test.

Attribute

Description

Disruptive or Nondisruptive

Disruptive.

Recommendation

Run this test to verify the stack-ring loopback functionality in the stacking environment.

Default

Off.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

If the test fails, check the stack cables and connectors.

Hardware support

Supervisors.

DiagMemoryTest

This exhaustive ASIC memory test is run during normal switch operation. The corresponding switch utilizes memory built-in self-test for this test. The memory test requires switch reboot after the test.

Attribute

Description

Disruptive or Nondisruptive

Very disruptive.

Recommendation

Run this on-demand test only if you experience memory-related problems in the system. Do not run this test if you do not want to reload the Supervisor engine that is under test.

Default

Off.

Initial release

Cisco IOS XE Everest 16.6.1.

Corrective action

Hardware support

Supervisors.

TestUnusedPortLoopback

This test periodically verifies the data path between the supervisor module and network ports of a module during runtime to determine if any incoming network interface ports are locked. In this test, a Layer 2 packet is flooded onto the VLAN associated with the test port and the inband port of the supervisor engine. The packet loops back into the test port and returns to the supervisor engine on the same VLAN. This test runs only on unused (admin down, that is, the ports are shut down) network ports irrespective of whether a cable is connected or not, and completes within a millisecond per port. This test compensates for the lack of a nondisruptive loopback test in the current ASICs, and runs every 60 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This test is automatically disabled during CPU usage spikes to maintain accuracy.

Default

On.

Initial release

Cisco IOS XE Fuji 16.9.1.

Corrective action

Displays a syslog message indicating that a port has failed. In modules other than supervisor engines, if all port groups fail (for example, at least one port per port ASIC fails more than the failure threshold for all the port ASICs), the default action is to reset the module and power down the module after two resets.

Hardware support

Supervisors and linecards.
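
The "all port groups fail" condition in the corrective action above can be read as a simple predicate. The following is a minimal sketch of that reading only. The function name, the data layout, the interface names, and the threshold value are all hypothetical, and the switch's internal bookkeeping may differ.

```python
def all_port_groups_failed(failures_by_asic, threshold):
    """Return True when every port ASIC has at least one port whose
    failure count exceeds the threshold.

    failures_by_asic maps a port ASIC label to {port name: failure count}.
    The layout is a hypothetical illustration, not a Cisco data structure.
    """
    if not failures_by_asic:
        return False
    return all(
        any(count > threshold for count in ports.values())
        for ports in failures_by_asic.values()
    )


# Only asic0 has a port over the threshold, so the condition is not met.
partial = {"asic0": {"Te1/0/1": 12, "Te1/0/2": 0}, "asic1": {"Te1/0/3": 3}}
# Every ASIC has at least one port over the threshold.
total = {"asic0": {"Te1/0/1": 12}, "asic1": {"Te1/0/3": 11}}

print(all_port_groups_failed(partial, threshold=10))
print(all_port_groups_failed(total, threshold=10))
```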

TestPortTxMonitoring

This test periodically monitors data-path traffic in the transmitted direction of each network port that is physically connected to a device with status as UP. This test is completed within a millisecond per port. It monitors the transmit counters at the ASIC level to verify that the ports are not stuck. It also displays syslog messages, and users can take corrective actions using the Cisco IOS Embedded Event Manager (EEM).

Configure the time interval and threshold by entering the diagnostic monitor interval and diagnostic monitor threshold commands, respectively. The test leverages the Cisco Discovery Protocol that transmits packets. The test runs every 75 seconds, and the failure threshold is set to 5 by default.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable.

Default

On.

Initial release

Cisco IOS XE Fuji 16.9.1.

Corrective action

Displays a syslog message indicating that a port has failed.

Hardware support

All modules, including supervisor engines.

Cisco Catalyst 9500 Series High Performance Switches

TestGoldPktLoopback

This GOLD packet loopback test verifies the MAC-level loopback functionality. In this test, a GOLD packet is sent, for which Unified Access Data Plane (UADP) ASIC provides support in the hardware. The packet loops back at the MAC-level and is matched against the stored packet.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Run this on-demand test as per requirement.

Default

Off.

Initial release

Cisco IOS XE Fuji 16.8.1.

Corrective action

Displays a syslog message if the test fails for a port.

Hardware support

All modules.

TestOBFL

This test verifies the on-board failure-logging capabilities. During this test, a diagnostic message is logged to the Onboard Failure Logging (OBFL).

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Run this on-demand test as per requirement.

Default

Off.

Initial release

Cisco IOS XE Gibraltar 16.10.1.

Corrective action

Displays a syslog message if the test fails for a port.

Hardware support

All modules.

TestFantray

This test verifies if the fan tray has been inserted, and is working properly on the board. This test runs every 100 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive

Recommendation

Do not disable. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Fuji 16.8.1.

Corrective action

Displays a syslog message if the fan tray is not present or any of the fans fail.

Hardware support

All modules.

TestPhyLoopback

This PHY loopback test verifies the PHY-level loopback functionality. In this test, a packet is sent, loops back at the PHY level, and is matched against the stored packet. It cannot be run as a health-monitoring test.

Attribute

Description

Disruptive or Nondisruptive

Disruptive.

Recommendation

Run this as an on-demand test as per requirement.

Default

Off.

Initial release

Cisco IOS XE Fuji 16.8.1.

Corrective action

Displays a syslog message if the test fails for a port.

Hardware support

All modules.

TestThermal

This test verifies that the temperature reading from a device sensor is below the yellow temperature threshold. This test runs every 90 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive

Recommendation

Do not disable. Run this as an on-demand test and a health-monitoring test.

Default

On.

Initial release

Cisco IOS XE Fuji 16.8.1.

Corrective action

Displays a syslog message if the test fails.

Hardware support

All modules.

TestScratchRegister

This Scratch Register test monitors the health of ASICs by writing values into registers and reading back the values from these registers. This test runs every 90 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Fuji 16.8.1.

Corrective action

Displays a syslog message if the test fails.

Hardware support

All modules.

TestConsistencyCheck

This test checks if the hardware programming is correct. This test checks with the forwarding object manager to identify incomplete entries or long-pending configurations to the hardware. This test runs every 90 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Amsterdam 17.2.1.

Corrective action

Displays a syslog message if the test fails.

Hardware support

All modules.

TestPortTxMonitoring

This test monitors the transmit counters of a connected interface. It verifies if a connected port is able to send packets or not. This test runs every 150 seconds.

Attribute

Description

Disruptive or Nondisruptive

Nondisruptive.

Recommendation

Do not disable. This can be run as a health-monitoring test and also as an on-demand test.

Default

On.

Initial release

Cisco IOS XE Gibraltar 16.10.1.

Corrective action

Displays a syslog message if the test fails for a port.

Hardware support

All modules.

How to Configure Online Diagnostics

The following sections provide information about the various procedures that comprise the online diagnostics configuration.

Starting Online Diagnostic Tests

After you configure diagnostic tests to run on a device, use the diagnostic start privileged EXEC command to begin diagnostic testing.

After starting the tests, you cannot stop the testing process midway.

Use the diagnostic start switch privileged EXEC command to manually start online diagnostic testing:

Procedure

Command or Action Purpose

diagnostic start switch number test {name | test-id | test-id-range | all | basic | complete | minimal | non-disruptive | per-port}

Example:



Device# diagnostic start switch 2 test basic

Starts the diagnostic tests.

You can specify the tests by using one of these options:

  • name : Enters the name of the test.

  • test-id : Enters the ID number of the test.

  • test-id-range : Enters the range of test IDs by using integers separated by a comma and a hyphen.

  • all : Starts all of the tests.

  • basic : Starts the basic test suite.

  • complete : Starts the complete test suite.

  • minimal : Starts the minimal bootup test suite.

  • non-disruptive : Starts the nondisruptive test suite.

  • per-port : Starts the per-port test suite.
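
For scripted use, the options above can be assembled into a command string before it is sent to the device. The sketch below is a hypothetical helper, not part of any Cisco tool. It only formats text according to the syntax shown in this procedure and does not contact a switch.

```python
import re

# Suite keywords taken from the option list in this procedure.
SUITE_KEYWORDS = {"all", "basic", "complete", "minimal", "non-disruptive", "per-port"}

# One or more integers or hyphenated ranges separated by commas, for example 1,2,4-6.
_ID_RANGE = re.compile(r"\d+(-\d+)?(,\d+(-\d+)?)*")


def classify_selector(test):
    """Tell which kind of test selector a string is (hypothetical helper)."""
    if test in SUITE_KEYWORDS:
        return "suite"
    if test.isdigit():
        return "test-id"
    if _ID_RANGE.fullmatch(test):
        return "test-id-range"
    return "name"


def diagnostic_start_command(switch_number, test):
    """Format a 'diagnostic start' command for one switch."""
    test = str(test).strip()
    if not test or " " in test:
        raise ValueError("test must be a single non-empty token")
    return f"diagnostic start switch {int(switch_number)} test {test}"


print(diagnostic_start_command(2, "basic"))
print(classify_selector("1,2,4-6"))
```

Because a started test cannot be stopped midway, a script that wraps this helper would normally ask for confirmation before sending a disruptive selector.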

Configuring Online Diagnostics

You must configure the failure threshold and the interval between tests before enabling diagnostic monitoring.

Scheduling Online Diagnostics

You can schedule online diagnostics to run at a designated time of day, or on a daily, weekly, or monthly basis for a device. Use the no form of the diagnostic schedule switch command to remove the scheduling.

Procedure

  Command or Action Purpose
Step 1

configure terminal

Example:

Device# configure terminal

Enters global configuration mode.

Step 2

diagnostic schedule switch number test {name | test-id | test-id-range | all | basic | complete | minimal | non-disruptive | per-port} {daily hh:mm | on mm dd yyyy hh:mm | port inter-port-number port-number-list | weekly day-of-week hh:mm}

Example:


Device(config)# diagnostic schedule switch 3 test 1-5 on July 3 2013 23:10


Schedules on-demand diagnostic tests for a specific day and time.

When specifying the test to be scheduled, use these options:

  • name : Name of the test that appears in the show diagnostic content command output.

  • test-id : ID number of the test that appears in the show diagnostic content command output.

  • test-id-range : ID numbers of the tests that appear in the show diagnostic content command output.

  • all : All test IDs.

  • basic : Starts the basic on-demand diagnostic tests.

  • complete : Starts the complete test suite.

  • minimal : Starts the minimal bootup test suite.

  • non-disruptive : Starts the nondisruptive test suite.

  • per-port : Starts the per-port test suite.

You can schedule the tests as follows:

  • Daily: Use the daily hh:mm parameter.

  • Specific day and time: Use the on mm dd yyyy hh:mm parameter.

  • Weekly: Use the weekly day-of-week hh:mm parameter.
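
The three scheduling forms can be generated from one helper. This is a sketch under stated assumptions: the function name and its keyword arguments are hypothetical, the hh:mm check assumes a 24-hour time of day, and the month is passed as the caller writes it (the example in Step 2 uses a month name).

```python
import re

# Assumes hh:mm is a 24-hour time of day (00:00 to 23:59).
_TIME = re.compile(r"([01]\d|2[0-3]):[0-5]\d")


def _check_time(hhmm):
    if not _TIME.fullmatch(hhmm):
        raise ValueError(f"expected hh:mm, got {hhmm!r}")
    return hhmm


def diagnostic_schedule_command(switch_number, test, *, daily=None, on=None, weekly=None):
    """Format a 'diagnostic schedule' command (hypothetical helper).

    Pass exactly one of:
      daily="hh:mm"
      on=(month, day, year, "hh:mm")
      weekly=(day_of_week, "hh:mm")
    """
    chosen = [v for v in (daily, on, weekly) if v is not None]
    if len(chosen) != 1:
        raise ValueError("specify exactly one of daily, on, weekly")
    prefix = f"diagnostic schedule switch {switch_number} test {test}"
    if daily is not None:
        return f"{prefix} daily {_check_time(daily)}"
    if on is not None:
        month, day, year, hhmm = on
        return f"{prefix} on {month} {day} {year} {_check_time(hhmm)}"
    day_of_week, hhmm = weekly
    return f"{prefix} weekly {day_of_week} {_check_time(hhmm)}"


print(diagnostic_schedule_command(3, "1-5", on=("July", 3, 2013, "23:10")))
print(diagnostic_schedule_command(1, "1,2,4-6", weekly=("saturday", "10:30")))
```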

Configuring Health-Monitoring Diagnostics

You can configure health-monitoring diagnostic testing on a device while it is connected to a live network. You can configure the execution interval for each health-monitoring test, enable the device to generate a syslog message because of a test failure, and enable a specific test.

Use the no form of this command to disable testing.

By default, health monitoring is enabled only for a few tests, and the device generates a syslog message when a test fails.

Follow these steps to configure and enable the health-monitoring diagnostic tests:

Procedure

  Command or Action Purpose
Step 1

enable

Example:


Device> enable


Enables privileged EXEC mode.

Enter your password, if prompted.

Step 2

configure terminal

Example:


Device# configure terminal


Enters global configuration mode.

Step 3

diagnostic monitor interval switch number test {name | test-id | test-id-range | all} hh:mm:ss milliseconds day

Example:


Device(config)# diagnostic monitor interval switch 2 test 1 12:30:00 750 5

Configures the health-monitoring interval of the specified test.

When specifying a test, use one of these parameters:

  • name : Name of the test that appears in the show diagnostic content command output.

  • test-id : ID number of the test that appears in the show diagnostic content command output.

  • test-id-range : ID numbers of the tests that appear in the show diagnostic content command output.

  • all : All the diagnostic tests.

When specifying the interval, set these parameters:

  • hh:mm:ss : Monitoring interval, in hours, minutes, and seconds. The range for hh is 0 to 24, and the range for mm and ss is 0 to 60.

  • milliseconds : Monitoring interval, in milliseconds (ms). The range is from 0 to 999.

  • day : Monitoring interval, in number of days. The range is from 0 to 20.

Step 4

diagnostic monitor syslog

Example:


Device(config)# diagnostic monitor syslog

(Optional) Configures the switch to generate a syslog message when a health-monitoring test fails.

Step 5

diagnostic monitor threshold switch number test {name | test-id | test-id-range | all} failure count count

Example:


Device(config)# diagnostic monitor threshold switch 2 test 1 failure count 20

(Optional) Sets the failure threshold for the health-monitoring test.

When specifying the tests, use one of these parameters:

  • name : Name of the test that appears in the show diagnostic content command output.

  • test-id : ID number of the test that appears in the show diagnostic content command output.

  • test-id-range : ID numbers of the tests that appear in the show diagnostic content command output.

  • all : All the diagnostic tests.

The range for the failure threshold count is 0 to 99.

Step 6

diagnostic monitor switch number test {name | test-id | test-id-range | all}

Example:


Device(config)# diagnostic monitor switch 2 test 1

Enables the specified health-monitoring tests.

The switch number keyword is supported only on stacking switches.

When specifying the tests, use one of these parameters:

  • name : Name of the test that appears in the show diagnostic content command output.

  • test-id : ID number of the test that appears in the show diagnostic content command output.

  • test-id-range : ID numbers of the tests that appear in the show diagnostic content command output.

  • all : All the diagnostic tests.

Step 7

end

Example:


Device(config)# end


Returns to privileged EXEC mode.

Step 8

show diagnostic { content | post | result | schedule | status | switch }

(Optional) Displays the online diagnostic test results and the supported test suites.

Step 9

show running-config

Example:


Device# show running-config 


(Optional) Verifies your entries.

Step 10

copy running-config startup-config

Example:


Device# copy running-config startup-config 


(Optional) Saves your entries in the configuration file.
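
Steps 3 through 6 can be generated together so that the interval and threshold are always configured before the test is enabled. The sketch below is a hypothetical helper that only formats strings. The value ranges it checks are the ones stated in this procedure, and everything else about it (name, argument order) is an assumption made for illustration.

```python
def _in_range(name, value, low, high):
    if not low <= value <= high:
        raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return value


def health_monitor_commands(switch_number, test, hours, minutes, seconds,
                            milliseconds, days, failure_count):
    """Return the global configuration commands for Steps 3 to 6, in order."""
    # Ranges as documented in Step 3 and Step 5 of this procedure.
    _in_range("hours", hours, 0, 24)
    _in_range("minutes", minutes, 0, 60)
    _in_range("seconds", seconds, 0, 60)
    _in_range("milliseconds", milliseconds, 0, 999)
    _in_range("days", days, 0, 20)
    _in_range("failure_count", failure_count, 0, 99)

    interval = f"{hours:02d}:{minutes:02d}:{seconds:02d} {milliseconds} {days}"
    target = f"switch {switch_number} test {test}"
    return [
        f"diagnostic monitor interval {target} {interval}",
        "diagnostic monitor syslog",
        f"diagnostic monitor threshold {target} failure count {failure_count}",
        f"diagnostic monitor {target}",
    ]


for line in health_monitor_commands(2, 1, 12, 30, 0, 750, 5, 20):
    print(line)
```

With these arguments, the helper reproduces the example commands shown in Steps 3 through 6.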

Monitoring and Maintaining Online Diagnostics

You can display the online diagnostic tests that are configured for a device or a device stack and check the test results by using the privileged EXEC show commands in this table:

Table 1. Commands for Diagnostic Test Configuration and Results

Command

Purpose

show diagnostic content switch [number | all]

The following command applies to the C9500-12Q, C9500-16X, C9500-24Q, and C9500-40X models of the Cisco Catalyst 9500 Series Switches:

show diagnostic content

Displays the online diagnostics configured for a switch.

show diagnostic status

Displays the diagnostic tests that are currently running.

show diagnostic result switch [number | all] [detail | test {name | test-id | test-id-range | all} [detail]]

Displays the online diagnostics test results.

show diagnostic switch [number | all] [detail]

The following command applies to the C9500-12Q, C9500-16X, C9500-24Q, and C9500-40X models of the Cisco Catalyst 9500 Series Switches:

show diagnostic [detail]

Displays the online diagnostics test results.

show diagnostic schedule [number | all]

Displays the online diagnostics test schedule.

show diagnostic post

The following command applies to the C9500-12Q, C9500-16X, C9500-24Q, and C9500-40X models of the Cisco Catalyst 9500 Series Switches:

show post

Displays the POST results. (The output is the same as the show post command output.)

show diagnostic events {event-type | module}

Displays diagnostic events such as error, information, or warning based on the test result.

show diagnostic description module [number] test { name | test-id | all }

Displays the short description of the results from an individual test or all the tests.

Configuration Examples for Online Diagnostics

The following sections provide examples of online diagnostics configurations.

Examples: Start Diagnostic Tests

This example shows how to start a diagnostic test by using the test name:


Device# diagnostic start switch 2 test DiagFanTest



This example shows how to start all of the diagnostic tests:


Device# diagnostic start switch 1 test all


Example: Configure a Health-Monitoring Test

This example shows how to configure a health-monitoring test:


Device(config)# diagnostic monitor threshold switch 1 test 1 failure count 50
Device(config)# diagnostic monitor interval switch 1 test TestPortAsicStackPortLoopback


Example: Schedule Diagnostic Test

This example shows how to schedule diagnostic testing for a specific day and time on a specific switch:

Device(config)# diagnostic schedule test DiagThermalTest on June 3 2013 22:25

This example shows how to schedule diagnostic testing to occur weekly at a certain time on a specific switch:

Device(config)# diagnostic schedule switch 1 test 1,2,4-6 weekly saturday 10:30
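
A test-id-range such as 1,2,4-6 combines single IDs and hyphenated ranges. The following sketch shows one way to expand such a string into the individual test IDs it covers, for example to cross-check it against the show diagnostic content output. The function name is hypothetical.

```python
def expand_test_ids(spec):
    """Expand a test-id-range string such as '1,2,4-6' into a list of IDs."""
    ids = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            ids.extend(range(start, end + 1))
        else:
            ids.append(int(part))
    return ids


print(expand_test_ids("1,2,4-6"))  # [1, 2, 4, 5, 6]
```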

Example: Displaying Online Diagnostics

This example shows how to display on-demand diagnostic settings:

Device# show diagnostic ondemand settings

Test iterations = 1
Action on test failure = continue


This example shows how to display diagnostic events for errors:


Device# show diagnostic events event-type error

Diagnostic events (storage for 500 events, 0 events recorded)
Number of events matching above criteria = 0

No diagnostic log entry exists.
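
If the output above is collected by a script, the summary lines can be reduced to three counters. This is a sketch that assumes the output keeps exactly the wording shown in this example. The function name and the dictionary keys are hypothetical.

```python
import re

# Sample text copied from the example output above.
SAMPLE = """Diagnostic events (storage for 500 events, 0 events recorded)
Number of events matching above criteria = 0

No diagnostic log entry exists.
"""


def parse_event_summary(text):
    """Extract event counters from 'show diagnostic events' summary lines."""
    storage = re.search(r"storage for (\d+) events, (\d+) events recorded", text)
    matching = re.search(r"Number of events matching above criteria = (\d+)", text)
    if storage is None or matching is None:
        raise ValueError("unrecognized output format")
    return {
        "capacity": int(storage.group(1)),
        "recorded": int(storage.group(2)),
        "matching": int(matching.group(1)),
    }


summary = parse_event_summary(SAMPLE)
print(summary)
```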


This example shows how to display the description for a diagnostic test:


Device# show diagnostic description switch 1 test all

DiagGoldPktTest : 
        The GOLD packet Loopback test verifies the MAC level loopback
        functionality. In this test, a GOLD packet, for which doppler
        provides the support in hardware, is sent. The packet loops back
        at MAC level and is matched against the stored packet. It is a non
        -disruptive test.

DiagThermalTest : 
        This test verifies the temperature reading from the sensor is below the yellow
        temperature threshold. It is a non-disruptive test and can be run as a health monitoring test.

DiagFanTest : 
        This test verifies all fan modules have been inserted and working properly on the board
        It is a non-disruptive test and can be run as a health monitoring test.

DiagPhyLoopbackTest : 
        The PHY Loopback test verifies the PHY level loopback
        functionality. In this test, a packet is sent which loops back
        at PHY level and is matched against the stored packet. It is a 
        disruptive test and cannot be run as a health monitoring test.

DiagScratchRegisterTest : 
        The Scratch Register test monitors the health of application-specific
        integrated circuits (ASICs) by writing values into registers and reading
        back the values from these registers. It is a non-disruptive test and can
        be run as a health monitoring test.

DiagPoETest : 
        This test checks the PoE controller functionality. This is a disruptive test
        and should not be performed during normal switch operation.


DiagMemoryTest : 
        This test runs the exhaustive ASIC memory test during normal switch operation
        NG3K utilizes mbist for this test. Memory test is very disruptive
        in nature and requires switch reboot after the test.

Device#

 
The following example is not applicable to the C9500-12Q, C9500-16X, C9500-24Q, and C9500-40X models of the Cisco Catalyst 9500 Series Switches. This example shows how to display the bootup level:

Device# show diagnostic bootup level
 
Current bootup diagnostic level: minimal

Device#


Additional References for Online Diagnostics

Related Documents

Related Topic Document Title

Complete syntax and usage information for the commands used in this chapter

Command Reference (Catalyst 9500 Series Switches)

Feature Information for Configuring Online Diagnostics

This table provides release and related information for features explained in this module.

These features are available on all releases subsequent to the one they were introduced in, unless noted otherwise.

Release

Feature

Feature Information

Cisco IOS XE Everest 16.5.1a

Online Diagnostics

With online diagnostics, you can test and verify the hardware functionality of the device while the device is connected to a live network.

Support for this feature was introduced only on the C9500-12Q, C9500-16X, C9500-24Q, C9500-40X models of the Cisco Catalyst 9500 Series Switches.

Cisco IOS XE Fuji 16.8.1a

Online Diagnostics

Support for this feature was introduced only on the C9500-32C, C9500-32QC, C9500-48Y4C, and C9500-24Y4C models of the Cisco Catalyst 9500 Series Switches.

Use Cisco Feature Navigator to find information about platform and software image support. To access Cisco Feature Navigator, go to http://www.cisco.com/go/cfn.