This document describes how to troubleshoot Power Supply Unit (PSU) failures on Cisco NCS XR platforms.
Cisco recommends that you have knowledge of these topics:
Note: Cisco recommends that you have access to the Cisco IOS XR CLI and the admin CLI.
The information in this document is based on these software and hardware versions (this includes, but is not limited to, these series):
The information in this document was created from the devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If your network is live, ensure that you understand the potential impact of any command.
The Cisco NCS XR router series includes several platforms designed for different use cases and performance levels, each with distinct power supply architectures:
Cisco NCS 540 Series: This is a small-density XR router aimed at sub-100G bandwidth applications such as 5G NR backhaul, FTTx, and enterprise branch deployments. Some models in this series utilize fixed power supplies with 1+1 AC/DC redundancy, meaning the power supply units are integrated into the chassis and are not field-replaceable. Other NCS 540 models can feature modular power supplies.
Cisco NCS 560 Series: This modular system includes modular power supplies with AC and DC options, supporting load-sharing and protection schemes. These power supplies are typically field-serviceable and hot-swappable, allowing for replacement without system shutdown and ensuring high availability.
Cisco NCS 5500 Series: This highly fault-resilient modular router platform is designed for data center and high-performance networking environments. It features modular, field-replaceable PSUs that support serviceability and redundancy. The platform supports Cisco IOS XR software with modular packages and resiliency features.
Cisco NCS 5700 Series: Building on the NCS 5500 platform, this series includes an enhanced forwarding ASIC design and runs Cisco IOS XR7 OS. The system is modular with field-replaceable PSUs and supports high availability and fault resilience. PSUs are designed for redundancy and hot-swapping. Cisco IOS XR7 OS provides advanced system monitoring and fault management features.
The PSU, or the Power Tray (PT) that houses the Power Modules (PMs), is a critical hardware component in Cisco NCS XR routers. It converts input power and provides stable electrical power to the system. PSUs/PTs are often hot-swappable and support redundancy and load sharing. Multiple PSUs can be installed to provide backup power in case one module fails, which increases system availability and minimizes downtime.
A failed or undetected PSU can cause system errors, prevent line cards from booting properly, and lead to system instability or complete shutdown. This can severely impact the operation and network service continuity of the router. The nature and severity of problems vary by platform due to differences in PSU design and serviceability. For models with fixed PSUs (for example, some NCS 540 series), a failure typically requires service or replacement of the entire unit, leading to longer downtime. Modular systems (for example, NCS 560, 5500, 5700, and some 540 models) allow for continued operation during single PSU failures and enable easier maintenance without system shutdown.
Procedure to Resolve PSU Failure in NCS XR Platform
The troubleshooting procedure for PSU failures follows a consistent approach across NCS XR platforms. Only the physical actions differ, based on whether the model uses a fixed PSU or a modular PSU.
Log in to the router in Cisco IOS XR CLI and execute these commands to identify the status of PSUs. These commands are common across all NCS XR platforms running Cisco IOS XR.
Step 1.1. Check Platform Status: Run this command to identify if it is a PSU failure.
Sample Command Output:
RP/0/RP0/CPU0:NCS-540-B-LNT#show platform
Thu Dec 11 10:06:59.917 +0530
Node Type State Config state
--------------------------------------------------------------------------------
0/RP0/CPU0 N540X-16Z4G8Q2C-D(Active) IOS XR RUN NSHUT
0/PM0 N540-PSU-FIXED-D OPERATIONAL NSHUT
0/PM1 N540-PSU-FIXED-D OFFLINE NSHUT
0/FT0 N540-X-BB-FAN OPERATIONAL NSHUT
Note: If all the Power Modules (for example, `0/PM0`, `0/PM1`) are in the 'OPERATIONAL' state, the power supply works fine. Otherwise, if any Power Module is in a non-operational or failed state (for example, 'OFFLINE' in the sample output), it indicates a PSU failure.
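The check described in this note can be scripted against captured CLI output. The next block is a minimal sketch in Python, not a Cisco tool. The function name `non_operational_power_modules` and the row pattern are illustrative assumptions based only on the sample output in this document (locations of the form `0/PM<n>`); other platforms can use different location formats.

```python
import re

# Rows copied from the sample `show platform` output in this document.
SAMPLE = """\
0/RP0/CPU0 N540X-16Z4G8Q2C-D(Active) IOS XR RUN NSHUT
0/PM0 N540-PSU-FIXED-D OPERATIONAL NSHUT
0/PM1 N540-PSU-FIXED-D OFFLINE NSHUT
0/FT0 N540-X-BB-FAN OPERATIONAL NSHUT
"""

# Assumed row layout: <node> <type> <state> <config state>.
PM_ROW = re.compile(
    r"^(?P<node>\d+/PM\d+)\s+(?P<type>\S+)\s+(?P<state>.+?)\s+(?P<config>\S+)\s*$"
)


def non_operational_power_modules(show_platform_text):
    """Return (node, state) pairs for power modules not in OPERATIONAL state."""
    faulty = []
    for line in show_platform_text.splitlines():
        match = PM_ROW.match(line.strip())
        if match and match.group("state") != "OPERATIONAL":
            faulty.append((match.group("node"), match.group("state")))
    return faulty


print(non_operational_power_modules(SAMPLE))  # [('0/PM1', 'OFFLINE')]
```

An empty list means every power module row that matched the pattern reported 'OPERATIONAL'.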
Step 1.2. Identify failed Power Modules: Run this command to check the status and details of individual PSUs.
RP/0/RP0/CPU0:NCS-540-B-LNT#show environment power
Thu Dec 11 12:50:16.275 +0530
================================================================================
CHASSIS LEVEL POWER INFO: 0
================================================================================
Total output power capacity : 300W
Total output power required : 175W
Total power input : N/A
Total power output : 97W
================================================================================
Power Supply Status
Module Type
================================================================================
0/PM1 N540-PSU-FIXED-D OFFLINE
0/PM0 N540-PSU-FIXED-D OK
RP/0/RP0/CPU0:NCS-540-B-LNT#
Note: A status other than 'OK' for a power module (for example, 'OFFLINE', 'FAILED', or 'NO POWER'), or input/output values that are zero or significantly lower than those of the other modules, indicates a failed or failing power supply.
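The status column can also be extracted programmatically. The next block is a sketch under stated assumptions: the helper names `power_module_status` and `unhealthy` are hypothetical, and the parser assumes each power module row starts with a `0/PM<n>` location followed by the module type, with the status as the alphabetic words at the end of the row (any numeric voltage or current columns are ignored).

```python
import re

# Status rows copied from the sample `show environment power` output above.
SAMPLE = """\
0/PM1 N540-PSU-FIXED-D OFFLINE
0/PM0 N540-PSU-FIXED-D OK
"""


def power_module_status(show_env_power_text):
    """Map each power module location to its status text."""
    status = {}
    for line in show_env_power_text.splitlines():
        tokens = line.split()
        if len(tokens) < 3 or not re.fullmatch(r"\d+/PM\d+", tokens[0]):
            continue  # header, separator, or chassis-level line
        # Skip the type column, drop numeric columns, keep the status words.
        words = [token for token in tokens[2:] if token.isalpha()]
        status[tokens[0]] = " ".join(words)
    return status


def unhealthy(status):
    """Return only the modules whose status is not 'OK'."""
    return {pm: state for pm, state in status.items() if state != "OK"}


print(unhealthy(power_module_status(SAMPLE)))  # {'0/PM1': 'OFFLINE'}
```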
Step 1.3. Verify Power Module Failure from Alarms: Run this command to check system alarms for power-related alarms.
RP/0/RP0/CPU0:NCS-540-B-LNT#show alarms brief system active
Thu Dec 11 12:50:02.667 +0530
--------------------------------------------------------------------------------
Active Alarms for 0/RP0
--------------------------------------------------------------------------------
Location Severity Group Set Time Description
--------------------------------------------------------------------------------
0/PM1 Major Environ 10/19/2025 12:30:42 +0530 Power Module Generic Fault (PM_GENERIC_FAULT)
0/PM1 Major Environ 10/19/2025 12:30:42 +0530 Power Module Error (PM_I2C_ACCESS_ERROR)
0 Major Environ 10/19/2025 12:30:42 +0530 Power Group redundancy lost
--------------------------------------------------------------------------------
Note: Alarm messages such as 'Power Group redundancy lost', 'Power Module Generic Fault', or 'Power Module Error' confirm a power module failure.
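To pick the power-related entries out of a longer alarm list, the rows can be filtered by description. This is a minimal sketch that assumes the column layout shown in the sample output (location, severity, group, date, time, time zone, description). The `power_alarms` name and the simple "contains the word power" filter are illustrative choices, not a Cisco-defined classification.

```python
import re

# Alarm rows copied from the sample output in this document.
SAMPLE = """\
0/PM1 Major Environ 10/19/2025 12:30:42 +0530 Power Module Generic Fault (PM_GENERIC_FAULT)
0/PM1 Major Environ 10/19/2025 12:30:42 +0530 Power Module Error (PM_I2C_ACCESS_ERROR)
0 Major Environ 10/19/2025 12:30:42 +0530 Power Group redundancy lost
"""

ALARM_ROW = re.compile(
    r"^(?P<location>\S+)\s+(?P<severity>Critical|Major|Minor)\s+(?P<group>\S+)\s+"
    r"(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<tz>\S+)\s+"
    r"(?P<description>.+)$"
)


def power_alarms(alarms_text):
    """Return parsed alarm rows whose description mentions power."""
    alarms = []
    for line in alarms_text.splitlines():
        match = ALARM_ROW.match(line.strip())
        if match and "power" in match.group("description").lower():
            alarms.append(match.groupdict())
    return alarms


for alarm in power_alarms(SAMPLE):
    print(alarm["location"], alarm["severity"], alarm["description"])
```

Header and separator lines do not match the row pattern, so the full command output can be passed in unchanged.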
Environmental factors can significantly impact power supply operation and overall system stability.
1. Ambient Conditions:
Verify ambient temperature and airflow around the router to ensure it is within operational limits. High temperatures can cause power supplies to overheat, reduce their efficiency, and lead to premature failure.
Check for any obstructions to airflow around the PSUs and the chassis vents. Ensure proper ventilation and heat dissipation pathways are clear.
Confirm that the power source (for example, AC outlet, DC power feed) is stable and within the specified voltage and current ranges for the NCS series router.
2. Physical Inspection for Obstructions/Damage:
Inspect the PSUs for any visible debris, loose wiring, or obstructions that can impede connectivity.
Before proceeding with hardware replacement, it is advisable to check if the observed power module failure aligns with any known software or hardware bugs.
The next steps depend on the type of PSU in your NCS XR Series router.
Models with fixed PSUs are typically not hot-swappable.
Note: Replacement of a fixed PSU requires planned downtime because the router must be powered down.
These platforms feature hot-swappable modular PSUs.
1. Reseating (Jack-Out Jack-In, JOJI):
Carefully perform a JOJI procedure on the power module that is experiencing issues. This involves physically removing the power module and then re-inserting it.
2. Replacement (RMA): If the issue is isolated to the PT or power module and reseating does not resolve the problem, it likely indicates a hardware failure. In such cases, raise a case with Cisco TAC for verification. Cisco TAC reviews the logs and, once the failure is confirmed, initiates an RMA for the affected PT or power module. Alternatively, if your service level agreement includes direct or automated hardware replacement, the RMA process can proceed automatically without additional verification.
Sample logs:
0/RP0/ADMIN0:Nov 26 06:20:32.269 UTC: shelf_mgr[3081]: %INFRA-SHELF_MGR-5-CARD_REMOVAL : Location: 0/PM0, Serial#: DTMXXXXXX
0/RP0/ADMIN0:Nov 26 06:20:32.269 UTC: envmon[3021]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Output Disabled :CLEAR :0/PM0: Power module is under HW_OUTPUT_DISABLED condition.
0/RP0/ADMIN0:Nov 26 06:20:32.269 UTC: envmon[3021]: %PKT_INFRA-FM-6-FAULT_INFO : Power Module removal :INFO :0/PM0:
0/RP0/ADMIN0:Nov 26 06:20:59.052 UTC: envmon[3021]: %PKT_INFRA-FM-6-FAULT_INFO : Power Module insertion :INFO :0/PM0:
0/RP0/ADMIN0:Nov 26 06:20:59.053 UTC: shelf_mgr[3081]: %INFRA-SHELF_MGR-5-CARD_INSERTION : Location: 0/PM0, Serial #:DTMXXXXXX
0/RP0/ADMIN0:Nov 26 06:20:59.053 UTC: envmon[3021]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Output Disabled :DECLARE :0/PM0: Power module is under HW_OUTPUT_DISABLED condition.
0/RP0/ADMIN0:Nov 26 06:20:59.053 UTC: shelf_mgr[3081]: %INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_FAILURE, event_reason_str 'No Input or HW Power Failure' for card 0/PM0
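The sample logs show a removal, an insertion, and then a hardware failure event for the same module, which is the pattern expected when a reseat does not resolve the fault. The next block is a hedged sketch of how that sequence could be detected in collected logs. The `reseat_outcome` function and its three return labels are hypothetical, and the message pattern is inferred only from the log lines shown here.

```python
import re

# Log lines copied from the sample logs in this document.
SAMPLE = """\
0/RP0/ADMIN0:Nov 26 06:20:32.269 UTC: shelf_mgr[3081]: %INFRA-SHELF_MGR-5-CARD_REMOVAL : Location: 0/PM0, Serial#: DTMXXXXXX
0/RP0/ADMIN0:Nov 26 06:20:59.053 UTC: shelf_mgr[3081]: %INFRA-SHELF_MGR-5-CARD_INSERTION : Location: 0/PM0, Serial #:DTMXXXXXX
0/RP0/ADMIN0:Nov 26 06:20:59.053 UTC: envmon[3021]: %PKT_INFRA-FM-3-FAULT_MAJOR : ALARM_MAJOR :Power Module Output Disabled :DECLARE :0/PM0: Power module is under HW_OUTPUT_DISABLED condition.
0/RP0/ADMIN0:Nov 26 06:20:59.053 UTC: shelf_mgr[3081]: %INFRA-SHELF_MGR-6-HW_EVENT : Rcvd HW event HW_EVENT_FAILURE, event_reason_str 'No Input or HW Power Failure' for card 0/PM0
"""

# Assumed message layout: %FACILITY-SUBFACILITY-SEVERITY-MNEMONIC : text
MSG = re.compile(
    r"%(?P<facility>[A-Z_]+)-(?P<subfacility>[A-Z_]+)-(?P<severity>\d)-"
    r"(?P<mnemonic>[A-Z_]+)\s*:\s*(?P<text>.*)"
)


def reseat_outcome(log_text, module):
    """Classify what the logs show for one module after a reseat attempt."""
    inserted = False
    for line in log_text.splitlines():
        if module not in line:
            continue
        match = MSG.search(line)
        if not match:
            continue
        if match.group("mnemonic") == "CARD_INSERTION":
            inserted = True
        elif inserted and "HW_EVENT_FAILURE" in match.group("text"):
            # A failure event after re-insertion: the reseat did not help.
            return "failed_after_reseat"
    return "reseated_ok" if inserted else "no_reseat_seen"


print(reseat_outcome(SAMPLE, "0/PM0"))  # failed_after_reseat
```

Because the event reason in the sample is 'No Input or HW Power Failure', a `failed_after_reseat` result is worth cross-checking against the input power feed before concluding that the module itself is faulty.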
Check the inventory details of the affected power module:
Command Syntax:
RP/0/RP0/CPU0:NCS-560-B#show inventory location <location of the failed power module>
Sample command:
RP/0/RP0/CPU0:NCS-560-B#show inventory location 0/PM0
Thu Dec 25 20:41:18.031 KST
NAME: "0/PM0", DESCR: "ASR 900 1200W AC Power Supply"
PID: A900-PWR1200-A , VID: V03 , SN: DCAXXXXXX
RP/0/RP0/CPU0:NCS-560-B#
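The PID and serial number from this output are the kind of details that can be useful to have ready when a case is raised. The next block is a small sketch that extracts them from captured output. The `parse_inventory` name and the two patterns are assumptions based on the sample shown here, which contains a single inventory entry.

```python
import re

# Inventory lines copied from the sample output in this document.
SAMPLE = '''\
NAME: "0/PM0", DESCR: "ASR 900 1200W AC Power Supply"
PID: A900-PWR1200-A , VID: V03 , SN: DCAXXXXXX
'''

NAME_RE = re.compile(r'NAME:\s*"(?P<name>[^"]*)",\s*DESCR:\s*"(?P<descr>[^"]*)"')
PID_RE = re.compile(
    r"PID:\s*(?P<pid>\S+)\s*,\s*VID:\s*(?P<vid>\S+)\s*,\s*SN:\s*(?P<sn>\S+)"
)


def parse_inventory(show_inventory_text):
    """Return name, description, PID, VID, and SN for a single inventory entry."""
    record = {}
    for pattern in (NAME_RE, PID_RE):
        match = pattern.search(show_inventory_text)
        if match:
            record.update(match.groupdict())
    return record


info = parse_inventory(SAMPLE)
print(info["name"], info["pid"], info["sn"])  # 0/PM0 A900-PWR1200-A DCAXXXXXX
```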
| Revision | Publish Date | Comments |
|---|---|---|
| 1.0 | 27-Apr-2026 | Initial Release |