Introduction
This document describes how to replace a single APIC in a fabric cluster after it fails because of a hardware issue.
Problem
An Application Centric Infrastructure (ACI) fabric is otherwise operational, but one Application Policy Infrastructure Controller (APIC) has failed. The failure has been traced to a hardware issue, and the entire unit must be replaced.
Solution
Complete these steps to resolve this issue:
- Identify the failed APIC and the current fabric settings:
- From the web interface of an operational APIC, choose System > Controllers.
- On the left-hand side of the screen, choose Controllers > (any APIC) > Cluster.
- The failed APIC appears as Unavailable in the Operational State column. Note the Fabric Name, Target Size, and Node ID of the failed APIC, as well as the Tunnel End Point (TEP) address space.
Tip: You can also enter the acidiag avread command in the CLI of an operational APIC to obtain this information.
- Decommission the failed APIC:
- Highlight the failed APIC.
- From the Actions drop-down list, choose Decommission. The Admin State of the APIC changes to Out of Service.
- Remove the failed APIC from your rack and install the replacement. The new APIC boots to the initial setup script.
- Proceed through the setup script and enter the values that you recorded when you identified the failed APIC, so that the replacement matches it.
Note: Ensure that you use the same configuration settings that you noted from the old APIC (such as the Fabric Name, Controller ID, and TEP Address Pool). If the replacement APIC is configured with different settings, the fabric can enter a partially diverged state. Additionally, the replacement APIC must run the same version of ACI software as the remaining APICs in order to join the cluster.
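The values recorded from the failed APIC can be compared against the values planned for the setup script before they are entered. The following is a minimal sketch in Python, not a Cisco tool: the ApicSettings class, its field names, the find_mismatches helper, and the sample values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ApicSettings:
    """Values noted for one APIC (hypothetical field names)."""
    fabric_name: str
    controller_id: int
    target_size: int
    tep_pool: str
    software_version: str


def find_mismatches(recorded: ApicSettings, planned: ApicSettings) -> dict:
    """Return {field_name: (recorded_value, planned_value)} for each difference."""
    return {
        f.name: (getattr(recorded, f.name), getattr(planned, f.name))
        for f in fields(recorded)
        if getattr(recorded, f.name) != getattr(planned, f.name)
    }


# Placeholder values for illustration only; substitute the values you noted.
recorded = ApicSettings("ACI-Fabric1", 2, 3, "10.0.0.0/16", "x.y(z)")
planned = ApicSettings("ACI-Fabric1", 2, 3, "10.0.0.0/16", "x.y(z)")

for name, (old, new) in find_mismatches(recorded, planned).items():
    print(f"MISMATCH {name}: failed APIC had {old!r}, setup plan has {new!r}")
```

An empty result means that the planned values match the recorded ones. Any printed mismatch should be corrected before the setup script is completed.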
- Commission the new APIC:
- After the new APIC boots, highlight the APIC that is currently Out of Service on the Cluster page.
- From the Actions drop-down list, choose Commission.
The APIC receives an IP address, which is reflected in the web interface of the APIC.
Note: It can take up to ten minutes before this occurs. The new APIC can also cycle between the Available and Unavailable Operational States before its Health State appears as Fully Fit.
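Because the new APIC can cycle between states before it settles, a script that waits for it is better served by polling with a timeout than by a single check. The sketch below shows one way to do this in Python. The get_health callable is a hypothetical hook that you would supply, the "fully-fit" string is an assumed representation of the Fully Fit Health State, and the 600-second default is only a starting point based on the ten-minute figure in the note above.

```python
import time
from typing import Callable


def wait_until_fully_fit(
    get_health: Callable[[], str],
    timeout_s: float = 600.0,
    interval_s: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_health() until it returns 'fully-fit' or timeout_s elapses.

    Any other value is treated as "not ready yet", so a controller that
    flips between available and unavailable does not end the wait early.
    Returns True on success and False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if get_health() == "fully-fit":  # assumed health string
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The sleep and clock parameters are injectable so that the loop can be exercised in a test without real delays.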
- To verify that the new APIC has joined the fabric, log in to the CLI of the new APIC with the credentials that are configured for the rest of the fabric.
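Cluster membership can also be checked programmatically. The sketch below assumes that the cluster view has already been retrieved from the APIC REST API as JSON, that each entry is an object of class infraWiNode, and that the id and health attributes carry the controller ID and health state. These names are assumptions to confirm against the APIC Management Information Model reference for your release. The helper names and the sample payload are hypothetical. The payload may contain more than one entry per controller, so health values are grouped by ID.

```python
from collections import defaultdict


def summarize_cluster(payload: dict) -> dict:
    """Map each controller ID to the list of health strings reported for it."""
    summary = defaultdict(list)
    for item in payload.get("imdata", []):
        # Assumed class and attribute names; verify for your release.
        attrs = item.get("infraWiNode", {}).get("attributes", {})
        if "id" in attrs:
            summary[int(attrs["id"])].append(attrs.get("health", "unknown"))
    return dict(summary)


def cluster_is_healthy(payload: dict, target_size: int) -> bool:
    """True if target_size controllers are present and all report 'fully-fit'."""
    summary = summarize_cluster(payload)
    return len(summary) == target_size and all(
        health == "fully-fit"
        for reports in summary.values()
        for health in reports
    )


# Hypothetical payload for illustration only.
sample = {
    "imdata": [
        {"infraWiNode": {"attributes": {"id": "1", "health": "fully-fit"}}},
        {"infraWiNode": {"attributes": {"id": "2", "health": "fully-fit"}}},
        {"infraWiNode": {"attributes": {"id": "3", "health": "fully-fit"}}},
    ]
}
print(cluster_is_healthy(sample, target_size=3))
```

Pass the Target Size that you noted from the Cluster page as target_size, so that a missing controller is reported as unhealthy rather than silently ignored.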