The Cisco NFVI management node hosts the Cisco VIM REST API service, Cobbler for PXE services, ELK for logging and the Kibana dashboard, and VMTP for cloud validation. Because the management node currently does not have redundancy, it is important to understand its points of failure and the corresponding recovery scenarios, which are described in this topic.
The management node architecture includes a Cisco UCS C240 M4 server with dual CPU sockets. It has a 1 Gbps on-board (LOM) NIC and a 10 Gbps Cisco VIC mLOM. HDDs are used in 8, 16, or 24 disk configurations. The following figure shows the high-level management node hardware and software architecture.
Figure 1. Cisco NFVI Management Node Architecture
Different management node hardware or software failures can cause Cisco NFVI service disruptions and outages. Some failed services can be recovered through manual intervention; however, even if the system remains operational during a single failure, a double fault might not be recoverable. The following table lists the different management node failure scenarios and their recovery options.
Table 1. Management Node Failure Scenarios

| Scenario # | Failure/Trigger | Recoverable? | Operational Impact |
| 1 | Failure of 1 or 2 active HDDs | Yes | No |
| 2 | Simultaneous failure of more than 2 active HDDs | No | Yes |
| 3 | Spare HDD failure: 4 spares in a 24-HDD configuration, 2 spares in a 16-HDD configuration, or 1 spare in an 8-HDD configuration | Yes | No |
| 4 | Power outage/hard reboot | Yes | Yes |
| 5 | Graceful reboot | Yes | Yes |
| 6 | Docker daemon start failure | Yes | Yes |
| 7 | Service container (Cobbler, ELK) start failure | Yes | Yes |
| 8 | One link failure on bond interface | Yes | No |
| 9 | Two link failures on bond interface | Yes | Yes |
| 10 | REST API service failure | Yes | No |
| 11 | Graceful reboot with Cisco VIM Insight | Yes | Yes; CLI alternatives exist during reboot |
| 12 | Power outage/hard reboot with Cisco VIM Insight | Yes | Yes |
| 13 | VIM Insight container reinstallation | Yes | Yes; CLI alternatives exist during reinstallation |
| 14 | Cisco VIM Insight container reboot | Yes | Yes; CLI alternatives exist during reboot |
| 15 | Intel I350 1 Gbps LOM failure | Yes | Yes |
| 16 | Cisco VIC 1227 10 Gbps mLOM failure | Yes | Yes |
| 17 | DIMM memory failure | Yes | No |
| 18 | One CPU failure | Yes | No |
Scenario 1: Failure of one or two active HDDs

The management node has 8, 16, or 24 HDDs. The HDDs are configured with RAID 6, which provides data redundancy and storage performance and protects against unforeseen HDD failures.
- When 8 HDDs are installed, 7 are active disks and 1 is a spare disk.
- When 16 HDDs are installed, 14 are active disks and 2 are spare disks.
- When 24 HDDs are installed, 20 are active disks and 4 are spare disks.
With RAID 6, up to two simultaneous active HDD failures can be tolerated. When an HDD fails, the system starts automatic recovery by moving a spare disk to the active state and begins rebuilding the new active HDD. It takes approximately four hours to rebuild the new disk and reach the synchronized state. During this operation, the system is completely functional and no impact is seen. However, monitor the system to ensure that additional failures do not occur and lead to a double fault situation.
You can use the storcli commands to check the disk and RAID state as shown below.

Note: Make sure the node is running with hardware RAID by checking the storcli output and comparing it to the examples shown here.
[root@mgmt-node ~]# /opt/MegaRAID/storcli/storcli64 /c0 show
<…snip…>
TOPOLOGY:
========
-----------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type State BT Size PDC PI SED DS3 FSpace TR
-----------------------------------------------------------------------------
0 - - - - RAID6 Optl N 4.087 TB dflt N N dflt N N
0 0 - - - RAID6 Optl N 4.087 TB dflt N N dflt N N <== RAID 6 in optimal state
0 0 0 252:1 1 DRIVE Onln N 837.258 GB dflt N N dflt - N
0 0 1 252:2 2 DRIVE Onln N 837.258 GB dflt N N dflt - N
0 0 2 252:3 3 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 3 252:4 4 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 4 252:5 5 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 5 252:6 6 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 6 252:7 7 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 - - 252:8 8 DRIVE DHS - 930.390 GB - - - - - N
-----------------------------------------------------------------------------
<…snip…>
PD LIST:
=======
-------------------------------------------------------------------------
EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp
-------------------------------------------------------------------------
252:1 1 Onln 0 837.258 GB SAS HDD N N 512B ST900MM0006 U <== all disks functioning
252:2 2 Onln 0 837.258 GB SAS HDD N N 512B ST900MM0006 U
252:3 3 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:4 4 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:5 5 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:6 6 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:7 7 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:8 8 DHS 0 930.390 GB SAS HDD N N 512B ST91000640SS D
-------------------------------------------------------------------------
The following output shows the same command after an active disk failure, with the spare disk rebuilding:

[root@mgmt-node ~]# /opt/MegaRAID/storcli/storcli64 /c0 show
<…snip…>
TOPOLOGY :
========
-----------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type State BT Size PDC PI SED DS3 FSpace TR
-----------------------------------------------------------------------------
0 - - - - RAID6 Pdgd N 4.087 TB dflt N N dflt N N <== RAID 6 in degraded state
0 0 - - - RAID6 Dgrd N 4.087 TB dflt N N dflt N N
0 0 0 252:8 8 DRIVE Rbld Y 930.390 GB dflt N N dflt - N
0 0 1 252:2 2 DRIVE Onln N 837.258 GB dflt N N dflt - N
0 0 2 252:3 3 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 3 252:4 4 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 4 252:5 5 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 5 252:6 6 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 6 252:7 7 DRIVE Onln N 930.390 GB dflt N N dflt - N
-----------------------------------------------------------------------------
<…snip…>
PD LIST :
=======
-------------------------------------------------------------------------
EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp
-------------------------------------------------------------------------
252:1 1 UGood - 837.258 GB SAS HDD N N 512B ST900MM0006 U <== active disk in slot 1 disconnected from drive group 0
252:2 2 Onln 0 837.258 GB SAS HDD N N 512B ST900MM0006 U
252:3 3 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:4 4 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:5 5 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:6 6 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:7 7 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:8 8 Rbld 0 930.390 GB SAS HDD N N 512B ST91000640SS U <== spare disk in slot 8 joined drive group 0 and in rebuilding state
-------------------------------------------------------------------------
[root@mgmt-node ~]# /opt/MegaRAID/storcli/storcli64 /c0/e252/s8 show rebuild
Controller = 0
Status = Success
Description = Show Drive Rebuild Status Succeeded.
------------------------------------------------------
Drive-ID Progress% Status Estimated Time Left
------------------------------------------------------
/c0/e252/s8 20 In progress 2 Hours 28 Minutes <== spare disk in slot 8 rebuild status
------------------------------------------------------
To replace the
failed disk and add it back as a spare:
[root@mgmt-node ~]# /opt/MegaRAID/storcli/storcli64 /c0/e252/s1 add hotsparedrive dg=0
Controller = 0
Status = Success
Description = Add Hot Spare Succeeded.
[root@mgmt-node ~]# /opt/MegaRAID/storcli/storcli64 /c0 show
<…snip…>
TOPOLOGY :
========
-----------------------------------------------------------------------------
DG Arr Row EID:Slot DID Type State BT Size PDC PI SED DS3 FSpace TR
-----------------------------------------------------------------------------
0 - - - - RAID6 Pdgd N 4.087 TB dflt N N dflt N N
0 0 - - - RAID6 Dgrd N 4.087 TB dflt N N dflt N N
0 0 0 252:8 8 DRIVE Rbld Y 930.390 GB dflt N N dflt - N
0 0 1 252:2 2 DRIVE Onln N 837.258 GB dflt N N dflt - N
0 0 2 252:3 3 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 3 252:4 4 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 4 252:5 5 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 5 252:6 6 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 0 6 252:7 7 DRIVE Onln N 930.390 GB dflt N N dflt - N
0 - - 252:1 1 DRIVE DHS - 837.258 GB - - - - - N
-----------------------------------------------------------------------------
<…snip…>
PD LIST :
=======
-------------------------------------------------------------------------
EID:Slt DID State DG Size Intf Med SED PI SeSz Model Sp
-------------------------------------------------------------------------
252:1 1 DHS 0 837.258 GB SAS HDD N N 512B ST900MM0006 U <== replacement disk added back as spare
252:2 2 Onln 0 837.258 GB SAS HDD N N 512B ST900MM0006 U
252:3 3 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:4 4 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:5 5 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:6 6 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:7 7 Onln 0 930.390 GB SAS HDD N N 512B ST91000640SS U
252:8 8 Rbld 0 930.390 GB SAS HDD N N 512B ST91000640SS U
-------------------------------------------------------------------------
Scenario 2:
Simultaneous
failure of more than two active HDDs
If more than two HDD
failures occur at the same time, the management node goes into an unrecoverable
failure state because RAID 6 allows for recovery of up to two simultaneous HDD
failures. To recover the management node, reinstall the operating system.
Scenario 3: Spare HDD failure

When the management node has 24 HDDs, four are designated as spares. Failure of a spare disk does not impact RAID or system functionality. Cisco recommends replacing spare disks when they fail (see the steps in Scenario 1) so that standby disks remain available and an auto-rebuild is triggered when an active disk fails.
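To confirm that the spare drives themselves are healthy, you can list the state of all physical drives with storcli, using the same controller path shown in Scenario 1; spare disks typically appear with a DHS or GHS state in the output:

[root@mgmt-node ~]# /opt/MegaRAID/storcli/storcli64 /c0/eall/sall show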
Scenario 4: Power outage/hard reboot

If a power outage or hard system reboot occurs, the system boots up and comes back to an operational state. Services running on the management node during the downtime are disrupted. See Scenario 7 for the list of commands to check the services status after recovery.
Scenario 5: Graceful reboot

If a graceful system reboot occurs, the system boots up and comes back to an operational state. Services running on the management node during the downtime are disrupted. See Scenario 7 for the list of commands to check the services status after recovery, or use the quick check shown below.
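As a quick check after either a hard or graceful reboot, the following commands (the same ones used in Scenarios 6 and 7) verify that the Docker daemon and all the service containers came back up; any container that is not in the Up state needs attention:

# systemctl status docker
# systemctl | grep docker- | awk '{print $1}'
# docker ps -a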
Scenario 6: Docker daemon start failure

The management node runs its services in Docker containers. If the Docker daemon fails to come up, services such as ELK, Cobbler, and VMTP go into a down state. You can use the systemctl command to check the status of the Docker daemon, for example:
# systemctl status docker
docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2016-08-22 00:33:43 CEST; 21h ago
Docs: http://docs.docker.com
Main PID: 16728 (docker)
If the Docker daemon is in a down state, use the systemctl restart docker command to restart the Docker service. Run the commands listed in Scenario 7 to verify that all the Docker services are active.
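For example, a minimal restart-and-verify sequence is:

# systemctl restart docker
# systemctl status docker
# docker ps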
Scenario 7: Service container (Cobbler, ELK) start failure

As described in Scenario 6, all the services run as Docker containers on the management node. To find all the services running as containers, use the docker ps -a command. If any service is in the Exited state, use the systemctl command and grep for docker to find the exact service name, for example:
# systemctl | grep docker- | awk '{print $1}'
docker-cobbler-tftp.service
docker-cobbler-web.service
docker-cobbler.service
docker-container-registry.service
docker-elasticsearch.service
docker-kibana.service
docker-logstash.service
docker-vmtp.service
If any services need
restarting, use the
systemctl
command. For example, to restart a Kibana service:
# systemctl restart docker-kibana.service
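If several service containers are down, one way to find them is to list the failed docker-* units and restart each by name; services whose containers exited abnormally usually show up as failed units, but this sketch is not a substitute for investigating why a service failed:

# systemctl --failed | grep docker-
# systemctl restart docker-<service name>.service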
Scenario 8: One link failure on the bond interface

The management node is set up with two different networks: br_api and br_mgmt. The br_api interface is the external one; it is used for accessing outside services such as the Cisco VIM REST API, Kibana, and Cobbler. The br_mgmt interface is internal; it is used for provisioning and to provide management connectivity to all OpenStack nodes (control, compute, and storage). Each network has two ports that are bonded to provide redundancy. If one port fails, the system remains fully functional through the other port. If a port fails, check the physical network connectivity and the remote switch configuration to debug the underlying cause of the link failure.
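A quick way to see which member port of a bond is down is to read the kernel bonding status; the bond device name below is a placeholder, so list the bonding directory first to find the names used on your node:

# ls /proc/net/bonding/
# cat /proc/net/bonding/<bond interface>
# ip link show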
Scenario 9: Two link failures on the bond interface

As described in Scenario 8, each network is configured with two bonded ports. If both ports are down, the system is not reachable and management node services can be disrupted. After the ports are back up, the system is fully operational. Check the physical network connectivity and the remote switch configuration to debug the underlying cause of the link failure.
Scenario 10: REST API service failure

The management node runs the REST API service for Cisco VIM clients to reach the server. If the REST service is down, Cisco VIM clients cannot reach the server to trigger any server operations. However, with the exception of the REST service, other management node services remain operational.

To verify that the management node REST services are fully operational, use the following commands to check that the httpd and mercury-restapi services are active and running:
# systemctl status httpd
httpd.service - The Apache HTTP Server
Loaded: loaded (/usr/lib/systemd/system/httpd.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2016-08-22 00:22:10 CEST; 22h ago
# systemctl status mercury-restapi.service
mercury-restapi.service - Mercury Restapi
Loaded: loaded (/usr/lib/systemd/system/mercury-restapi.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2016-08-22 00:20:18 CEST; 22h ago
A tool is also provided so that you can check the REST API server status and the directory it is running from. To check, run the following commands:
# cd installer-<tagid>/tools
# ./restapi.py -a status
Status of the REST API Server: active (running) since Thu 2016-08-18 09:15:39 UTC; 9h ago
REST API launch directory: /root/installer-<tagid>/
Confirm that the server status is active and that the REST API launch directory matches the directory from which the installation was launched. The restapi tool also provides options to launch, tear down, and reset the password for the REST API server, as shown below:
# ./restapi.py -h
usage: restapi.py [-h] --action ACTION [--yes] [--verbose]
REST API setup helper
optional arguments:
-h, --help show this help message and exit
--action ACTION, -a ACTION
setup - Install and Start the REST API server.
teardown - Stop and Uninstall the REST API
server.
restart - Restart the REST API server.
regenerate-password - Regenerate the password for
REST API server.
reset-password - Reset the REST API password with
user given password.
status - Check the status of the REST API server
--yes, -y Skip the dialog. Yes to the action.
--verbose, -v Perform the action in verbose mode.
If the REST API server is not running, executing a ciscovim command returns an error message, for example:

# cd installer-<tagid>/
# ciscovim -setupfile ~/Save/<setup_data.yaml> run
If the installer directory or the REST API state is not correct, or if it points to an incorrect REST API launch directory, go to the installer-<tagid>/tools directory and execute:

# ./restapi.py --action setup
To confirm that the REST API server state and launch directory are correct, run the following command:

# ./restapi.py --action status
Scenario 11: Graceful reboot with Cisco VIM Insight

Cisco VIM Insight runs as a container on the management node. After a graceful reboot of the management node, the VIM Insight container and its associated database container come back up automatically, so there is no impact on recovery.
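To confirm that the Insight and database containers came back up after the reboot, list the running containers; the exact container names vary by release, so the filter below is only an illustrative substring match:

# docker ps -a
# docker ps | grep -i insight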
Scenario 12: Power outage
or hard reboot with VIM Insight
The Cisco VIM
Insight container will come up automatically following a power outage or hard
reset of the management node.
Scenario 13: Cisco VIM Insight reinstallation

If the management node that runs Cisco VIM Insight fails and cannot come up, you must uninstall and reinstall Cisco VIM Insight. After the VIM Insight container comes up, perform the relevant bootstrap steps listed in the install guide to register the pod. VIM Insight then automatically detects the installer status and reflects the current status appropriately.

To clean up and reinstall Cisco VIM Insight, run the following commands:

# cd /root/installer-<tagid>/insight/
# ./bootstrap_insight.py -a uninstall -o standalone -f </root/insight_setup_data.yaml>
Scenario 14: VIM Insight container reboot

After a reboot of the VIM Insight container, services continue to work as before.
Scenario 15: Intel I350 1 Gbps LOM failure

The management node is set up with an Intel I350 1 Gbps LOM for API connectivity. Two 1 Gbps ports are bonded to provide connectivity redundancy. No operational impact occurs if one of these ports goes down. However, if both ports fail, or the LOM network adapter fails, the system cannot be reached through the API IP address. If this occurs, you must replace the server because the LOM is connected to the system motherboard. To recover the management node with a new server, complete the following steps. Make sure that the new management node hardware profile matches the existing server and that the Cisco IMC IP address is assigned.
- Shut down the existing management node.
- Unplug the power from the existing and new management nodes.
- Remove all HDDs from the existing management node and install them in the same slots of the new management node.
- Plug in the power to the new management node, but do not boot the node.
- Verify that the configured boot order is set to boot from the local HDD.
- Verify that the Cisco NFVI management VLAN is configured on the Cisco VIC interfaces.
- Boot the management node so that the operating system starts.
After the management node is up, the bond interfaces will be down because the ifcfg files still point to the MAC addresses of the old node's network cards.
- Update the MAC addresses in the ifcfg files under /etc/sysconfig/network-scripts (a sketch is provided at the end of this scenario).
- Reboot the management node.
The node then comes up fully operational. All interfaces should be in the up state and reachable.
- Verify that the Kibana and Cobbler dashboards are accessible.
- Verify that the REST API services are up. See Scenario 10 for the recovery steps.
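The following is a minimal sketch of the MAC address update referenced above, assuming the standard RHEL ifcfg layout with HWADDR entries; the interface names in the ifcfg file names are whatever your node uses. Note the MAC addresses of the new node's ports with ip link show, find the stale entries with grep, and edit each ifcfg file so its HWADDR matches the corresponding new port before rebooting.

# ip link show
# grep -i hwaddr /etc/sysconfig/network-scripts/ifcfg-*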
Scenario 16: Cisco VIC 1227 10 Gbps mLOM failure

The management node is configured with a Cisco VIC 1227 dual-port 10 Gbps mLOM adapter for connectivity to the other Cisco NFVI nodes. Two 10 Gbps ports are bonded to provide connectivity redundancy. If one of the 10 Gbps ports goes down, no operational impact occurs. However, if both Cisco VIC 10 Gbps ports fail, the system becomes unreachable on the management network and pod management and the Fluentd forwarding service are disrupted; if this occurs, you must replace the VIC network adapter. If you replace a Cisco VIC, update the management and provisioning VLAN for the VIC interfaces using Cisco IMC and update the MAC addresses in the interface configuration files under /etc/sysconfig/network-scripts.
Scenario 17: DIMM memory failure

The management node is set up with multiple DIMM memory modules across different slots. Failure of one or more memory modules can cause the system to go into an unstable state, depending on how many DIMM failures occur. DIMM memory failures are standard system failures, as on any other Linux server. If a DIMM memory module fails, replace it as soon as possible to keep the system in a stable state.
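As a quick way to inventory the installed DIMMs and their slot locations from the operating system, you can use dmidecode; failed or missing modules are also reported in the Cisco IMC fault logs:

# dmidecode -t memory | grep -E 'Locator|Size'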
Scenario 18: One CPU failure

Cisco NFVI management nodes have two Intel CPUs (CPU1 and CPU2). If one CPU fails, the system remains operational. CPU failures are standard system failures, as on any other Linux server. However, replace a failed CPU immediately to keep the system in a stable state.
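To verify from the operating system that both CPU sockets are populated and online, you can check the lscpu output:

# lscpu | grep -i socket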