Cisco ASR 1000 Series Aggregation Services Routers Operations and Maintenance Guide
Monitoring the Control Plane


To verify the overall health of your system, monitor control plane resources on a regular basis.

This chapter includes the following sections:

Avoiding Problems Through Regular Monitoring

Control Plane Overview

Monitoring Control Plane Resources

For More Information

Avoiding Problems Through Regular Monitoring

Monitoring system resources allows you to detect potential problems before they happen, thus avoiding outages. The following examples show the advantages of regular monitoring:

In a real-life example, customers installed new line cards. After the line cards had been in operation for a few years, lack of memory on those line cards caused major outages in some cases. Regular monitoring of memory usage would have identified the memory issue in time to avoid the outages.

Regular monitoring establishes a baseline for a normal system load. You can use this information as a basis for comparison when you upgrade hardware or software—to see if the upgrade has affected resource usage.
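
The baseline comparison can be sketched in a few lines of Python. This is a minimal illustration, not a Cisco tool: the function name, the sample values, and the 20 percent tolerance are assumptions chosen only for the example.

```python
from statistics import mean

def deviates_from_baseline(baseline_samples, current, tolerance=0.20):
    """Return True if `current` differs from the baseline mean by more
    than `tolerance` (a fraction; 0.20 is an arbitrary example value)."""
    baseline = mean(baseline_samples)
    if baseline == 0:
        return current != 0
    return abs(current - baseline) / baseline > tolerance

# Hypothetical used-memory percentages collected before an upgrade.
before_upgrade = [58, 60, 61, 59, 62]

print(deviates_from_baseline(before_upgrade, 61))  # close to the baseline
print(deviates_from_baseline(before_upgrade, 80))  # far above the baseline
```

A real deployment would choose the tolerance per resource and collect the samples over a period that reflects normal load.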

Control Plane Overview

The following sections contain a high-level overview of the control plane:

Cisco ASR 1000 Series Routers Control Plane Architecture

Cisco IOS XE Software Architecture

Cisco ASR 1000 Series Routers Control Plane Architecture

The major components in the control plane are:

Cisco ASR 1000 Series Route Processor (RP)—A general purpose CPU responsible for routing protocols, CLI, network management interfaces, code storage, logging, and chassis management. The Cisco ASR 1000 Series RPs process network control packets as well as protocols not supported by the Cisco ASR 1000 Series ESP.

Cisco ASR 1000 Series Embedded Services Processor (ESP)—A forwarding processor that handles forwarding control plane traffic, and performs packet processing functions such as firewall inspection, ACLs, encryption, and QoS.

Cisco ASR 1000 Series SPA Interface Processor (SIP)—An interface processor that provides the connection between the Route Processor and the shared port adapters (SPAs).

Distributed Control Plane Architecture

Cisco ASR 1000 Series Routers have a distributed control plane architecture. A separate control processor is embedded on each major component in the control plane, as shown in Figure 5-1:

Route Processor (RP)

Forwarding Engine Control Processor (FECP)

I/O Control Processor (IOCP)

The RP manages and maintains the control plane using a dedicated Gigabit Ethernet out-of-band channel (EOBC). The internal EOBC is used to continuously exchange system state information among the major components. Because this state is kept synchronized, when a failure triggers a switchover, the standby RP and ESP are immediately ready to assume the control plane functions or the data forwarding functions of the failed component.

The inter-integrated circuit (I2C) bus is used to monitor the health of hardware components. The Enhanced SerDes Interconnect (ESI) is a set of serial links that are the data path links on the midplane connecting the RP, SIPs, and standby ESPs to the active ESP.

Figure 5-1 Cisco ASR 1000 Series Routers Control Plane Architecture

The control plane processors perform the following functions:

RP

Runs the router control plane (Cisco IOS), including processing network control packets, computing routes, and setting up connections.

Monitors interface and environmental status, including management ports, LEDs, alarms, and SNMP network management.

Downloads code to other components in the system.

Selects the active RP and ESP and synchronizes the standby RP and ESP.

Manages logging facilities, on-board failure logging (OBFL), and statistics aggregation.

FECP

Provides direct CPU access to the forwarding engine subsystem, the Cisco QuantumFlow Processor (QFP) subsystem, which is the forwarding processor chipset that resides on the ESP.

Manages the forwarding engine subsystem and its connection to I/O.

Manages the forwarding processor chipset.

IOCP

Provides direct CPU access to SPAs installed in a SIP.

Manages the SPAs.

Handles SPA online insertion and removal (OIR) events.

Runs SPA drivers that initialize and configure SPAs.

Cisco IOS XE Software Architecture

The control plane processors run Cisco IOS XE software, which is an operating system that consists of a Linux-based kernel and a common set of operating system-level utility programs. It is a distributed software architecture that moves many operating system responsibilities out of the IOS process.

In this architecture, IOS runs as one of many Linux processes while allowing other Linux processes to share responsibility for running the router. IOS runs as a user process on the RP. Hardware-specific components have been removed from the IOS process and are handled by separate middleware processes in Cisco IOS XE software. If a hardware-specific issue is discovered, the middleware process can be modified without touching the IOS process.

Figure 5-2 shows the main components of the Cisco IOS XE software architecture. This modular architecture increases network resiliency by distributing operating responsibility among separate processes. The architecture also allows for better allocation of memory so the router can run more efficiently.

All of the Cisco IOS XE software modules run in their own protected memory spaces, which facilitates fault containment. A software failure in an individual module is localized to that module, and all other software processes continue to operate. For example, a separate driver process is executed on the SIP for each SPA, even if multiple SPAs of the same type are present. Because each SPA driver runs in its own protected memory, the failure or upgrade of an individual driver affects only the corresponding SPA.

Figure 5-2 Cisco IOS XE Software Architecture

Using the Linux architecture, Cisco IOS XE provides the following benefits:

The ability to integrate multi-core (multiple CPUs on a single piece of silicon) processors.

The IOS process has no direct access to hardware components, thus providing a greater level of resiliency.

The ability to run active and standby IOS processes on the non-hardware-redundant Cisco ASR 1002 Router and Cisco ASR 1004 Router.

The IOS process operates as a virtual machine under the RP Linux kernel. Upon bootup, the RP Linux kernel allocates 50 percent of available memory to IOS processes as a one-time event. For systems that have a single IOS process, IOS is allocated approximately 45 percent of total RP memory. For redundant IOS process systems, each IOS process is allocated approximately 20 percent of total RP memory.

Hardware components are managed through memory-protected middleware processes.

SPA drivers run as separate processes, which makes it possible to upgrade and restart individual SPAs.

Monitoring Control Plane Resources

The following sections discuss monitoring memory and CPU from the perspective of the IOS process and from the perspective of the overall control plane:

IOS Process Resources

Overall Control Plane Resources

IOS Process Resources

For information about memory and CPU utilization from within the IOS process, use the show memory command and the show process cpu command. Note that these commands provide a representation of memory and CPU utilization from the perspective of the IOS process only; they do not include information for resources on the entire route processor. For example, show memory on an RP2 with 8 GB of RAM running a single IOS process shows the following memory usage:

Router# show memory

                Head       Total(b)     Used(b)     Free(b)     Lowest(b)   Largest(b)
Processor  2ABEA4316010   4489061884   314474916   4174586968   3580216380   3512323496
 lsmpi_io  2ABFAFF471A8     6295128     6294212         916         916         916
Critical   2ABEB7C72EB0     1024004          92     1023912     1023912     1023912
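
When this output is collected regularly, a small script can turn it into per-pool usage figures. The following Python sketch parses the sample output shown above. The function names are illustrative, and the parser assumes the seven-column row layout of this sample, which may differ on other releases.

```python
SHOW_MEMORY = """\
                Head       Total(b)     Used(b)     Free(b)     Lowest(b)   Largest(b)
Processor  2ABEA4316010   4489061884   314474916   4174586968   3580216380   3512323496
 lsmpi_io  2ABFAFF471A8     6295128     6294212         916         916         916
Critical   2ABEB7C72EB0     1024004          92     1023912     1023912     1023912
"""

def parse_show_memory(text):
    """Map each pool name to its total, used, and free byte counts."""
    pools = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 7 fields: name, head address, and five byte counts.
        # The header row has only 6 fields and is skipped.
        if len(fields) != 7 or not fields[2].isdigit():
            continue
        name, _head, total, used, free = fields[:5]
        pools[name] = {"total": int(total), "used": int(used), "free": int(free)}
    return pools

def used_percent(pool):
    """Used bytes as a percentage of the pool total."""
    return 100.0 * pool["used"] / pool["total"]

pools = parse_show_memory(SHOW_MEMORY)
for name, pool in pools.items():
    print(f"{name:10s} {used_percent(pool):6.2f}% used")
```

In this sample, Used(b) plus Free(b) equals Total(b) for every pool, which makes a convenient sanity check on the parsed values.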

For the dual-core RP2, the show process cpu command reports a single IOS CPU utilization average using both processors:

Router# show process cpu

CPU utilization for five seconds: 0%/0%; one minute: 0%; five minutes: 0%
 PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
   1         583     48054         12  0.00%  0.00%  0.00%   0 Chunk Manager
   2         991    176805          5  0.00%  0.00%  0.00%   0 Load Meter
   3           0         2          0  0.00%  0.00%  0.00%   0 IFCOM Msg Hdlr
   4           0        11          0  0.00%  0.00%  0.00%   0 Retransmission o
   5           0         3          0  0.00%  0.00%  0.00%   0 IPC ISSU Dispatc
   6      230385    119697       1924  0.00%  0.01%  0.00%   0 Check heaps
   7          49        28       1750  0.00%  0.00%  0.00%   0 Pool Manager
   8           0         2          0  0.00%  0.00%  0.00%   0 Timers
   9       17268    644656         26  0.00%  0.00%  0.00%   0 ARP Input
  10         197    922201          0  0.00%  0.00%  0.00%   0 ARP Background
  11           0         2          0  0.00%  0.00%  0.00%   0 ATM Idle Timer
  12           0         1          0  0.00%  0.00%  0.00%   0 ATM ASYNC PROC
  13           0         1          0  0.00%  0.00%  0.00%   0 AAA_SERVER_DEADT
  14           0         1          0  0.00%  0.00%  0.00%   0 Policy Manager
  15           0         2          0  0.00%  0.00%  0.00%   0 DDR Timers
  16           1        15         66  0.00%  0.00%  0.00%   0 Entity MIB API
  17          13      1195         10  0.00%  0.00%  0.00%   0 EEM ED Syslog
  18          93        46       2021  0.00%  0.00%  0.00%   0 PrstVbl
  19           0         1          0  0.00%  0.00%  0.00%   0 RO Notify Timers
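
To find the busiest IOS processes in a long listing, the per-process rows can be parsed and sorted. The Python sketch below uses three rows from the sample output above. The regular expression and the function names are assumptions based on this sample layout, not a documented format.

```python
import re

SAMPLE = """\
CPU utilization for five seconds: 0%/0%; one minute: 0%; five minutes: 0%
 PID Runtime(ms)   Invoked      uSecs   5Sec   1Min   5Min TTY Process
   1         583     48054         12  0.00%  0.00%  0.00%   0 Chunk Manager
   6      230385    119697       1924  0.00%  0.01%  0.00%   0 Check heaps
   9       17268    644656         26  0.00%  0.00%  0.00%   0 ARP Input
"""

# PID, runtime, invoked, uSecs, three percentages, TTY, then the process name
# (which may contain spaces).
ROW = re.compile(
    r"^\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+([\d.]+)%\s+([\d.]+)%\s+([\d.]+)%\s+(\d+)\s+(.+?)\s*$"
)

def parse_processes(text):
    """Return one dict per process row; summary and header lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "pid": int(m.group(1)),
                "runtime_ms": int(m.group(2)),
                "five_sec": float(m.group(5)),
                "one_min": float(m.group(6)),
                "five_min": float(m.group(7)),
                "name": m.group(9),
            })
    return rows

def busiest(rows, key="one_min", count=3):
    """Return the `count` rows with the highest value for `key`."""
    return sorted(rows, key=lambda r: r[key], reverse=True)[:count]

rows = parse_processes(SAMPLE)
for row in busiest(rows, count=1):
    print(row["pid"], row["name"], row["one_min"])
```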

Overall Control Plane Resources

For information about control plane memory and CPU utilization on each control processor, use the show platform software status control-processor brief command (summary view) or the show platform software status control-processor command (detailed view).

All control processors should show a status of Healthy. Other possible status values are Warning and Critical. Warning indicates that the router is operational but that the operating level should be reviewed. Critical implies that the router is near failure.

If you see a status of Warning or Critical, take the following actions:

Reduce static and dynamic loads on the system by reducing the number of elements in the configuration or by limiting the capacity for dynamic services.

Reduce the number of routes and adjacencies, limit the number of ACLs and other rules, reduce the number of VLANs, and so on.

The following sections describe the fields in the show platform software status control-processor command output.

Load Average

Load average represents the process queue, or process contention for CPU resources. For example, on a single-core processor, an instantaneous load of 7 means that seven processes are ready to run, one of which is currently running. On a dual-core processor, a load of 7 means that seven processes are ready to run, two of which are currently running.
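
The single-core and dual-core cases can be expressed as a short calculation. The following Python sketch is illustrative only, and the function name is an assumption. It splits an instantaneous load value into processes that are running and processes that are waiting for a CPU.

```python
def run_queue_split(load, cores):
    """Split a load value into (running, waiting) process counts."""
    running = min(load, cores)       # at most one running process per core
    waiting = max(load - cores, 0)   # the rest are ready but waiting
    return running, waiting

print(run_queue_split(7, 1))  # (1, 6)
print(run_queue_split(7, 2))  # (2, 5)
```

A load value that stays above the number of cores therefore indicates that processes are regularly waiting for CPU time.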

Memory Utilization

Memory utilization is represented by the following fields:

Total—Total line card memory

Used—Consumed memory

Free—Available memory

Committed—Virtual memory committed to processes

CPU Utilization

CPU utilization is an indication of the percentage of time the CPU is busy and is represented by the following fields:

CPU—The allocated processor

User—Non-Linux kernel processes

System—Linux kernel processes

Nice—Low priority processes

Idle—Percentage of time the CPU was inactive

IRQ—Interrupts

SIRQ—Software interrupts (soft IRQs)

IOwait—Percentage of time CPU was waiting for I/O
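
Because Idle is reported directly, overall CPU busy time can be derived as 100 minus Idle. The Python sketch below uses the RP0 values from the command output in this chapter. The dictionary keys and the function name are illustrative assumptions.

```python
def busy_percent(sample):
    """Return the percentage of time the CPU was not idle."""
    return round(100.0 - sample["idle"], 2)

# RP0 values taken from the sample command output in this chapter.
rp0 = {
    "user": 30.12, "system": 1.69, "nice": 0.00, "idle": 67.63,
    "irq": 0.13, "sirq": 0.41, "iowait": 0.00,
}

print(busy_percent(rp0))            # 32.37
print(round(sum(rp0.values()), 2))  # 99.98, the fields add up to roughly 100
```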

The following are examples of the show platform software status control-processor command.

Router# show platform software status control-processor brief
Load Average
 Slot  Status  1-Min  5-Min 15-Min
  RP0 Healthy   0.25   0.30   0.44
  RP1 Healthy   0.31   0.19   0.12
 ESP0 Healthy   0.01   0.05   0.02
 ESP1 Healthy   0.03   0.05   0.01
 SIP1 Healthy   0.15   0.07   0.01
 SIP2 Healthy   0.03   0.03   0.00

Memory (kB)
 Slot  Status    Total     Used (Pct)     Free (Pct) Committed (Pct)
  RP0 Healthy  3722408  2514836 (60%)  1207572 (29%)   1891176 (45%)
  RP1 Healthy  3722408  2547488 (61%)  1174920 (28%)   1889976 (45%)
 ESP0 Healthy  2025468  1432088 (68%)   593380 (28%)   3136912 (149%)
 ESP1 Healthy  2025468  1377980 (65%)   647488 (30%)   3084412 (147%)
 SIP1 Healthy   480388   293084 (55%)   187304 (35%)    148532 (28%)
 SIP2 Healthy   480388   273992 (52%)   206396 (39%)     93188 (17%)

CPU Utilization
 Slot  CPU   User System   Nice   Idle    IRQ   SIRQ IOwait
  RP0    0  30.12   1.69   0.00  67.63   0.13   0.41   0.00
  RP1    0  21.98   1.13   0.00  76.54   0.04   0.12   0.16
 ESP0    0  13.37   4.77   0.00  81.58   0.07   0.19   0.00
 ESP1    0   5.76   3.56   0.00  90.58   0.03   0.05   0.00
 SIP1    0   3.79   0.13   0.00  96.04   0.00   0.02   0.00
 SIP2    0   3.50   0.12   0.00  96.34   0.00   0.02   0.00
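
The brief view lends itself to automated checks. The following Python sketch parses three Memory rows from the sample above and lists any slot whose status is not Healthy. The regular expression and the function names are assumptions based on this sample layout. The percentages are kept as printed rather than recomputed.

```python
import re

BRIEF_MEMORY = """\
Memory (kB)
 Slot  Status    Total     Used (Pct)     Free (Pct) Committed (Pct)
  RP0 Healthy  3722408  2514836 (60%)  1207572 (29%)   1891176 (45%)
 ESP0 Healthy  2025468  1432088 (68%)   593380 (28%)   3136912 (149%)
 SIP1 Healthy   480388   293084 (55%)   187304 (35%)    148532 (28%)
"""

# Slot, status, total, then three "value (pct%)" pairs.
ROW = re.compile(
    r"^\s*(\S+)\s+(Healthy|Warning|Critical)\s+(\d+)"
    r"\s+(\d+)\s+\((\d+)%\)\s+(\d+)\s+\((\d+)%\)\s+(\d+)\s+\((\d+)%\)\s*$"
)

def parse_memory_rows(text):
    """Return one dict per slot row of the Memory table."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            rows.append({
                "slot": m.group(1),
                "status": m.group(2),
                "total_kb": int(m.group(3)),
                "used_kb": int(m.group(4)),
                "used_pct": int(m.group(5)),
                "free_kb": int(m.group(6)),
                "free_pct": int(m.group(7)),
                "committed_kb": int(m.group(8)),
                "committed_pct": int(m.group(9)),
            })
    return rows

def needs_attention(rows):
    """Return the slots that report Warning or Critical."""
    return [r["slot"] for r in rows if r["status"] != "Healthy"]

rows = parse_memory_rows(BRIEF_MEMORY)
print(needs_attention(rows))  # empty list when every slot is Healthy
```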


Router# show platform software status control-processor
RP0: online, statistics updated 10 seconds ago
Load Average: healthy
  1-Min: 0.30, status: healthy, under 5.00
  5-Min: 0.31, status: healthy, under 5.00
  15-Min: 0.47, status: healthy, under 5.00
Memory (kb): healthy
  Total: 3722408
  Used: 2514776 (60%), status: healthy, under 90%
  Free: 1207632 (29%), status: healthy, over 10%
  Committed: 1891176 (45%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
  User: 30.12, System:  1.69, Nice:  0.00, Idle: 67.63
  IRQ:  0.13, SIRQ:  0.41, IOwait:  0.00

RP1: online, statistics updated 5 seconds ago
Load Average: healthy
  1-Min: 0.14, status: healthy, under 5.00
  5-Min: 0.11, status: healthy, under 5.00
  15-Min: 0.09, status: healthy, under 5.00
Memory (kb): healthy
  Total: 3722408
  Used: 2547488 (61%), status: healthy, under 90%
  Free: 1174920 (28%), status: healthy, over 10%
  Committed: 1889976 (45%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
  User: 21.98, System:  1.13, Nice:  0.00, Idle: 76.54
  IRQ:  0.04, SIRQ:  0.12, IOwait:  0.16

ESP0: online, statistics updated 5 seconds ago
Load Average: healthy
  1-Min: 0.06, status: healthy, under 5.00
  5-Min: 0.09, status: healthy, under 5.00
  15-Min: 0.03, status: healthy, under 5.00
Memory (kb): healthy
  Total: 2025468
  Used: 1432088 (68%), status: healthy, under 90%
  Free: 593380 (28%), status: healthy, over 10%
  Committed: 3136912 (149%), status: healthy, under 300%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
  User: 13.37, System:  4.77, Nice:  0.00, Idle: 81.58
  IRQ:  0.07, SIRQ:  0.19, IOwait:  0.00

ESP1: online, statistics updated 5 seconds ago
Load Average: healthy
  1-Min: 0.22, status: healthy, under 5.00
  5-Min: 0.08, status: healthy, under 5.00
  15-Min: 0.02, status: healthy, under 5.00
Memory (kb): healthy
  Total: 2025468
  Used: 1377980 (65%), status: healthy, under 90%
  Free: 647488 (30%), status: healthy, over 10%
  Committed: 3084412 (147%), status: healthy, under 300%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
  User:  5.76, System:  3.56, Nice:  0.00, Idle: 90.58
  IRQ:  0.03, SIRQ:  0.05, IOwait:  0.00

SIP1: online, statistics updated 6 seconds ago
Load Average: healthy
  1-Min: 0.05, status: healthy, under 5.00
  5-Min: 0.06, status: healthy, under 5.00
  15-Min: 0.00, status: healthy, under 5.00
Memory (kb): healthy
  Total: 480388
  Used: 293084 (55%), status: healthy, under 90%
  Free: 187304 (35%), status: healthy, over 10%
  Committed: 148532 (28%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
  User:  3.79, System:  0.13, Nice:  0.00, Idle: 96.04
  IRQ:  0.00, SIRQ:  0.02, IOwait:  0.00

SIP2: online, statistics updated 8 seconds ago
Load Average: healthy
  1-Min: 0.03, status: healthy, under 5.00
  5-Min: 0.03, status: healthy, under 5.00
  15-Min: 0.00, status: healthy, under 5.00
Memory (kb): healthy
  Total: 480388
  Used: 273992 (52%), status: healthy, under 90%
  Free: 206396 (39%), status: healthy, over 10%
  Committed: 93188 (17%), status: healthy, under 90%
Per-core Statistics
CPU0: CPU Utilization (percentage of time spent)
  User:  3.50, System:  0.12, Nice:  0.00, Idle: 96.34
  IRQ:  0.00, SIRQ:  0.02, IOwait:  0.00
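
The detailed view prints a limit next to each status, such as "under 5.00" or "over 10%". The Python sketch below computes how far each reported value is from that limit, using the RP0 section of the sample above. Treating the printed value as the limit for the status is an interpretation of the output rather than something this chapter states. The function name and the regular expressions are also assumptions.

```python
import re

DETAIL = """\
RP0: online, statistics updated 10 seconds ago
Load Average: healthy
  1-Min: 0.30, status: healthy, under 5.00
  5-Min: 0.31, status: healthy, under 5.00
  15-Min: 0.47, status: healthy, under 5.00
Memory (kb): healthy
  Total: 3722408
  Used: 2514776 (60%), status: healthy, under 90%
  Free: 1207632 (29%), status: healthy, over 10%
  Committed: 1891176 (45%), status: healthy, under 90%
"""

SLOT = re.compile(r"^(\S+): online")
LOAD = re.compile(r"^\s+(\d+-Min): ([\d.]+), status: (\w+), under ([\d.]+)")
MEM = re.compile(
    r"^\s+(Used|Free|Committed): \d+ \((\d+)%\), status: (\w+), (under|over) (\d+)%"
)

def headroom(text):
    """Return (slot, field, status, margin) tuples, where margin is the
    distance between the reported value and the limit on the same line."""
    slot = None
    results = []
    for line in text.splitlines():
        slot_match = SLOT.match(line)
        load_match = LOAD.match(line)
        mem_match = MEM.match(line)
        if slot_match:
            slot = slot_match.group(1)
        elif load_match:
            value, limit = float(load_match.group(2)), float(load_match.group(4))
            results.append((slot, load_match.group(1), load_match.group(3),
                            round(limit - value, 2)))
        elif mem_match:
            pct, limit = int(mem_match.group(2)), int(mem_match.group(5))
            # "under" limits are ceilings; "over" limits are floors.
            margin = limit - pct if mem_match.group(4) == "under" else pct - limit
            results.append((slot, mem_match.group(1), mem_match.group(3), margin))
    return results

for entry in headroom(DETAIL):
    print(entry)
```

Tracking these margins over time shows whether a component is drifting toward its limit while its status is still healthy.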

For More Information

For more information about the topics discussed in this chapter, see the following documents:

Topic                   Document
Command descriptions    Cisco IOS Master Command List, All Releases
                        Command Lookup Tool (requires Cisco.com user ID and password)