Achieving Network High Availability with Cisco Embedded Automation Technologies
Updated: Mar 09, 2010
As the Internet makes its way into every aspect of our work and lives, businesses around the world increasingly rely on their networks for mission-critical applications. Any network downtime translates into a significant financial impact on the organization. At the same time, networks are becoming more and more complex. Managing an ever-growing network while maintaining high availability at the lowest cost becomes a tremendous challenge for network administrators.
In this document, we discuss how customers can use the power of embedded automation technologies that come natively with their Cisco® network devices to build high availability solutions.
The embedded automation technologies, such as Embedded Event Manager (EEM), IP service-level agreement (IP SLA), Tool Command Language (Tcl), Embedded Syslog Manager (ESM), and Embedded Resource Manager (ERM), are a set of Cisco IOS® Software features that are unique to Cisco network devices. They provide Cisco's customers and partners a powerful and flexible platform on which they can build innovative solutions that are fully integrated with their network management system to increase the availability of their networks.
Using these technologies, users can program intelligent logic inside of network devices to automatically detect, diagnose, and recover from network failures. Compared with solutions offered by external network management stations, such solutions are more reliable and scalable. The event-driven model supported by EEM allows the system to respond instantaneously to network failures and take actions.
An example use case is in-service DOM (Digital Optical Monitoring) Threshold Monitoring on ME 3400 platforms. EEM scripts make use of show commands to periodically check key statuses, such as temperature and voltage, and to report errors when the thresholds are passed. Such an automated proactive approach lets customers conduct in-service monitoring, increase network uptime, and reduce their operating expenses (OpEx). Another example is a service provider who uses EEM to monitor interface errors so that when a threshold is exceeded the interface is gracefully shut down in an effort to reroute traffic. One of Cisco's large enterprise customers in the retail industry uses EEM to periodically verify IP connectivity between the company's retail stores all over the world and its data center backend servers. EEM is used to send out real-time alerts when connectivity is lost.
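The threshold check at the heart of the DOM monitoring use case can be sketched in a few lines. This is a minimal illustration in Python rather than an actual EEM script; the function name check_thresholds, the reading names, and the limit values are all hypothetical, and the readings are assumed to have already been parsed from show command output.

```python
from typing import Dict, List, Tuple

# Hypothetical (low, high) limits; real values depend on the optical module.
LIMITS: Dict[str, Tuple[float, float]] = {
    "temperature_c": (-5.0, 70.0),
    "voltage_v": (3.0, 3.6),
}


def check_thresholds(
    readings: Dict[str, float],
    limits: Dict[str, Tuple[float, float]] = LIMITS,
) -> List[str]:
    """Return one alert string per reading that falls outside its limits."""
    alerts = []
    for name, value in sorted(readings.items()):
        if name not in limits:
            continue  # no threshold defined for this reading
        low, high = limits[name]
        if value < low or value > high:
            alerts.append(f"{name}={value} outside [{low}, {high}]")
    return alerts


if __name__ == "__main__":
    # Assumed sample readings; a periodic job would refresh these each cycle.
    for alert in check_thresholds({"temperature_c": 75.0, "voltage_v": 3.3}):
        print("ALERT:", alert)
```

In a deployment along the lines described above, a check of this kind would run on a timer, and each alert would be reported, for example as a syslog message.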
Cisco IOS Service Diagnostics Scripts
Cisco has also developed, as part of the Service Diagnostics program, a set of EEM scripts that detect common network problems and automate best-practice troubleshooting processes. In Service Diagnostics 1.0, we introduced four sets of scripts for Border Gateway Protocol (BGP), Open Shortest Path First (OSPF), quality of service (QoS), and System Resource Monitoring. In Service Diagnostics 2.0, which was recently released, we introduced another set of scripts targeting automated diagnostics and recovery based on IEEE 802.1ag (Connectivity Fault Management, CFM). These scripts are officially signed and supported by Cisco and can be freely downloaded at
http://www.cisco.com/en/US/prod/iosswrel/ps6537/ps6555/ps9424/cisco_ios_service_diagnostics_scripts.html. For more details on the Cisco Service Diagnostics scripts, please refer to
Cisco Embedded Automation Systems
Figure 1. Cisco EASy
Cisco Embedded Automation Systems (EASy) brings together representatives from Cisco IOS technology groups, domain experts, customer support engineers, and field consulting engineers to provide solutions using Cisco IOS embedded management tools to enhance Cisco's products so they are:
• EASy to install
• EASy to manage
• EASy to service
EASy (Figure 1) uses Embedded Event Manager and related embedded automation technologies such as Tcl, IOS.sh, Embedded Menu Manager (EMM), and IP SLA, all of which are highly differentiated features that are widely available on various Cisco routing and switching devices.
A use case with accompanying scripts and tutorial was recently introduced by the EASy program under high availability. The following is a brief description of the use case.
A customer wishes to run a customized set of command-line interface (CLI) commands when a primary path is no longer available and then run a different set of customized CLI commands when it becomes available again.
Using the EASy Connectivity Verification using IP SLA and EEM solution, users can easily configure IP SLA and insert their own business logic without requiring any understanding of the configuration tasks involved for IP SLA, EEM, or Tcl scripting. This is accomplished through the easy-to-use, text-based Easy-Installer interface, which uses Tclsh.
Your submittal will be carefully reviewed by EASy team members and a group of subject matter experts. If the work required is feasible and can be generalized for consumption by a large customer base, an EASy solution will be assigned to someone, and work will proceed on a best-effort basis.
If you'd like to be a part of this dynamic effort or learn more, email
Case Study: Wireless Link State Monitoring and Failover
In this case study, we will show how the high availability solution built with Cisco IOS Embedded Event Manager helps a premier railroad franchise stay on track.
How do you improve high availability on rugged Cisco mobile access routers that are strapped to 6500-ton freight trains traveling at speeds between 50 and 75 mph across a hodgepodge of wireless and cellular networks, each providing varying and often unpredictable SLAs with pay-as-you-go usage plans, all while ensuring that only the most cost-effective path is used?
A large railroad company demonstrates how it has embraced Cisco innovation to solve this very problem while increasing IP network availability and reducing OpEx. See Figure 2.
Figure 2. Improving High Availability
In the example above, we have a rugged Cisco 3200 Mobile Access Router (MAR) that:
• Connects to either a Wi-Fi or digital cell network (or both based on geographic location)
• Subscribes to two data service plans where usage on the digital cell network is more expensive than on the Wi-Fi network
• Uses a primary path that relies on Dynamic Host Configuration Protocol (DHCP) to learn the Gateway of Last Resort (GOLR)
• Requires Network Address Translation (NAT) to hide private IP addresses
This particular customer had very strict requirements. First, the primary (Wi-Fi) path always had to be preferred over the backup (digital cell) path in order to minimize usage costs. However, once traffic had failed over to the backup path, the primary path could not be reused; only when the backup path became unreachable could the primary path be used again.
They also wanted the router to install a default route based on the DHCP binding learned on the primary interface. If the DHCP address of the primary interface ever changed, the router needed to reevaluate the default route and reconfigure it to reflect any new next hop. In addition, if the DHCP address was not learned within 5 minutes of bootup, the router should use the available backup path.
Finally, in the event the primary path failed, the router should autoconfigure itself so the NAT that is used across the primary path would be used across the backup path.
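The path-selection rules above amount to a small state machine, which can be sketched as follows. This is an illustrative Python model, not the customer's EEM scripts; the names initial_path and next_path are hypothetical. The behavior when both paths are down at the same time is not stated in the requirements, so the sketch assumes the router simply stays on the primary path in that case.

```python
PRIMARY = "primary"  # Wi-Fi path (preferred, cheaper)
BACKUP = "backup"    # digital cell path (more expensive)
DHCP_BOOT_TIMEOUT_S = 300  # 5 minutes, taken from the requirement above


def initial_path(dhcp_learned_after_s):
    """Choose the path at bootup.

    dhcp_learned_after_s is the number of seconds it took to learn a DHCP
    address on the primary interface, or None if none was learned.
    """
    if dhcp_learned_after_s is not None and dhcp_learned_after_s <= DHCP_BOOT_TIMEOUT_S:
        return PRIMARY
    return BACKUP


def next_path(active, primary_up, backup_up):
    """Return the path to use after the latest reachability check."""
    if active == PRIMARY:
        # Fail over only if the backup is actually reachable.
        # Assumption: with both paths down, stay on the primary.
        if not primary_up and backup_up:
            return BACKUP
        return PRIMARY
    # On the backup path: a recovered primary is NOT reused.
    # Only an unreachable backup allows a return to the primary.
    if not backup_up:
        return PRIMARY
    return BACKUP


if __name__ == "__main__":
    path = initial_path(30)
    # Each tuple is (primary_up, backup_up) for one check interval.
    for primary_up, backup_up in [(True, True), (False, True), (True, True), (True, False)]:
        path = next_path(path, primary_up, backup_up)
        print(primary_up, backup_up, "->", path)
```

The NAT requirement would follow the same state: whichever path next_path returns is the one across which translation needs to be configured.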
Solution: Improved High Availability Using Embedded Event Manager (EEM) and IP SLA Event Detection
Figure 3. Improved High Availability Using IP SLA Event Detection
Using a combination of IP SLA and a set of customizable EEM scripts, all the customer's high availability requirements were met (Figure 3).
The solution allowed this particular customer to:
• Dynamically configure and administratively cost static default routes based on DHCP
• Intelligently reroute traffic based on IP SLA using Internet Control Message Protocol (ICMP) echoes
• Automate the configuration of NAT based on path availability
• Prevent the primary link from being reused once its SLA thresholds have been exceeded
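As a rough illustration of the first item, the sketch below works out which static default route commands would be needed when the DHCP-learned next hop changes. It is written in Python for readability and is not the customer's script; the function name default_route_commands and the administrative distance of 10 are assumptions, and the addresses in the usage example are documentation addresses.

```python
from typing import List, Optional

DEFAULT_PREFIX = "0.0.0.0 0.0.0.0"
PRIMARY_DISTANCE = 10  # hypothetical administrative distance for the primary route


def default_route_commands(
    old_next_hop: Optional[str],
    new_next_hop: Optional[str],
    distance: int = PRIMARY_DISTANCE,
) -> List[str]:
    """Return the CLI commands needed to move the static default route.

    Nothing is returned when the next hop is unchanged, so the route is
    only touched when the DHCP binding actually changes.
    """
    if old_next_hop == new_next_hop:
        return []
    commands = []
    if old_next_hop is not None:
        commands.append(f"no ip route {DEFAULT_PREFIX} {old_next_hop}")
    if new_next_hop is not None:
        commands.append(f"ip route {DEFAULT_PREFIX} {new_next_hop} {distance}")
    return commands


if __name__ == "__main__":
    # The DHCP lease moved the gateway from 192.0.2.1 to 198.51.100.1.
    for command in default_route_commands("192.0.2.1", "198.51.100.1"):
        print(command)
```

Setting the distance on the primary route lower than that of the backup route is one way the primary path could be preferred while both are installed; the actual values used by the customer are not given in this document.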
The customer is using these scripts in production without any issues and plans on improving the solution further to include a tertiary link.
In summary, embedded automation technologies are a set of highly differentiating technologies in Cisco IOS Software. Collectively they provide powerful mechanisms for customers to build highly effective high availability solutions using Cisco networking devices. Such solutions offer both OpEx and capital expenditure (CapEx) benefits by making use of customers' existing investment in network infrastructure, increasing network availability, and reducing the complexity of routine network management tasks.