The goal of achieving closed-loop automation for network services is important in today’s networks, but the time has come to move the discussion to the next level, defining Service Level Agreements (SLAs) and metrics for closed-loop automation. The three relevant metrics for closed-loop automation are Mean Time To Detect (MTTD), Mean Time To Identify (MTTI), and Mean Time To Repair (MTTR). These metrics will define the effectiveness of a closed-loop automation solution.
Cisco Crosswork™ applications, which include Health Insights, Change Automation, and Network Services Orchestrator (NSO) and any Incident Management application, can help Service Providers achieve the desired SLAs for the key metrics for successful closed-loop automation. Let’s explore how to achieve these outcomes.
● Mean Time to Detect (MTTD)
The first step to fixing network service is to detect the issues affecting the network service in real time. This involves identifying the objects of the network service to be monitored and initiating monitoring of the objects as part of service activation.
Cisco Crosswork Health Insights: Cisco Crosswork Health Insights is a network health application that performs real-time Key Performance Indicator (KPI) monitoring, alerting, and troubleshooting. Cisco Crosswork Health Insights enables defining the right set of Key Performance Indicators (KPIs), a data construct that captures network device objects health metrics. KPIs allow monitoring objects related to the network service, allowing customization of KPI thresholds, monitoring the KPI’s and raising alerts if the thresholds are exceeded. These KPI’s can be enabled as part of service provisioning using Cisco Network Services Orchestrator (NSO), thus monitoring the health of the network service from the time it is commissioned.
● Mean Time to Identify (MTTI)
Finding which network services are affected and correlating it to the alerts is where network operators spend most of their time. MTTI can be significantly reduced if the application can directly correlate the alerts to the specific Network service instances.
Incident Management Applications: These AIOps applications use Artificial Intelligence and machine learning to automatically reduce alert volume, get proactive insights into the health of the infrastructure and network services, and collaborate quickly to resolve any incident that might arise. These applications also provide intelligent noise reductions. The alerts from Crosswork Health Insights sent to the AIOps are enriched to include the affected Network services. This correlation of the alerts to the network services allows MTTI in near-real time, further reducing the time required to fix the network services.
● Mean Time to Repair (MTTR)
Most of the time, if not all, the network operators know how to fix the network issues, once they have identified the network objects to be fixed. The time to fix the network becomes more efficient, if the user can invoke the remediation efficiently, this includes, minimizing input, previewing the fix, and most importantly rollback if the fix fails. This is possible using Crosswork Change Automation.
Cisco Crosswork Change Automation application automates the process of deploying changes to the network. Crosswork Change Automation does closed-loop automation using playbooks. Playbooks use Cisco NSO to configure the network to fix the issues related to the alerts. KPIs are mapped to playbooks, even before the KPI thresholds are exceeded. This helps to invoke the playbook easily if the threshold is exceeded. Invoking the playbook becomes efficient and fast, as it can derive the inputs to the playbook from the KPI alert, and playbooks leverage NSO preview and rollback functionality for remediation.
Figure 1 Shows the interactions of the Cisco Crosswork applications for closed-loop automation.
Let us look at an example. A layer 3 VPN is to be deployed across multiple routers, and if the network performance of the layer 3 VPN does not meet the requirements, the layer 3 VPN is to be rerouted. The following are defined to achieve automated closed-loop automation for the layer 3 VPN example.
● Cisco Crosswork Health Insights: To measure the network performance of the layer 3 VPN, two Internet Protocol Service Level Agreement (IP SLA) KPIs are defined, one for jitter and another for Round Trip Time (RTT). IP SLAs allows active monitoring and reporting of network traffic. The KPIs can be customized based on the requirements for:
◦ Cadence: How frequently the KPIs for IP SLA have to be collected.
◦ IP SLA Jitter: Monitors IP SLA UDP jitter, the interpacket delay variance.
◦ IP SLA Echo RTT: Monitors the packets round trip time.
● Cisco Network Service Orchestrator (NSO): A layer 3 VPN service model is modeled to configure it across the routers. This NSO service can also invoke Crosswork Health Insights using the REST APIs to enable the IP SLA KPIs.
● Cisco Crosswork Change Automation: A playbook is defined, which is invoked when the IP SLA thresholds are exceeded. The playbook will set the configuration such that the layer 3 VPN is rerouted. The playbook can be called directly from AIOps using REST APIs if the functionality is supported by the AIOps. Optionally the playbooks can be invoked from the KPI alert in Cisco Crosswork Health Insights.
Figure 2 shows the workflow of closed-loop automation for rerouting of the L3VPN if the IP SLA thresholds are exceeded.
Layer 3 VPN example for closed-loop automation
To learn more about Cisco Crosswork Automation, visit https://www.cisco.com/c/en/us/products/cloud-systems-management/crosswork-network-automation/index.html