Overview
Objective
Detect anomalies and generate alerts that can be used to notify an operator or trigger automation workflows.
Challenge
Discovering and repairing problems in the network usually involves manual network operator intervention and is time-consuming and error-prone.
Solution
Incorporating Change Automation and Health Insights into Crosswork Network Controller allows service providers to automate the discovery and remediation of network problems by enabling an operator to match an alarm to predefined remediation tasks. These tasks are performed after a defined Key Performance Indicator (KPI) threshold has been breached. Remediation can be carried out with or without the network operator's approval, depending on the operator's settings and preferences.
Such closed-loop remediation reduces the time to discover and repair a problem and minimizes the risk that high-stakes manual operator intervention introduces additional errors.
How does it work?
Smart monitoring
-
The Smart Monitoring feature helps operators collect, filter, and present data in usable formats, such as graphs and tables. Operators can concentrate on their business goals while Crosswork Network Controller, Change Automation, and Health Insights handle the configuration required for data collection using the zero-touch telemetry feature.
-
By utilizing a common collector to gather network device data over SNMP, CLI, and model-driven telemetry and making it available as modeled data described in YANG, we can prevent duplicate data collection. This optimizes the load on both the devices and the network.
-
The Recommendation Engine analyzes network device hardware and software configuration and employs a pre-trained model built from data mining. It produces KPI-relevant recommendations, facilitating per-use-case monitoring.
-
KPIs cover a wide range of statistics, from CPU, memory, disk, and layer 1/2/3 network counters to per-protocol, LPTS, and ASIC statistics.
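The idea of a common collector avoiding duplicate data collection can be sketched as follows. This is only an illustration, not the Crosswork implementation: the function `plan_collection`, the request tuple layout, and the sensor path strings are all hypothetical. The sketch merges the data requests of several KPIs so that each device and sensor path pair is collected once, at the fastest cadence any KPI asks for.

```python
def plan_collection(requests):
    """Merge per-KPI data requests so each (device, sensor_path) is collected once.

    `requests` is an iterable of (kpi_name, device, sensor_path, cadence_seconds).
    All names here are hypothetical and exist only for this illustration.
    Returns {(device, sensor_path): {"cadence": int, "kpis": [names]}}.
    """
    plan = {}
    for kpi, device, path, cadence in requests:
        key = (device, path)
        entry = plan.setdefault(key, {"cadence": cadence, "kpis": set()})
        # Collect at the shortest interval any interested KPI requires.
        entry["cadence"] = min(entry["cadence"], cadence)
        entry["kpis"].add(kpi)
    # Convert the KPI sets to sorted lists for stable, readable output.
    return {
        key: {"cadence": value["cadence"], "kpis": sorted(value["kpis"])}
        for key, value in plan.items()
    }


if __name__ == "__main__":
    example_requests = [
        ("cpu_threshold", "router-1", "example/cpu", 60),
        ("cpu_trend", "router-1", "example/cpu", 30),
        ("memory_usage", "router-1", "example/memory", 60),
    ]
    for key, entry in plan_collection(example_requests).items():
        print(key, entry)
```

In this example, two KPIs that both need CPU data from the same device result in a single collection job, which is the effect on device and network load that the text describes.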
Smart filtering
-
Health Insights builds dynamic detection and analytics modules that allow operators to monitor and see alerts on network events based on user-defined logic (KPIs).
-
KPI alerting logic can be:
-
Simple static thresholds (TCA). For example, CPU load above 90 percent.
-
Statistical, such as moving average, standard deviation, or percentile-based. For example, CPU load rising above the mean and staying there for five minutes.
-
Streaming jobs that provide real-time alerts or batch jobs that run periodically.
-
Customized for threshold values and visualization dashboards.
-
Customized operator-created KPIs based on business logic.
-
TCAs that can be exported or integrated with other systems via HTTP, Slack, and socket interfaces.
-
KPIs are associated with dashboards that provide real-time and historical views of the data and corresponding TCAs.
-
KPIs also provide purpose-built dashboards that go beyond raw data and present valuable information in various infographic-style charts and graphs that are useful for triaging complex issues and finding their root cause.
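The two kinds of alerting logic listed above, a static threshold and a moving-average condition that must persist, can be sketched as follows. This is a minimal illustration under stated assumptions, not how Health Insights evaluates KPIs. The function names, the `window` and `hold` parameters, and the assumption of one sample per minute are all hypothetical.

```python
from collections import deque
from statistics import mean


def static_threshold_alerts(samples, threshold=90.0):
    """Return the indexes of samples strictly above a fixed threshold (a simple TCA)."""
    return [i for i, value in enumerate(samples) if value > threshold]


def sustained_above_mean(samples, window=10, hold=5):
    """Return the indexes at which a value has stayed above the moving mean.

    A sample counts as elevated when it exceeds the mean of the previous
    `window` samples. An alert is raised once `hold` consecutive samples are
    elevated. With one sample per minute (an assumption), hold=5 corresponds
    to five minutes.
    """
    history = deque(maxlen=window)
    streak = 0
    alerts = []
    for i, value in enumerate(samples):
        # Only evaluate once a full window of history is available.
        if len(history) == window and value > mean(history):
            streak += 1
        else:
            streak = 0
        if streak >= hold:
            alerts.append(i)
        history.append(value)
    return alerts


if __name__ == "__main__":
    cpu = [50.0] * 10 + [80.0] * 5 + [50.0] * 3
    print(static_threshold_alerts([85, 91, 90, 95]))
    print(sustained_above_mean(cpu))
```

Note that the elevated samples themselves enter the moving window, so the mean rises during a long excursion. Whether that is desirable depends on the KPI; a fixed baseline is an equally valid design choice.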
Smart remediation
-
Health Insights KPIs can be associated with Change Automation playbooks, which can be executed manually or via auto-remediation. A remediation workflow can be used to fix the issue or to collect more data from the network devices. Operators can save time and money by proactively remediating the situation instead of resorting to ad hoc debugging and unscheduled downtime, providing a better quality of experience (QoE) to their customers.
-
Health Insights correlates alerts or anomalies on the network's topology, allowing easy visualization of the impact of events.
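The association between KPIs and playbooks, with or without operator approval, can be sketched as a small dispatch routine. This is an illustrative sketch only and does not use any Crosswork API: the `Remediation` class, `handle_alert`, the callback signatures, and the KPI and playbook names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Remediation:
    playbook: str  # hypothetical playbook name
    auto: bool     # True: run without waiting for operator approval


def handle_alert(kpi, device, bindings, run_playbook, approve=None):
    """Decide what to do when a KPI alert fires on a device.

    `bindings` maps KPI names to Remediation entries. `run_playbook` and
    `approve` are caller-supplied callbacks (assumptions for this sketch).
    Returns "no-playbook", "awaiting-approval", or "executed".
    """
    binding = bindings.get(kpi)
    if binding is None:
        return "no-playbook"
    if not binding.auto:
        # Manual mode: run only if an operator explicitly approves.
        if approve is None or not approve(kpi, device, binding.playbook):
            return "awaiting-approval"
    run_playbook(binding.playbook, device)
    return "executed"


if __name__ == "__main__":
    bindings = {
        "cpu_threshold": Remediation("collect_cpu_diagnostics", auto=True),
        "interface_flap": Remediation("shut_interface", auto=False),
    }
    executed = []
    run = lambda playbook, device: executed.append((playbook, device))
    print(handle_alert("cpu_threshold", "router-1", bindings, run))
    print(handle_alert("interface_flap", "router-1", bindings, run))
    print(executed)
```

Keeping the approval step as a separate callback means the same binding table can serve both fully automatic and operator-gated remediation.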