Getting Started


Before you begin

We recommend that you familiarize yourself with the following concepts and complete any planning and information-gathering steps:

  • Crosswork Network Controller monitors services at two levels - Basic and Advanced.

    • Basic Monitoring: This type of monitoring offers the option of monitoring a higher number of services and provides limited sub-service metrics, resulting in lower resource consumption. Additionally, the graphic map renderings are smaller compared to more detailed monitoring.

    • Advanced Monitoring: This monitoring approach is supported for fewer services, as it monitors a larger number of component sub-services and consumes more compute resources. Additionally, advanced monitoring results in an increased number of sub-service metrics and larger graphic map renderings.

    For more information, see Service Health scale information.

  • Crosswork Network Controller's Service Health supports single VM deployment and monitors devices at two levels - Basic and Advanced. The monitoring level details, shown above, also apply to Service Health single VM deployment.

    For more information, see Service Health single VM scale information.

  • For L2VPN services, Crosswork Network Controller monitors the overall health based on the subservices, while for L3VPN services, the monitoring occurs at the node level.

  • Crosswork Network Controller has implemented a rate-limiting process to manage service monitoring requests efficiently. This means that there may be a delay in publishing service monitoring requests if the number of requests raised per minute exceeds a specific threshold. The thresholds are defined as follows:

    • L2VPN services:

      • 50 Basic Monitoring requests per minute per service

      • 5 Advanced Monitoring requests per minute per service

    • L3VPN services:

      • 500 Basic Monitoring requests per minute per vpn-node

      • 100 Advanced Monitoring requests per minute per vpn-node

    The rate-limiting process also extends to the monitoring data. For example, during a restore process, when all Data Gateways send data to the Tracker component, the rate at which the Tracker processes this data and forwards it to Assurance Graph Manager is regulated. This may delay the reporting of Events of Significance (EOS) following the restore.

    An event is triggered with a severity level of warning and a corresponding description to notify you of the delay. The event is cleared once Crosswork Network Controller resumes normal publishing of monitoring requests.

  • Crosswork Network Controller can store up to 50 GB of monitoring data. When storage usage reaches 70% of this capacity, it raises an alarm to alert you of potential storage depletion. If more storage is needed, you can configure external storage in the cloud using an Amazon Web Services (AWS) account. See Configure additional external storage.

  • Crosswork Network Controller uses a set of rules, expressed in low-code format and saved in packages called Heuristic Packages, to monitor the health of the services.

    • A Heuristic Package contains what to monitor, how to compute the monitored metrics, and symptoms associated with service health degradation. The overall health of the service is determined by applying the rules from the Heuristic Package.

    • The default Heuristic Packages provided with Crosswork Network Controller are referred to as system packages and cannot be altered. Crosswork Network Controller uses the predefined rules in these system packages to deploy various testing probes, including Y.1731, TWAMP, SR-PM, and Provider Connectivity Assurance (formerly Accedian Skylight), to evaluate service health and determine whether the service complies with the Service Level Agreement (SLA) (applicable only to Provider Connectivity Assurance probes).

      If the default system packages do not fully meet your needs, you have the flexibility to customize them to better suit your specific requirements. You can create a custom heuristic package by exporting an existing package, modifying it, and then importing it back. See About Heuristic Packages.

  • Extended CLI support using Crosswork Network Controller's system device packages allows for more comprehensive service monitoring capabilities. These packages are capable of deriving exact sensor paths for metric health calculation, and can be installed as a bundle. To add or extend CLI-based KPI collections, you will need support from Cisco Professional Services. Engage with your Cisco account team for more details regarding this.
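The rate-limiting thresholds listed above can be captured in a small lookup table for capacity planning. The following is a minimal sketch, not part of the product: the function name `exceeds_threshold` and the dictionary layout are assumptions made for illustration, and only the numeric limits come from this section.

```python
# Per-minute thresholds quoted in the list above.
# L2VPN limits apply per service; L3VPN limits apply per vpn-node.
THRESHOLDS = {
    ("L2VPN", "basic"): 50,
    ("L2VPN", "advanced"): 5,
    ("L3VPN", "basic"): 500,
    ("L3VPN", "advanced"): 100,
}


def exceeds_threshold(service_type: str, level: str, requests_per_minute: int) -> bool:
    """Return True if the request rate is above the documented threshold.

    Hypothetical helper: a True result only indicates that publishing of
    monitoring requests may be delayed, as described in the text.
    """
    limit = THRESHOLDS[(service_type, level)]
    return requests_per_minute > limit
```

For example, `exceeds_threshold("L2VPN", "advanced", 6)` evaluates to `True`, which suggests that a delay and the associated warning event are possible at that rate.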

Getting started

Service Health is available as part of the Crosswork Network Controller Advantage Package (refer to the Get Started chapter in the Crosswork Network Controller 7.1 Installation guide).

You need a functional Crosswork Network Controller environment with devices onboarded and services provisioned before you can start monitoring services. Because those tasks are beyond the scope of this document, the following table links to the documents and processes that cover them.


Note


To set up and start monitoring services, follow Steps 1 through 6 in the table below. Steps 7 to 9 are optional and cover advanced use cases.


Workflow

Action

1. Install Crosswork Network Controller Advantage package.

See the Crosswork Network Controller 7.1 Installation guide.

2. Do the basic reachability checks from the Crosswork Network Controller UI.

See Setup Workflow in the Crosswork Network Controller 7.1 Administration guide.

3. Create and provision the required L2VPN and L3VPN services.

You can create and provision services using either the Crosswork Network Controller UI or APIs.

4. Determine if you would like to configure additional external storage.

Note

 
You can configure external storage at any time.

If you anticipate monitoring the health of many services, Cisco recommends configuring external storage after you install Service Health and before you begin monitoring the services.

See Workflow: Manage stored data.

5. Enable health monitoring for the provisioned services.

Start monitoring VPN services.

See Start Service Health monitoring.

6. Establish your operational processes for responding to degraded services.

Deep dive into the health of the impacted services and subservices, and drill down to the root cause of the service degradation.

See Workflow: Analyze the cause of service degradation.

7. (Optional) Use SR-PM to probe and monitor links and TE policies in the network.

Use SR-PM to measure performance metrics of TE policies and links.

See Workflow: Use SR-PM to monitor links and TE policies.

8. (Optional) Use Provider Connectivity Assurance to probe Service Health.

Using external probes from Provider Connectivity Assurance can provide additional insights into the health of the service.

Note

 

Provider Connectivity Assurance integration is available as a limited-availability feature in this release. Engage with your account team for more information.

See Workflow: Monitor Service Health using Cisco Provider Connectivity Assurance (formerly Accedian Skylight).

9. (Optional) Customize and import Heuristic Packages.

Service Health offers a default set of Heuristic Packages for monitoring. If these packages do not fully meet your needs, you can customize them to align with your specific requirements.

See Workflow: Customize Heuristic Packages.

Service Health monitoring workflows

This section outlines the procedures for different scenarios and functionalities identified in the Getting started section.

Workflow: Manage stored data

Crosswork Network Controller provides 50 GB of storage for monitoring data. If that limit is reached, the least recently used monitoring data is deleted first.

When storage usage exceeds 70% of capacity, Crosswork Network Controller generates an alarm prompting you to configure external storage in order to save older monitoring data. The actions detailed in this section describe how to monitor storage usage, reduce the amount of data being stored, and add external storage.
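The storage behavior described above (an alarm at 70% usage, and deletion of the least recently used data once the 50 GB limit is reached) can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `MonitoringStore`, its methods, and the per-service granularity are hypothetical and do not reflect how Crosswork Network Controller is implemented. The alarm check uses "greater than or equal to", which is also an assumption.

```python
from collections import OrderedDict
from typing import List

CAPACITY_GB = 50.0      # built-in storage for monitoring data (from the text)
ALARM_FRACTION = 0.70   # usage level at which an alarm is raised (from the text)


class MonitoringStore:
    """Toy model that keeps per-service data sizes in least-recently-used order."""

    def __init__(self) -> None:
        # Oldest (least recently used) entries sit at the front.
        self._entries: "OrderedDict[str, float]" = OrderedDict()

    @property
    def used_gb(self) -> float:
        return sum(self._entries.values())

    @property
    def alarm_raised(self) -> bool:
        return self.used_gb >= CAPACITY_GB * ALARM_FRACTION

    def touch(self, key: str) -> None:
        """Mark an entry as recently used."""
        self._entries.move_to_end(key)

    def write(self, key: str, size_gb: float) -> List[str]:
        """Add data for a key and return the keys evicted to stay within capacity."""
        self._entries[key] = self._entries.get(key, 0.0) + size_gb
        self._entries.move_to_end(key)
        evicted = []
        # Evict from the least recently used end, never the entry just written.
        while self.used_gb > CAPACITY_GB and len(self._entries) > 1:
            old_key, _ = self._entries.popitem(last=False)
            evicted.append(old_key)
        return evicted
```

In this model, stopping monitoring for a service or switching it to Basic Monitoring corresponds to writing less data, which keeps usage below the alarm level for longer.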

Table 1. Workflow: Manage stored data

Action

See

1. Reduce the number of services being monitored by stopping the monitoring for a few services. Review the monitoring data that is already stored on your system and delete any data that you no longer need to free up storage space.

Stop Service Health monitoring

2. Switch services that are using Advanced Monitoring to Basic Monitoring to monitor the services in less detail.

Edit existing monitoring settings

3. If you still need additional storage, configure additional external storage on AWS Cloud.

Configure additional external storage

Workflow: Analyze the cause of service degradation

This operational workflow is iterative. Deep dive into the health of the impacted services and sub-services, and drill down to the root cause of the service degradation in any of the following ways:

Table 2. Analyze the cause of service degradation

Action

See

1. View monitored services and identify degraded services.

View monitored services

2. Identify cause of the service degradation.

3. Confirm if the reported degradation is a valid issue. In case it is not a valid issue, you may need to adjust the monitoring level (from Basic Monitoring to Advanced Monitoring or vice versa) to ensure accurate reporting of a service's health.

Alternatively, you can modify the system heuristic package to create a custom Heuristic Package to resolve the issue of false positive flagging of a service's health.

If the reported issue is valid, proceed to the next step in this workflow.

4. Analyze whether the service degradation is caused by an issue with device health.

Workflow: Use SR-PM to monitor links and TE policies

To measure the performance metrics of links and TE policies, Crosswork Network Controller can leverage Segment Routing Performance Measurement (SR-PM). When this feature is enabled, Crosswork Network Controller gathers and processes additional metrics such as Delay, Delay Variance, and Liveness to compute the health and determine whether any of the metrics has crossed its threshold.
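The threshold comparison described above can be sketched as follows. This is an illustrative example only: the type names, field names, units (microseconds), and threshold values are hypothetical, and the actual thresholds are defined in the Heuristic Packages rather than in code like this.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SrPmSample:
    delay_us: float            # measured delay (unit is an assumption)
    delay_variance_us: float   # measured delay variance (unit is an assumption)
    alive: bool                # liveness result


@dataclass
class Thresholds:
    max_delay_us: float
    max_delay_variance_us: float


def violations(sample: SrPmSample, limits: Thresholds) -> List[str]:
    """Return the names of the metrics that are outside their thresholds."""
    found = []
    if not sample.alive:
        found.append("liveness")
    if sample.delay_us > limits.max_delay_us:
        found.append("delay")
    if sample.delay_variance_us > limits.max_delay_variance_us:
        found.append("delay_variance")
    return found
```

An empty result means that, in this simplified model, the link or TE policy would be reported as healthy for the sampled metrics.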

The following workflow describes how to enable SR-PM collection and view performance metrics collected using SR-PM.

Table 3. Workflow to view performance metrics collected using SR-PM

Action

See

1. Enable SR-PM metrics collection in Crosswork Network Controller.

Enable SR-PM metrics collection

2. View the performance metrics of the links and TE policies.

3. Ensure that the health of the policy or link is reported accurately without any false issues. If false reporting of degradation is observed, you can create a custom Heuristic Package by modifying the system heuristic package to provide customized and accurate health reporting.

About Heuristic Packages

Workflow: Monitor Service Health using Cisco Provider Connectivity Assurance (formerly Accedian Skylight)

Crosswork Network Controller can use external probing from Cisco Provider Connectivity Assurance (formerly Accedian Skylight) to measure performance metrics of the L3VPN services. These metrics are then compared with the contracted SLA (defined in the Heuristic Package), and the results are accessible on the UI for further analysis.

Note


Monitoring L3VPN services using Provider Connectivity Assurance is supported only with Advanced monitoring and requires a Provider Connectivity Assurance Essentials license. See Provider Connectivity Assurance Licensing Tiers for more information.


To add Provider Connectivity Assurance as a provider in Crosswork Network Controller, follow steps 1 and 2 in the table. Follow steps 3 through 6 iteratively for operational purposes.

Table 4. Probe and monitor Service Health using Cisco Provider Connectivity Assurance

Action

See

1. Install the Provider Connectivity Assurance Solution.

Refer to the Provider Connectivity Assurance Solution documentation and the Provider Connectivity Assurance installation guide for information on installing the Provider Connectivity Assurance solution and deploying it with Crosswork Network Controller.

Note

 

Sign up and create an account with the self sign-up tool to access the Provider Connectivity Assurance documentation.

2. Add Provider Connectivity Assurance as a provider in Crosswork Network Controller.

Add Provider Connectivity Assurance as a provider

3. Set up Probe sessions for the L3VPN service.

Monitor Service Health using Cisco Provider Connectivity Assurance (formerly Accedian Skylight)

4. View the metrics in the Crosswork Network Controller UI.

View probe session details

5. Analyze the cause of the service degradation.

Identify active symptoms and root causes of a degraded service

6. Confirm if the reported degradation is a valid issue. In case it is not a valid issue, you can modify the system Heuristic Package to create a custom Heuristic Package for a customized report of a service's health.

Workflow: Customize Heuristic Packages

Workflow: Customize Heuristic Packages

Crosswork Network Controller uses Heuristic Packages as the core logic to monitor and report the health of services. Heuristic Packages define a list of rules, configuration profiles, supported sub-services and associated metrics for every service type. Heuristic Packages provided by the system are read-only and cannot be modified.

If you find that the Heuristic Packages provided by the system do not meet your monitoring requirements, in terms of monitoring metrics or monitoring thresholds, you can create a customized Heuristic Package that caters to your specific monitoring requirements using the procedures in this workflow.


Note


Customizing Heuristic Packages is not included in the standard Day 2 support responsibilities. For assistance, please reach out to the Cisco account team or contact Cisco Professional Services.


Table 5. Customize Heuristic Packages

Action

See

1. Analyze your network services.

Check the system Heuristic Packages for rules, sub-services, and metrics to confirm that the system packages do not already provide the metrics, services, or thresholds that you require.

Determine the package that most closely matches the conditions you wish to identify in your network.

2. Export the package or packages that include the functions you wish to leverage.

About Heuristic Packages

3. Using the supplied packages as your template, build a new package that gathers the data you need to make determinations about the health of the service you want to monitor. In the simplest use case, you may only need to edit the threshold points based on the SLAs used in your network. In more complicated use cases, you might need to build a Heuristic Package from scratch.

Build a custom Heuristic Package

4. Import the customized Heuristic Package in Crosswork Network Controller.

Import custom Heuristic Packages

5. Apply the custom package to all the services that should be using it.

6. Verify that the custom package is providing the monitoring data that you need to meet your requirements.

View monitored services

Service Health monitoring scale information

You can monitor a maximum of 52,000 services in total. All 52,000 services can use Basic Monitoring, or you can combine Basic and Advanced Monitoring, provided that the total does not exceed 52,000 services and no more than 2,000 of them use Advanced Monitoring.

Type of monitoring     Supports
Basic Monitoring       52,000 services
Advanced Monitoring    2,000 services


Note


For large L3VPN deployments, we support either Basic or Advanced monitoring for up to three large VPNs, with a maximum of 4,000 VPN nodes per deployment.
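The scale limits above can be turned into a quick check when planning how many services to monitor at each level. The following is a minimal sketch: the function name `plan_fits` is hypothetical, and the check covers only the service-count limits, not the large L3VPN limits in the note.

```python
MAX_TOTAL_SERVICES = 52_000     # Basic plus Advanced (from the text)
MAX_ADVANCED_SERVICES = 2_000   # Advanced Monitoring only (from the text)


def plan_fits(basic: int, advanced: int) -> bool:
    """Return True if the planned service counts stay within the documented limits."""
    if basic < 0 or advanced < 0:
        raise ValueError("service counts must be non-negative")
    return (
        advanced <= MAX_ADVANCED_SERVICES
        and basic + advanced <= MAX_TOTAL_SERVICES
    )
```

For instance, 50,000 Basic and 2,000 Advanced services fit within the limits, while adding one more service at either level does not.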


Service Health Monitoring single VM scale information

You can monitor a maximum of 2,200 services using Basic Monitoring and Advanced Monitoring, with 200 of those services using Advanced Monitoring. In addition, a single L3VPN (more than 200 nodes) service and 200 probe sessions for end-to-end monitoring are supported.

For more information on Service Health Monitoring single VM support, see the Crosswork Network Controller 7.1 Administration guide.

Type of monitoring                          Supports
Basic Monitoring                            2,000 services
Advanced Monitoring                         200 services
L3VPN (up to 200 nodes)                     1 service
Probe sessions for end-to-end monitoring    200 sessions
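For a single VM deployment, the same kind of planning check can report which limits a plan would exceed. This sketch is illustrative only: the key names and the function `exceeded_limits` are hypothetical, the numbers come from the table above, and the L3VPN row is left out because it depends on node count rather than a simple service count.

```python
# Limits for a single VM deployment, taken from the table above.
SINGLE_VM_LIMITS = {
    "basic_services": 2_000,
    "advanced_services": 200,
    "probe_sessions": 200,
}


def exceeded_limits(planned: dict) -> list:
    """Return the sorted names of the limits that the planned counts exceed."""
    return sorted(
        name
        for name, limit in SINGLE_VM_LIMITS.items()
        if planned.get(name, 0) > limit
    )
```

An empty list indicates that the planned counts are within the single VM limits modeled here.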