Handling Zombies
The Telemetry – Traffic Collector (TM-TC) service is implemented using nano-services and the Reactive FASTMAP design pattern.
There are two nano-plans in TM-TC:
- External user-facing plan: This plan provides an interface for tracking the configuration status of each node.
- Internal hidden plan: This plan applies the TM-TC service configuration to a node. The internal service is created for each device by a stacked service.
Zombies are the internal operational data model that NSO uses to store deleted service data. They are helpful when performing staged deletions and with Reactive FASTMAP (RFM), which is the NSO version of eventual consistency. When a service deletion is triggered, NSO maintains references to the deleted services (zombies) in operational data. The zombies are deleted from the configuration database (CDB) when all the configurations for the service are removed from the devices. Zombies inform the data interface of the progress of a service deletion. They also indicate the stage that the deletion is waiting on, which helps to pinpoint the problematic area. For more information, see the NSO documentation on Cisco DevNet.
On Cisco Crosswork Change Automation and Health Insights, when you trigger a deletion to clean up the configuration on a device (DLM ADMIN_DOWN / UNMANAGED / DELETION), deleting the configuration all at once may, depending on the connectivity of the device, lock the database until the last configuration is removed. Once the configuration is successfully removed from the device, the TM-TC service updates the nano-plan state to communicate the deletion progress to the data interface. After the deletion process is completed, the TM-TC service removes the nano-plan, zombies, and all the service-related operational data from the CDB.
In the following scenarios, the zombies may not be deleted even after the device configuration is deleted, and manual intervention may be required to delete the configuration references from the devices:
- Device is not reachable during deletion.
- Device is reachable, but the configuration removal fails on the device for other reasons.
In such cases, run the cleanup action on the device/service. The terms device and service are interchangeable in this context because Cisco Crosswork Change Automation and Health Insights creates services per device.
If a device/service goes into the zombie state, the user should delete the existing plan to enable any new telemetry collection on the device. If the data interface (Crosswork) or a CLI/NETCONF user tries to recreate the service instance before the zombie/delete is fully processed, the following error is displayed, indicating that the deletion process is still in progress.
Aborted: Operation failed because: Service still in zombie state: 'YYY'
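A client that recreates service instances programmatically may want to recognize this error and treat it as "deletion still in progress" rather than as a generic failure. The following is a minimal Python sketch under stated assumptions: it matches only the error text quoted above, and the function name `zombie_service_from_error` is hypothetical and not part of NSO, Crosswork, or the TM-TC Function Pack.

```python
import re
from typing import Optional

# Matches the error text quoted above. The greedy ".*" runs to the last single
# quote, so a service path that itself contains quotes is captured whole.
# Assumption: the message wording is exactly as shown in this section.
ZOMBIE_ERROR = re.compile(r"Service still in zombie state: '(?P<service>.*)'")


def zombie_service_from_error(message: str) -> Optional[str]:
    """Return the service named in a zombie-state error, or None if the
    message is some other error (hypothetical helper for illustration)."""
    match = ZOMBIE_ERROR.search(message)
    return match.group("service") if match else None


if __name__ == "__main__":
    error = ("Aborted: Operation failed because: "
             "Service still in zombie state: 'YYY'")
    service = zombie_service_from_error(error)
    if service is not None:
        # Deletion is still in progress: wait, or run the cleanup action,
        # before trying to recreate the service.
        print(f"Service {service} is still in zombie state")
```

A caller could use a non-`None` result as the signal to wait for the deletion to finish, or to run the cleanup action, before retrying the create request.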
Note: The TM-TC Function Pack does not support the zombie resurrect and redeploy options.
The below image shows how to check if a service is in zombie state on NSO.
The below image shows the message displayed when you try to create a new configuration on a service that is in zombie state (viewed in the page).
The below image shows the NSO cleanup command to remove the plan in zombie state.