Disaster Recovery

This chapter describes the disaster recovery process and the health check feature.

Overview

There are two partitions in NCS 1010: RP SSD (CPU partition) and chassis SSD (Disaster Recovery partition). The Disaster Recovery partition contains all the backup configurations such as ISO images, RPMs, and system configuration files. When the node is corrupted, the Disaster Recovery feature allows the CPU to be replaced with the existing configuration. After replacing the CPU, the node reboots and comes up by restoring the software and configuration files from the chassis SSD without traffic loss.

CPU Replacement Considerations

You must consider the following points for CPU replacement.

  • When the CPU is removed from the chassis, NCS 1010 chassis runs in headless mode which is non-traffic impacting.

  • When the CPU is replaced with another CPU having the same software and RPMs as in the chassis SSD, the configuration is restored from the chassis SSD.

  • When the CPU is replaced with another CPU having different software and RPMs as in the chassis SSD, the Disaster recovery process starts. In this case, the node boots with the software from the chassis SSD and the configuration is also restored from the chassis SSD.

Health Check of Backup ISO Image

The Health Check feature ensures error-free booting of NCS 1010 chassis during disaster recovery operations. NCS 1010 has a partition for disaster recovery where the backup ISO image is stored. The backup ISO image is stored in the chassis SSD.

The chassis SSD content is audited against the running software by the install process in the background every 12 hours to detect corruption. If the ISO image is corrupted, the software will recover it by copying from the backup location. If the software fails to synchronize with the chassis SSD, then the Disaster Recovery ISO Image Corruption alarm is raised. See the Troubleshooting Guide for Cisco NCS 1010 to clear the alarm.

Automated File Management System

Table 1. Feature History

Feature Name

Release Information

Feature Description

Automated File Management System

Cisco IOS XR Release 24.4.1

The new Automated File Management System is designed for efficient file handling on each node. This system automatically archives older files and removes them from local nodes to free up valuable SSD space. It manages the following types of files:

  • System-generated log files

  • Showtech-related residual files

The automated file management system archives older files to free up valuable SSD space by deleting them from the local nodes.

Types of files

The SSD stores two types of files that are generated by

  • User: creates and owns files for requirement purposes and deletes the files when are no longer needed.

  • System: organizes files automatically based on the file content and the application that created the content such as .log and showtech-related residual files.

    Automated file management is applicable for the system-generated files.

How the automated file management system works

These stages describe how the automated file management works for various files.

Log files

The NCS 1014 system uses the log rotation configurations to manage log files as required.

  1. The system checks for the .log files exceeding 10 MB file size.


    Note


    This threshold is applicable for /tmp folder files only. For files in other folders, the system uses a different threshold and follows the same process.


  2. After locating the file, the system

    1. archives that file with .gz extension, and

    2. creates a new .log file with the same name.

  3. The system then maintains one active log file and two archived files.

  4. When new files are archived, the system deletes the oldest archived file to make sure that only the two most recently archived files and the current log file are retained.​

showtech-related residual log files

  1. The system generates the residual files during the collection of logs when Showtech logs is in operation.

  2. After collecting the logs, the system automatically removes these residual files to conserve space.​