Alarm Troubleshooting

For information about alarms and clearing procedures, see the Alarm Troubleshooting chapter in the following guides:

This chapter provides a description, severity, and troubleshooting procedure for each Cisco Optical Network Controller alarm and condition. To clear an alarm when it is raised, refer to its clearing procedure.

BACKUP-FAILURE

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: Controller system-Database Backup

The BACKUP-FAILURE alarm is raised when Cisco Optical Network Controller system database backup file creation fails.

Clear the BACKUP-FAILURE Alarm

To clear this alarm:

Procedure


Step 1

Wait until the next successful on-demand or scheduled backup. The alarm clears when a backup succeeds.

Step 2

If backups keep failing, access the VM using SSH and check the health of the PostgreSQL database pod by running kubectl get pods -n nxf-system | grep postgres in the CLI.
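The pod check in Step 2 can be narrowed to show only pods that look unhealthy. The following is a minimal sketch, not a Cisco-provided tool: the function name unhealthy_postgres_pods is hypothetical, and it assumes the default kubectl get pods column layout (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: read `kubectl get pods` output on stdin and print
# every postgres pod that is not Running or not fully ready.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
unhealthy_postgres_pods() {
  awk '/postgres/ {
    split($2, ready, "/")              # READY looks like "1/1"
    if ($3 != "Running" || ready[1] != ready[2])
      print $1, $2, $3                 # name, ready count, status
  }'
}

# Usage on the VM (prints nothing when all postgres pods are healthy):
#   kubectl get pods -n nxf-system | unhealthy_postgres_pods
```

An empty result only means that the pods report Running and ready; it does not by itself confirm that backups will succeed.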

If the alarm does not clear, log into the Technical Support Website at https://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


NODE-BACKUP-FAILURE

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: Node-Database Backup

The NODE-BACKUP-FAILURE alarm is raised when node backup file creation fails in Cisco Optical Site Manager.

Clear the NODE-BACKUP-FAILURE Alarm

To clear this alarm:

Procedure


Step 1

Wait until the next successful on-demand or scheduled backup. The alarm clears when a node backup succeeds.

Step 2

If backups keep failing, troubleshoot backup creation in Cisco Optical Site Manager.

If the alarm does not clear, log into the Technical Support Website at https://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


NODE-DISCONNECT

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: NODE: {Node-name}

The NODE-DISCONNECT alarm is raised when Cisco Optical Network Controller is unable to connect to a node.

Clear the NODE-DISCONNECT Alarm

To clear this alarm:

Procedure


Step 1

Check the connectivity to the node that got disconnected and fix any network issues.

Step 2

If the node went down, bring it back up.

Step 3

Check the node configuration in the Nodes app and ensure that the username and password are correct.

If the alarm does not clear, log into the Technical Support Website at https://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


UPLOAD-FAILURE

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: Controller system-Database Backup

The UPLOAD-FAILURE alarm is raised when the upload of the Cisco Optical Network Controller system database backup file to the SFTP server fails.

Clear the UPLOAD-FAILURE Alarm

To clear this alarm:

Procedure


Step 1

Wait until the next successful upload. The alarm clears when an upload succeeds.

Step 2

If the uploads keep failing, check the network connectivity to the SFTP server and fix any connectivity issues.

If the alarm does not clear, log into the Technical Support Website at https://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


DISK-THRESHOLD

Default Severity: Critical (CR), Non-Service-Affecting (NSA)

Logical Object: DISK: mount:/data

The DISK-THRESHOLD alarm is raised when Cisco Optical Network Controller disk usage exceeds the threshold, that is, when free space is less than 30%.
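To see how close the /data mount is to the threshold, the free-space percentage can be estimated from df output. This is a rough sketch under stated assumptions: the function names free_space_pct and below_free_threshold are hypothetical, the percentage is taken as Available divided by total blocks in POSIX df -P output, and the controller may compute its own value differently.

```shell
# Hypothetical helper: read `df -P` output on stdin and print the
# free-space percentage (Available / total blocks, rounded down).
free_space_pct() {
  awk 'NR == 2 { printf "%d\n", ($4 * 100) / $2 }'
}

# Hypothetical helper: succeed (exit 0) when free space is below the
# given percentage. Defaults to the 30% figure quoted for this alarm.
below_free_threshold() {
  pct=$(free_space_pct)
  [ "$pct" -lt "${1:-30}" ]
}

# Usage on the VM:
#   df -P /data | free_space_pct
#   df -P /data | below_free_threshold && echo "free space is below 30%"
```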


Note


This alarm was introduced in Cisco Optical Network Controller Release 25.1.2.


Clear the DISK-THRESHOLD Alarm

To clear this alarm:

Procedure


Step 1

Pause PM Jobs from the CONC UI

  1. Go to the PM History app.

  2. In the Summary tab, select the desired job and click the Edit button.

    Note

     

    Only one job can be edited at a time.

    A popup window appears.
  3. Use the toggle button at the top-left corner to disable the job.

    This action pauses the job and prevents further PM data collection.

Step 2

Remove user-created files from the /data directory. Delete any file or directory under /data that is not one of the expected folders.

Only the following folders are expected in the /data directory.

drwx------  2 root        root            16384 Jun 30 12:35 lost+found
drwxr-xr-x 14 root        root             4096 Jun 30 12:35 containerd
drwxr-xr-x 34 root        root             4096 Jun 30 12:39 local-path-provisioner
drwxr-xr-x  3 kube-system kube-system      4096 Jul 25 05:36 etcd
drwxr-xr-x  2 kube-system kube-system      4096 Jul 30 08:05 promtail
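Before deleting anything, it can help to list which entries fall outside the expected set. The sketch below only prints names and deletes nothing. The function name list_unexpected_entries is hypothetical, and the expected folder names are taken from the listing above.

```shell
# Hypothetical helper: print every entry in a directory (default /data)
# that is not one of the five expected folders. Nothing is deleted.
list_unexpected_entries() {
  dir=${1:-/data}
  for entry in "$dir"/* "$dir"/.[!.]*; do
    [ -e "$entry" ] || continue        # skip patterns that matched nothing
    name=${entry##*/}
    case "$name" in
      lost+found|containerd|local-path-provisioner|etcd|promtail) ;;
      *) printf '%s\n' "$name" ;;
    esac
  done
}

# Usage on the VM (root privileges may be needed to read /data):
#   list_unexpected_entries /data
```

Review the printed names manually before removing them, since the function cannot tell whether an unexpected entry is still in use.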

Step 3

Delete ISO Files from Local SFTP.

  1. SSH into the CONC VM.

  2. Get the list of all ISO files on the VM.

    sedo object-store list onc-sw-iso
    
  3. Delete an ISO file from the list using the following command.

    sedo object-store bucket delete onc-sw-iso/<file-name>
    

Step 4

Download and Remove Archive Files from CONC UI

  1. Go to the Logs app.

    The Archives tab lists all available archive files.
  2. Click Download in the Action column to save the file locally.

  3. Click Delete in the Action column to delete an archive file.

Step 5

When free space exceeds 30%, the alarm clears within 5 minutes.

If the alarm does not clear, log into the Technical Support Website at https://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


SWITCHOVER

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: CLUSTER

The SWITCHOVER alarm is raised when a Cisco Optical Network Controller cluster switchover occurs.

Clear the SWITCHOVER Alarm

This alarm clears automatically after the switchover and database replication are complete.

If the alarm does not clear, log into the Technical Support Website at https://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).


FAILOVER

Default Severity: Major (MJ), Non-Service-Affecting (NSA)

Logical Object: CLUSTER

The FAILOVER alarm is raised when a Cisco Optical Network Controller cluster failover occurs.

Clear the FAILOVER Alarm

This alarm clears automatically after the failover and database replication are complete.

If the alarm does not clear, log into the Technical Support Website at https://www.cisco.com/c/en/us/support/index.html for more information or call Cisco TAC (1 800 553-2447).