Table Of Contents
Alarms
Severity Level
Critical Alarms
Major Alarms
Minor Alarms
Alarms
This chapter lists the Cisco Internet Streamer CDS Release 2.4 alarms. Each alarm is followed by an explanation and recommended action.
Severity Level
An alarm can have one of the following three severity levels (critical, major, or minor):
• A critical alarm indicates that a critical problem exists somewhere in the network. Critical alarms cause failover and should be cleared immediately.
• A major alarm indicates that a serious problem exists that is disrupting service. Major alarms differ from critical alarms in that they do not cause failovers. Major alarms should also be cleared immediately.
• Minor alarms should be noted and cleared as soon as possible.
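The three severity levels above can be modeled as a small enumeration. The following is a minimal sketch for illustration only; the names Severity, causes_failover, and clear_immediately are hypothetical and are not part of the CDS software.

```python
from enum import Enum


class Severity(Enum):
    """The three alarm severity levels described in this chapter."""
    CRITICAL = "critical"
    MAJOR = "major"
    MINOR = "minor"


def causes_failover(severity: Severity) -> bool:
    # Per the descriptions above, only critical alarms cause failover.
    return severity is Severity.CRITICAL


def clear_immediately(severity: Severity) -> bool:
    # Critical and major alarms should be cleared immediately;
    # minor alarms should be cleared as soon as possible.
    return severity in (Severity.CRITICAL, Severity.MAJOR)
```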
Critical Alarms
Alarm 330001 (svcdisabled) - service name - service has been disabled.
Explanation The Node Manager tried restarting the specified service but the service kept restarting.
The number of restarts has exceeded an internal limit and the service has been disabled.
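The restart-limit behavior behind alarm 330001 can be sketched as follows. This is an illustration only, not the Node Manager implementation: the class name, method name, and the max_restarts parameter are assumptions, and the actual internal limit is not documented here.

```python
class ServiceSupervisor:
    """Hypothetical model of a supervisor that disables a flapping service."""

    def __init__(self, max_restarts: int):
        # max_restarts stands in for the undocumented "internal limit".
        self.max_restarts = max_restarts
        self.restarts = 0
        self.disabled = False

    def on_service_exit(self) -> str:
        """Handle one service exit; return 'restarted' or 'disabled'."""
        if self.disabled:
            return "disabled"
        if self.restarts >= self.max_restarts:
            # One more restart would exceed the limit, so the service is
            # disabled. This is where alarm 330001 would be raised.
            self.disabled = True
            return "disabled"
        self.restarts += 1
        return "restarted"
```

For example, with max_restarts=2 the first two exits lead to restarts, and the third exit disables the service.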
Alarm 330002 (servicedead) - service name - service died.
Explanation A critical service has died. Attempts are made to restart this service, but the device may
run in a degraded state.
Recommended Action The device could reboot itself to avoid instability. Examine the syslog for
messages relating to the cause of service death.
Alarm 335000 (alarm_overload) Alarm Overload State has been entered.
Explanation The Node Health Manager issues this alarm to indicate that the device is raising alarms
at a rate that exceeds the overload threshold.
Recommended Action Access the device and determine what services are raising the alarms. Take
corrective action to resolve the individual services' issues.
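One common way to detect that alarms are being raised faster than a threshold rate is a sliding time window. The sketch below assumes that approach for illustration; the source does not state how the Node Health Manager measures the rate, and the names OverloadDetector, threshold, and window_seconds are hypothetical.

```python
from collections import deque


class OverloadDetector:
    """Hypothetical sliding-window detector for an alarm overload state."""

    def __init__(self, threshold: int, window_seconds: float):
        self.threshold = threshold      # assumed: max alarms allowed per window
        self.window = window_seconds    # assumed: window length in seconds
        self.times = deque()            # timestamps of recently raised alarms

    def record_alarm(self, now: float) -> bool:
        """Record an alarm raised at time `now`.

        Returns True if the number of alarms inside the window now
        exceeds the threshold (the overload condition).
        """
        self.times.append(now)
        # Drop timestamps that have fallen out of the window.
        while self.times and now - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) > self.threshold
```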
Alarm 335001 (keepalive) Keepalive failure for - application name - . Timeout = n seconds.
Explanation An application is not responding, which indicates that it may not be operating properly.
Recommended Action Access the device and determine what state the specific application is in.
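A keepalive check of the kind described for alarm 335001 can be sketched as below: each application reports in periodically, and any application whose last report is older than the timeout is treated as unresponsive. The class and method names are hypothetical, and the sketch makes no claim about how the device actually tracks keepalives.

```python
class KeepaliveMonitor:
    """Hypothetical keepalive tracker; timeout corresponds to 'Timeout = n seconds'."""

    def __init__(self, timeout_seconds: float):
        self.timeout = timeout_seconds
        self.last_seen = {}  # application name -> time of last keepalive

    def heartbeat(self, app: str, now: float) -> None:
        """Record a keepalive from `app` at time `now`."""
        self.last_seen[app] = now

    def unresponsive(self, now: float) -> list:
        """Return the sorted names of applications whose keepalive has timed out."""
        return sorted(
            app for app, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```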
Alarm 335003 (test1) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 335006 (test4) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 335008 (test1) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 445002 (disk_smartfailcrit) An SE disk has severe early-prediction failure which requires immediate action.
Explanation The System Monitor issues this alarm to indicate that one of the disks attached to the
SE has severe early-prediction failure (for example, the disk has failed SMART self-check).
Recommended Action Back up data immediately on the disk to prevent data loss, and replace the disk
after it is marked bad by the SE.
Alarm 445005 (disk_softraidcrit) A SoftRAID device has malfunctioned and requires immediate action.
Explanation The System Monitor issues this alarm to indicate that a SoftRAID device has
malfunctioned (for example, both component disks of a RAID-1 array have become inaccessible or
faulty).
Recommended Action Replace the disks and restore the data from backup storage, or remanufacture
and reload the disks.
Major Alarms
Alarm 100002 (ManifestFetchFail) Fail to fetch manifest file for Delivery Service.
Explanation There is a problem fetching the manifest file for this delivery service.
Recommended Action Log in to the Content Acquirer, execute the show stat acq err command to
check the problem, and resolve the problem.
Alarm 100003 (ManifestParseFail) Fail to parse manifest file for Delivery Service.
Explanation There are some syntax errors in the manifest file for this delivery service.
Recommended Action Log in to the Content Acquirer, execute the show stat acq err command to
check the problem, and resolve the problem.
Alarm 100005 (ExceedQuota) Total content size could not fit into the Delivery Service disk quota.
Explanation The total content size acquired for this delivery service is larger than the
delivery service disk quota allows.
Recommended Action Either remove some contents from the manifest file, or increase the delivery
service disk quota.
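The condition behind alarm 100005 is a simple comparison of total content size against the quota. The following sketch shows the arithmetic; the function names and the use of bytes as the unit are assumptions for illustration.

```python
def exceeds_quota(content_sizes_bytes, quota_bytes: int) -> bool:
    """Return True if the total content size is larger than the disk quota."""
    return sum(content_sizes_bytes) > quota_bytes


def bytes_over_quota(content_sizes_bytes, quota_bytes: int) -> int:
    """Return how many bytes must be removed (or added to the quota) to fit."""
    return max(0, sum(content_sizes_bytes) - quota_bytes)
```

For instance, three items of 400, 500, and 300 bytes against a 1000-byte quota are 200 bytes over, so either 200 bytes of content must be removed from the manifest file or the quota must be raised by at least that amount.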
Alarm 100006 (CrawlStartUrlFail) The start-url for a crawl job in the Delivery Service failed.
Explanation There is a problem fetching the start URL of a crawl job in this delivery service.
Recommended Action Log in to the Content Acquirer, and execute the show status acquirer error
command to check the problem, and resolve the problem.
Alarm 100007 (ContentFail) There are some contents that failed to be acquired.
Explanation There are some contents that failed to be acquired.
Recommended Action Log in to the Content Acquirer, execute the show status acquirer error
command to check the problem, and resolve the problem.
Alarm 213501 (svcnomcastenable) Alarm multicast is disabled although the SE is a multicast sender and receiver, or it is subscribed to a multicast Delivery Service.
Explanation The unicast data receiver issues this alarm to indicate that the device does not have
multicast service enabled, although it is expected to be involved in multicast distribution.
Recommended Action Enable the multicast license and service on the device.
Alarm 330003 (servicedead) - service name - service died.
Explanation The node manager found the specified service to be dead. Attempts are made to restart
this service.
Recommended Action Examine the syslog for messages relating to the cause of service death. The
alarm is cleared if the service stays alive and does not restart soon.
Alarm 335002 (test) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 335004 (test2) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 335009 (test2) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 335010 (test3) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 445001 (core_dump) A User Core file or Kernel Crash dump has been generated.
Explanation The System Monitor issues this alarm to indicate that one or more of the software
modules or the kernel has generated core files.
Recommended Action Access the device and check the directory /local1/core_dir, or /local1/crash,
retrieve the core file through FTP, and contact Cisco TAC.
Alarm 445003 (disk_smartfailmajor) An SE disk has early-prediction failure.
Explanation The System Monitor issues this alarm to indicate that one of the disks attached to the
SE has early-prediction failure. This alarm indicates the disk could fail in the near future.
Recommended Action Make proper preparations for the incoming disk drive failure, such as making
data backups and preparing a replacement disk.
Alarm 520004 (GroupDown) - group - Specified standby group is down.
Explanation None of the member interfaces in the specified standby group could be brought up.
Recommended Action Check the configuration and cabling of the member interfaces.
Alarm 540002 (linkdown) Network interface is inactive or down.
Explanation The network interface is inactive or down.
Recommended Action Check the cables connected to the network device.
Alarm 661001 (svclowdisk) Alarm database is running low in disk space in the STATEFS partition.
Explanation The database monitor service issues this alarm to indicate that it is running low in disk
space in the STATEFS partition, and therefore content replication service (acquisition and
distribution) has been temporarily stopped.
Recommended Action Execute the cms database maintenance command or schedule database
maintenance more frequently to reclaim the disk space.
Alarm 700002 (cms_clock_alarm) The device clock is not synchronized with the primary CDSM. Enabling NTP on all the devices is strongly recommended.
Explanation If this device is an SE, its clock must be synchronized with the primary CDSM to make
replication status, statistics monitoring, and program files work correctly. If this device is a standby
CDSM, its clock must be synchronized with the primary CDSM to make the CDSM failover work.
Recommended Action Fix the clock on either this device or the primary CDSM.
Minor Alarms
Alarm 100001 (zerobandwidth) specified content acquisition bandwidth is 0.
Explanation The device has been assigned as Content Acquirer for some delivery services, but its
acquisition bandwidth is 0.
Recommended Action On the CDSM Devices page, select this device, select Edit, select the
Preposition link on the left of the screen, and then change its default bandwidth.
Alarm 100004 (ManifestUpdateFail) Fail to recheck manifest file for Delivery Service.
Explanation There is a problem rechecking the manifest file for this delivery service.
Recommended Action Log in to the Content Acquirer, execute the show status acquirer error
command to check the problem, and resolve the problem.
Alarm 100008 (ContentUpdateFail) There are some contents that failed to be rechecked.
Explanation There are some contents that failed to be rechecked.
Recommended Action Log in to the Content Acquirer, and execute the show status acquirer error
command to check the problem, and resolve the problem.
Alarm 100009 (ManifestParseWarning) Fail to parse manifest file for Delivery Service.
Explanation There are some syntax warnings in the manifest file for this delivery service.
Recommended Action Log in to the Content Acquirer, and execute the show status acquirer error
command to display the warnings, and resolve the problem.
Alarm 212500 (svcbwclosed) Alarm Dout bandwidth is set to zero while jobs are scheduled.
Explanation The unicast data sender issued this alarm to indicate that the Dout is scheduled to be
zero, but currently the unicast data sender has a job running.
Recommended Action Access the CDSM and determine if the bandwidth values and bandwidth
schedules are correctly configured, and verify on the device the effective bandwidth and job
statistics.
Alarm 213500 (svcbwclosed) Alarm Din bandwidth is set to zero while jobs are scheduled.
Explanation The unicast data receiver issued this alarm to indicate that the Din is scheduled to be
zero, but currently the unicast data receiver has a job scheduled or running.
Recommended Action Access the CDSM and determine if the bandwidth values and bandwidth
schedules are correctly configured, and verify on the device the effective bandwidth and job
statistics.
Alarm 213502 (svcnomcastconnectivity) There is no multicast network connectivity between the multicast sender and this device.
Explanation The unicast data receiver issues this alarm to indicate that the device, as a multicast receiver, cannot receive Pragmatic General Multicast packets from a multicast sender. There is no multicast network connectivity between the multicast sender and this device.
Recommended Action Check and fix the multicast network connectivity between the sender and the
receiver.
Alarm 213503 (svcunsspaceproblem) There is a unified name space problem while replicating and so some NACKs are suppressed.
Explanation The unicast data receiver issues this alarm to indicate that the device as multicast
receiver cannot receive files due to a problem with UNS. It stops sending NACKs for the UNS failed
files.
Recommended Action Check and fix the UNS-related issues in the multicast receiver SE.
Alarm 213504 (svcnacksuppressed) Alarm that Multicast Receiver has stopped NACKs due to heavy loss.
Explanation The unicast data receiver issues this alarm to indicate that the device as multicast
receiver cannot receive multicast files for some considerable time and has stopped sending NACKs
for the files.
Recommended Action Check the multicast network for any problems. The sending of NACKs starts
after at least one file is successfully received.
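The suppress-and-resume behavior described for alarm 213504 can be sketched as a small state holder: NACKs stop once no file has been received for longer than some period, and resume after one file is received successfully. The period is not documented here, so suppress_after_seconds is a hypothetical parameter, as are the class and method names.

```python
class NackState:
    """Hypothetical model of NACK suppression on a multicast receiver."""

    def __init__(self, suppress_after_seconds: float):
        # Assumed stand-in for the undocumented "considerable time".
        self.suppress_after = suppress_after_seconds
        self.last_success = 0.0
        self.suppressed = False

    def on_file_received(self, now: float) -> None:
        """A file arrived successfully; NACKs are allowed again."""
        self.last_success = now
        self.suppressed = False

    def on_file_lost(self, now: float) -> bool:
        """Return True if a NACK should be sent for this lost file."""
        if now - self.last_success > self.suppress_after:
            # This is where alarm 213504 would be raised.
            self.suppressed = True
        return not self.suppressed
```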
Alarm 215003 (svcdevfailover) Alarm backup multicast sender has been activated.
Explanation The backup multicast sender issues this alarm to indicate that it has been activated and
either the primary multicast sender has a problem, or the primary and backup multicast senders cannot
communicate with each other due to possible network connection issues.
Recommended Action Troubleshoot the multicast sender service on the primary multicast sender and
check the network connectivity between the primary and backup multicast senders.
Alarm 215500 (svcbwclosed) Alarm Mout bandwidth is set to zero while jobs are scheduled.
Explanation The multicast data sender issues this alarm to indicate that the device has Mout
scheduled to be zero, but currently the multicast data sender has a job scheduled or running.
Recommended Action Access the CDSM and determine if the bandwidth values and bandwidth
schedules are correctly configured, and verify on the device the effective bandwidth and job
statistics.
Alarm 330004 (servicedead) - service name - service died.
Explanation The node manager found the specified service to be dead. Attempts are made to restart
this service.
Recommended Action Examine the syslog for messages relating to the cause of service death. The
alarm is cleared if the service stays alive and does not restart in a short while.
Alarm 335005 (test3) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 335007 (test5) NHM Alarm Testing [string].
Explanation This alarm is used for testing the Node Health Manager.
Recommended Action None. This alarm should never occur during normal operation.
Alarm 400000 (wesvcthresholdexceeded) WebEngine has reached service threshold limits.
Explanation WebEngine service has reached license limits, or the limits were configured with the
webengine max-concurrent-sessions command.
Recommended Action Avoid further service requests to this device.
Alarm 445000 (disk_failure) An SE disk has failed.
Explanation The System Monitor issues this alarm to indicate that one of the disks attached to the
SE is not responding.
Recommended Action Access the device and execute the show disk details command. If the problem
persists, replace the disk.
Recommended Action Watch the disk for early indication of errors. If more severe SMART errors or
disk errors appear, take action accordingly.
Alarm 445005 (disk_softraidcrit) A SoftRAID device has malfunctioned and requires immediate action.
Explanation The System Monitor issues this alarm to indicate that a SoftRAID device has
malfunctioned (for example, both component disks of a RAID-1 array have become inaccessible
or faulty).
Recommended Action Replace the disks and restore data from backup storage, or remanufacture and
reload the disks.
Alarm 445006 (disk_softraidminor) A SoftRAID device has become degraded and requires immediate action.
Explanation The System Monitor issues this alarm to indicate that a SoftRAID device has become
degraded (for example, one disk of a RAID-1 array has become inaccessible or faulty).
Recommended Action Ensure there is a current data backup, replace the faulty disk, and then
reconstruct the RAID array.
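The RAID-1 examples given for alarms 445005 and 445006 differ only in how many component disks have failed. The sketch below maps that distinction to the two alarm IDs; it is based solely on the examples in this chapter, and the function name and input format are hypothetical.

```python
def softraid1_alarm(disk_ok):
    """Map the health of RAID-1 component disks to an alarm ID.

    disk_ok is a list of booleans, one per component disk
    (True means the disk is accessible and healthy).
    Returns None, 445006 (degraded), or 445005 (malfunctioned).
    """
    failed = sum(1 for ok in disk_ok if not ok)
    if failed == 0:
        return None      # array is healthy, no alarm
    if failed == len(disk_ok):
        return 445005    # all component disks inaccessible or faulty
    return 445006        # some, but not all, component disks faulty
```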
Alarm 511010 (svcthresholdexceeded) WMT has reached service threshold limits.
Explanation Windows Media Technologies service has reached license limits, or the limits are
configured with the wmt max-concurrent-sessions bandwidth wmt outgoing command.
Recommended Action Avoid further service requests to this device.
Alarm 511011 (fmsthresholdexceeded) FMS has reached service threshold limits.
Explanation Flash Media Streaming service has reached concurrent connection limits.
Recommended Action Avoid further service requests to this device, or contact Cisco TAC for more
connection licenses.
Alarm 511012 (mssvcthresholdexceeded) Movie Streamer has reached service threshold limits.
Explanation Movie Streamer service has reached license limits, or the limits are configured.
Recommended Action Avoid further service requests to the device.
Alarm 520001 (LinkDown) -group-ifc-slot-port- Specified interface in the standby group is down.
Explanation The specified interface in the standby group is down. There could have been a link
failure on the interface or it may have been shut down on purpose.
Recommended Action Check the configuration and cabling of the specified interface.
Alarm 520002 (RouteDown) -group-ifc-slot-port- Unable to reach the configured default gateway on the specified interface.
Explanation Unable to reach the configured default gateway on the specified interface in the standby
group.
Recommended Action Check the network configuration on the specified interface.
Alarm 520003 (MaxError) -group-ifc-slot-port- The specified interface has seen errors exceeding maximum allowable error count.
Explanation The specified interface has seen errors exceeding the maximum allowable error count.
Recommended Action Check the cabling or configuration of the specified interface.
Alarm 540001 (shutdown) Network interface is shutdown.
Explanation The network interface is shut down.
Recommended Action Check the interface configuration.
Alarm 700001 (cms_test_alarm) CMS test alarm with instance value - instance was raised. The title is used in the CDSM GUI.
Explanation This is a test alarm defined and used in CMS code. This alarm is identified by a tuple
(340001, instance). This means the system may have several raised alarms with the 340001 ID
having different instance values. Instance is usually used to link an alarm to a particular data item
(such as a particular failed disk, or a delivery service having A&D troubles).
Recommended Action Advise the user how to handle this raised alarm. This is shown in the CDSM
GUI or command-line interface (CLI).
Recommended Action Restart the Remote execution agent by using the CLI.
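The explanation for the CMS test alarm describes alarms as identified by an (alarm ID, instance) tuple, so that several alarms with the same ID can be raised at once for different data items. A minimal sketch of that idea uses a dictionary keyed by the tuple; the function names, the example ID 123456, and the instance strings are hypothetical.

```python
# Raised alarms, keyed by the (alarm_id, instance) tuple.
raised = {}


def raise_alarm(alarm_id: int, instance: str, message: str) -> None:
    """Raise (or update) the alarm identified by (alarm_id, instance)."""
    raised[(alarm_id, instance)] = message


def clear_alarm(alarm_id: int, instance: str) -> None:
    """Clear one alarm instance; other instances of the same ID stay raised."""
    raised.pop((alarm_id, instance), None)


def instances_of(alarm_id: int) -> list:
    """Return the sorted instance values currently raised for an alarm ID."""
    return sorted(inst for (aid, inst) in raised if aid == alarm_id)
```

With this layout, raising the same hypothetical ID for instances "disk01" and "disk02" yields two separate entries, and clearing one leaves the other raised.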
Alarm 1000010 (ManifestEmptyContent) Parsed Manifest file does not have any items to process.
Explanation There are no single or crawl items mentioned in the manifest file to process.
Recommended Action Edit the manifest file of this delivery service to have one or more items to
process.