

Troubleshooting Cisco IOS Software Scheduler-Related Error Messages

Document ID: 12422

Updated: Jun 24, 2008


Introduction

This document explains the causes of some Cisco IOS® software scheduler-related error messages, and how to troubleshoot them. These messages are not related to a specific platform. They can appear on every platform that supports Cisco IOS software.

These are the messages that this document covers:

  • SCHED-3-STUCKMTMR

  • SCHED-3-THRASHING

  • SCHED-3-UNEXPECTEDEVENT

  • SCHED-2-WATCH

If you encounter a "SCHED..." error message which is not explained on this page, use the feedback form at the top of this page in order to inform Cisco.

Prerequisites

Requirements

There are no specific requirements for this document.

Components Used

This document is not restricted to specific software and hardware versions.

Conventions

Refer to the Cisco Technical Tips Conventions for more information on document conventions.

Background Information

The Cisco IOS software scheduler, which is part of the Cisco IOS software kernel, manages all the processes in the system with a series of process queues, one for each process state. Each queue holds context information for the processes in that state. Processes transition from one state to another as the scheduler moves their context from one process queue to another. Some of the process queues are:

  • Idle queue—Contains processes that are still active but wait on an event to occur before they run.

  • Dead queue—Contains processes that have terminated, but need to have their resources reclaimed before they can be totally removed from the system.

  • Ready queues—Contains processes that are eligible to run. There are four ready queues, one for each process priority. When a running process suspends, the scheduler regains control of the CPU and uses an algorithm to select the next process from one of its four ready queues.
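
The queue model described above can be sketched as a toy scheduler. This is a minimal illustration, not Cisco IOS code: the class and method names are hypothetical, the priority names are assumed, the priorities given to the example processes are invented, and the selection policy (always take from the highest-priority non-empty ready queue) is an assumption, because this document only says that the scheduler "uses an algorithm".

```python
from collections import deque

# Assumed priority names; the document only states that there are four
# ready queues, one for each process priority.
PRIORITIES = ("critical", "high", "medium", "low")


class ToyScheduler:
    """Toy model: each process state is a queue that holds process contexts."""

    def __init__(self):
        self.idle = deque()   # active, but waiting on an event
        self.dead = deque()   # terminated, resources not yet reclaimed
        self.ready = {p: deque() for p in PRIORITIES}

    def make_ready(self, name, priority):
        """Place a process context on the ready queue for its priority."""
        self.ready[priority].append(name)

    def wait_for_event(self, name):
        """Park a process on the idle queue until its event occurs."""
        self.idle.append(name)

    def wake(self, name, priority):
        """The event occurred: move the context from idle to ready."""
        self.idle.remove(name)
        self.make_ready(name, priority)

    def pick_next(self):
        """Assumed policy: first process of the highest-priority
        non-empty ready queue, or None if nothing is ready."""
        for p in PRIORITIES:
            if self.ready[p]:
                return self.ready[p].popleft()
        return None


sched = ToyScheduler()
sched.make_ready("ARP Input", "high")   # priorities here are invented
sched.make_ready("IP SNMP", "low")
sched.wait_for_event("Exec")
sched.wake("Exec", "medium")
order = [sched.pick_next() for _ in range(4)]
print(order)
```

In this sketch a state change is nothing more than moving a name from one queue to another, which mirrors how the text describes processes moving between states.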

Troubleshoot

SCHED-3-STUCKMTMR

A process can register to be notified when various events occur in the router. This specific message appears whenever a registered timer expires and the timer value is unchanged after the process executes two successive times. This is always a cosmetic software-related issue.

These messages on the console indicate such a problem:

%SCHED-3-STUCKMTMR: Sleep with expired managed timer 1C7410, 
time 0x1063F9C52 (00:00:00 ago).
-Process= "IP SNMP", ipl= 6, pid= 44
-Traceback= 31BC79A 31BC9C0 323E130
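
When many of these messages need to be sorted by process, it can help to split each one into fields. The following is a rough sketch of a log parser, not a Cisco tool: the function name and the regular expressions are assumptions derived only from the sample above, and other message layouts may need different patterns.

```python
import re

# Patterns inferred from the sample message only (an assumption).
HEADER = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>\d)-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*?)(?= -Process=|$)"
)
PROCESS = re.compile(
    r'-Process=\s*"(?P<process>[^"]+)",\s*ipl=\s*(?P<ipl>\d+),\s*pid=\s*(?P<pid>\d+)'
)
TRACEBACK = re.compile(r"-Traceback=\s*((?:[0-9A-Fa-f]+\s*)+)")


def parse_sched_message(raw):
    """Split one console message (possibly line-wrapped) into fields."""
    flat = " ".join(raw.split())  # undo console line wraps
    header = HEADER.search(flat)
    if header is None:
        return None
    fields = header.groupdict()
    fields["severity"] = int(fields["severity"])
    proc = PROCESS.search(flat)
    if proc:
        fields["process"] = proc.group("process")
        fields["ipl"] = int(proc.group("ipl"))
        fields["pid"] = int(proc.group("pid"))
    trace = TRACEBACK.search(flat)
    fields["traceback"] = trace.group(1).split() if trace else []
    return fields


sample = """%SCHED-3-STUCKMTMR: Sleep with expired managed timer 1C7410,
time 0x1063F9C52 (00:00:00 ago).
-Process= "IP SNMP", ipl= 6, pid= 44
-Traceback= 31BC79A 31BC9C0 323E130"""

parsed = parse_sched_message(sample)
print(parsed["mnemonic"], parsed["process"], parsed["traceback"])
```

Grouping parsed messages by the `process` field gives a quick view of which process produces the tracebacks.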

The process in which this error message occurs is a good indication for narrowing down the cause of these tracebacks. This list shows the more common reasons for these messages to appear:

  • IP Simple Network Management Protocol (SNMP) Process—This message can appear during SNMP WriteNet request:

    %SCHED-3-STUCKMTMR: Sleep w/expired mgd timer 13AF58, 
    time 0xBDBE878A (00:00:03 ago).
    -Process= "IP SNMP", ipl= 6, pid= 29
    -Traceback= 313B218 313B5D2 3192A76 319EFEC 319F234 30FF17E 319F446 319F88E 30FEA70 
    3304C1E 33045F0 32F78E4 32F82AE 32F383E 32F7ABA 30FF19A
    %SYS-4-SNMP_WRITENET: SNMP WriteNet request. Writing current configuration to 
    146.61.55.230.
    %SYS-4-SNMP_WRITENET: SNMP WriteNet request. Writing current configuration to 
    146.61.10.20.

    Earlier Cisco IOS software releases contained some IP SNMP poll-related problems. An upgrade to the latest Cisco IOS Software Release 12.0 or 12.1 main release solves this issue. This is a cosmetic message, and there are no adverse side-effects which might affect the operation of the router (or the IP SNMP process).

  • Virtual Integrated Network Service (VINES) Protocols Process—These tracebacks can be generated on a router configured for VINES:

    %SCHED-3-STUCKMTMR: Sleep w/expired mgd timer 6100606C, time 0x222DF318 
    (00:00:00 ago).
    -Process= "VINES Protocols", ipl= 6, pid= 60

    The messages occur randomly and do not appear to affect VINES performance. They occur when VINES misses the processing of an expired timed event, typically when the system processor is heavily loaded. The event is eventually processed, but not when it first expires.

    VINES uses timers for processing and handling VINES Address Resolution Protocol (ARP) services, Inter Processor Communication (IPC) sessions and retransmission, route aging, and some server services.

    These messages have been fixed in the Cisco IOS Software Release 12.0S and 12.1 main releases.

  • Multi Protocol Label Switching (MPLS)-related Process—These tracebacks can be generated on a router configured for MPLS:

    %SCHED-3-STUCKMTMR: Sleep w/expired mgd timer 60C0E9B4, time 0x3952 
    (00:00:00 ago).
    -Process= "TDP Hello", ipl= 5, pid= 58
    -Traceback= 600867F0 60086BB8 604390D4 60077E88 60077E74
    
    %SCHED-3-STUCKMTMR: Sleep w/expired mgd timer 60CC2548, time 0x43006 
    (00:00:00 ago).
    -Process= "Tag Control", ipl= 5, pid= 56
    -Traceback= 600867F0 60086BB8 60448320 604484F0 60077E88 60077E74

    Analysis of the event loops for the Tag Distribution Protocol (TDP), TDP Hello, and Tag Control processes shows that the loops could call the process_wait_for_event routine without first processing all expired timers. The loops were fixed to ensure that all expired timers are processed before the process suspends. This issue is solved in the latest Cisco IOS Software Release 12.0S and 12.1 main releases.
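
The difference between an event loop that goes back to waiting too early and one that drains every expired timer first can be sketched as follows. This is a simplified model under stated assumptions, not the Cisco IOS implementation: the function names, the timer names, and the dictionary of due times are all hypothetical, and "handling" a timer is reduced to restarting it.

```python
def expired(timers, now):
    """Names of the timers whose due time has been reached."""
    return [name for name, due in timers.items() if due <= now]


def buggy_pass(timers, now, period):
    """Handle at most one expired timer, then go back to waiting.

    Returns the timers that are still expired when the process sleeps.
    """
    due = expired(timers, now)
    if due:
        timers[due[0]] = now + period  # only the first timer is restarted
    return expired(timers, now)


def fixed_pass(timers, now, period):
    """Handle every expired timer before going back to waiting."""
    for name in expired(timers, now):
        timers[name] = now + period
    return expired(timers, now)


# Two hypothetical timers expire at the same instant.
left_by_buggy = buggy_pass({"hello": 10, "hold": 10}, now=10, period=5)
left_by_fixed = fixed_pass({"hello": 10, "hold": 10}, now=10, period=5)
print(left_by_buggy, left_by_fixed)
```

In the buggy pass one timer is still expired when the process goes back to sleep, which is the condition the STUCKMTMR message reports. The fixed pass leaves nothing expired.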

This list of processes in which this message can occur is not exhaustive. The message is always cosmetic and therefore does not, on its own, justify a Cisco IOS software upgrade. Even so, be sure to run the latest Cisco IOS software release in your train. If the message still appears in the latest Cisco IOS software release available on Cisco.com for registered users, contact Cisco Technical Support to open a case. At that time, provide the complete output of the show log command with the error messages, and the output of the show tech command from the router or switch on which the problem occurs.

SCHED-3-THRASHING

This message means that the indicated process has relinquished control 50 consecutive times and there are still outstanding events to be processed.

These messages on the console indicate such a problem:

%SCHED-3-THRASHING: Process thrashing on watched queue 
'ARP queue' (count 54).
-Process= "ARP Input", ipl= 5, pid= 6
-Traceback= 6020589C 60205BC4 60236520 601F4FD8 601F4FC4

These thrashing checks are intended to determine whether a process is, for some reason, not doing its job. The thrashing check on watched queues (the check that produces this message) looks at the number of elements on the queue. If this number remains the same for a given number of schedulings, the message is printed.

Some queues are length-limited. This means that if the router gets very busy, these queues always stay at their maximum length. As a result, the thrashing code in the scheduler gets confused and thinks that these queues have not been handled. It concludes that the process that was supposed to handle the queue is not doing its job, and prints the thrashing message.

The scheduler has been changed in later Cisco IOS software code. In order to keep track of whether the queues have been changed (so it can better determine whether or not the process is thrashing), the scheduler now notes whenever an item is removed from the queue, and only prints the thrashing message if nothing gets removed for a while.
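
The two checks can be compared with a small simulation. This is a sketch under stated assumptions, not scheduler code: the class and function names are hypothetical, the queue limit of 4 is invented, and the threshold of 50 is taken from the description of the message above. The simulated process removes one item on every pass, and heavy traffic refills the queue immediately, so the queue length never changes.

```python
from collections import deque

THRASH_LIMIT = 50  # consecutive passes without progress


class WatchedQueue:
    """Length-limited queue that also counts removals."""

    def __init__(self, maxlen):
        self.items = deque()
        self.maxlen = maxlen
        self.dequeued = 0  # total number of items removed so far

    def enqueue(self, item):
        if len(self.items) >= self.maxlen:
            return False  # queue is full, item is dropped
        self.items.append(item)
        return True

    def dequeue(self):
        self.dequeued += 1
        return self.items.popleft()


def simulate(passes, use_dequeue_counter):
    """Return True if the thrashing message would be printed."""
    queue = WatchedQueue(maxlen=4)
    for i in range(4):
        queue.enqueue(i)
    stalled_passes = 0
    for _ in range(passes):
        len_before, removed_before = len(queue.items), queue.dequeued
        queue.dequeue()          # the process does its job...
        queue.enqueue("packet")  # ...and traffic refills the queue
        if use_dequeue_counter:
            stalled = queue.dequeued == removed_before  # later check
        else:
            stalled = len(queue.items) == len_before    # earlier check
        stalled_passes = stalled_passes + 1 if stalled else 0
        if stalled_passes >= THRASH_LIMIT:
            return True
    return False


print(simulate(60, use_dequeue_counter=False))  # length-based check
print(simulate(60, use_dequeue_counter=True))   # removal-based check
```

The length-based check raises a false alarm because the queue stays full, while the removal-based check sees that items are being serviced and stays quiet.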

Most of the time, the queue thrashing message is cosmetic.

These messages are not always caused by a software bug. They can be issued in response to either instantaneous or sustained demand on the router. Increased or persistent messages can indicate that the traffic load needs to be reviewed.

Note: These code changes are reported under Cisco bug ID CSCdj68470 (registered customers only).

SCHED-3-UNEXPECTEDEVENT

This message appears whenever a process receives an event that it does not know how to handle. For example:

%SCHED-3-UNEXPECTEDEVENT: Process received unknown event (maj 10, min 0).
-Process= "IP SNMP", ipl= 0, pid= 23
-Traceback= 602842B8 6017CFB8 6017CFA4

There are several possible causes of this problem:

  • The most likely cause is that one process directly wakes up another process, and passes major and minor event numbers to the process. If the sending process wakes up the wrong process, the receiving process does not know how to handle the received major and minor event numbers. The process might perform the wrong action if it expects an event with matching major and minor event numbers, or it might print this message. Use the output of the show process command to help determine which process(es) might have sent a direct wakeup to a process.

  • Another possible cause of this problem is that a development engineer has added code to register for an event, but has not added the code to handle the event.

  • A subroutine called by the process may have registered for a new event, but has not deregistered the event before it exits.
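
The first cause in this list can be illustrated with a toy dispatcher. This is a hedged sketch rather than Cisco IOS code: the class, its methods, and the registered event numbers are hypothetical. Only the pair (maj 10, min 0) is taken from the sample message above.

```python
class ToyProcess:
    """Process that dispatches events by (major, minor) number."""

    def __init__(self, name):
        self.name = name
        self.handlers = {}  # (major, minor) -> callable
        self.log = []

    def register(self, major, minor, handler):
        """Register for an event and supply the code that handles it."""
        self.handlers[(major, minor)] = handler

    def deliver(self, major, minor):
        """Run the handler for an event, or log that it is unknown."""
        handler = self.handlers.get((major, minor))
        if handler is None:
            self.log.append(
                f'Process received unknown event (maj {major}, min {minor}). '
                f'-Process= "{self.name}"'
            )
            return False
        handler()
        return True


handled = []
snmp = ToyProcess("IP SNMP")
snmp.register(1, 0, lambda: handled.append("queue event"))

ok = snmp.deliver(1, 0)          # an event the process registered for
misdirected = snmp.deliver(10, 0)  # a direct wakeup meant for another process
print(ok, misdirected, snmp.log)
```

The second and third causes map onto the same table: an event is registered without a handler, or a stale registration outlives the subroutine that created it.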

These messages are always due to a software bug. Based on the process that did not know how to handle an event, you can run into different bugs in the Cisco IOS software.

If the process is either Exec or Virtual Exec, you are most likely to run into these issues:

%SCHED-3-UNEXPECTEDEVENT: Process received unknown event (maj 80, min 0).
-Process= "Exec", ipl= 0, pid= 20
-Traceback= 604A0D68 6049B400 6049C974 601B2F5C 601B338C 601CC384 601CC9E0 601F5628 
602383EC 602383D8

or

%SCHED-3-UNEXPECTEDEVENT: Process received unknown event (maj 80, min 0).
-Process= "Virtual Exec", ipl= 0, pid= 2
-Traceback= 60479FA0 60474638 60476474 601B0E20 601B0A38 601E5088 601E5B08 601F0A54 
60231324 60231310

This error message is caused by debug code that was accidentally left in some older versions of code and that reappeared in the Cisco IOS Software 12.0 mainline release. The error message is likely to occur if you have TACACS configured and you execute the show line command on the command-line interface (CLI) of the router. The error message has no effect on the functionality of the router, so it can be considered a cosmetic bug. The only way to get rid of this error message is to upgrade the Cisco IOS software to a later release.

You must run at least Cisco IOS Software Release 12.0(11), 12.0(11)S, or 12.1(2), depending on the train that you run. However, if you encounter another bug, consider an upgrade to the latest Cisco IOS software available for the corresponding train. If the problem is still present in the latest Cisco IOS software release, contact Cisco Technical Support to open a new bug. At that time, have ready the complete output of the show logging command with the error messages, and the output of the show version command, which is needed in order to decode the tracebacks.

Refer to Cisco bug ID CSCdp17107 (registered customers only) for further information on this issue.

SCHED-2-WATCH

This message displays whenever an attempt is made to register for an event without first creating the data structure for that event. This is an internal software bug in the Cisco IOS Software. The output looks something like this:

%SCHED-2-WATCH: Attempt to enqueue uninitialized watched queue (address 0).
-Process= "Net Input", ipl= 0, pid= 29
-Traceback= 601B821C 60193428 604F59EC 604F6110 601C09F8 601934E0 6019304C 
  601A65E8 601A65D4
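
The ordering problem behind this message can be sketched in a few lines. This is an illustrative model only: the registry class, its method names, and the exception are hypothetical, and a missing handle (`None`) stands in for the "address 0" shown in the message.

```python
class UninitializedWatch(Exception):
    """Raised when a watched object is used before it is created."""


class WatchRegistry:
    """Toy registry of watched queues, keyed by a handle."""

    def __init__(self):
        self.queues = {}

    def create_watched_queue(self, name):
        """Create the data structure first; return its handle."""
        self.queues[name] = []
        return name

    def enqueue(self, handle, item):
        """Enqueue on a watched queue, rejecting uninitialized handles."""
        if handle is None or handle not in self.queues:
            raise UninitializedWatch(
                "Attempt to enqueue uninitialized watched queue"
            )
        self.queues[handle].append(item)


registry = WatchRegistry()

# Wrong order: the queue is used before it has been created.
try:
    registry.enqueue(None, "packet")
    error = None
except UninitializedWatch as exc:
    error = str(exc)

# Right order: create the watched queue, then use it.
handle = registry.create_watched_queue("net input")
registry.enqueue(handle, "packet")
print(error, registry.queues[handle])
```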

You can encounter this kind of error message during an Online Insertion and Removal (OIR) of any type of card. For instance, on a Cisco 12000 Series Internet router, you can see these messages after you replace a Gigabit Route Processor (GRP) card in a GSR12016 series router:

%SCHED-2-WATCH: Attempt to set uninitialized watched boolean (address 0).
-Process= "LC Crash Complete Process", ipl= 0, pid= 29
-Traceback= 60189CA8 60244E08 6017562C 60175618

Earlier versions of code contain some redundancy issues. Most of these problems are fixed in the latest Cisco IOS Software Release 12.0S. Be sure to run a Cisco IOS software release that is equal to or later than Cisco IOS Software Releases 12.0(18)S1 and 12.0(17)S2. If a reseat of the faulty card does not work, a cold reload of the router is likely to fix this issue.

The messages are similar to this output on a 7500 Series Router:

%OIR-6-REMCARD: Card removed from slot 3, interfaces disabled
%SCHED-2-WATCH: Attempt to set uninitialized watched Boolean (address 0).
-Process= "OIR Handler", ipl= 0, pid= 7
-Traceback= 60236120 60C64838 60280594 60280874 602211BC 602211A8

Most of the time these SCHED error messages are due to an internal software bug in the Cisco IOS software. Therefore, the first step in troubleshooting these error messages is to look for a known bug.

An upgrade to the latest Cisco IOS software image in your release train gets rid of all fixed Cisco IOS software scheduler-related bugs.

If the problem still appears, contact your Cisco support representative with an exact copy of the error message, along with the output of the show tech-support and show log commands.

Information to Collect if You Open a Cisco Technical Support Case

If you still need assistance after you follow the troubleshooting steps in this document, you can open a case (registered customers only) with Cisco Technical Support. Be sure to include the information listed here:

  • Console captures that show the error messages.

  • Console captures that show the steps you took to troubleshoot the problem and the boot sequence during each step.

  • The hardware component that failed and the serial number for the chassis.

  • Troubleshooting logs.

  • Output from the show technical-support command.

Attach the collected data to your case in non-zipped, plain text format (.txt). You can upload information to your case with the TAC Service Request Tool (registered customers only). If you cannot access the tool, you can send the information in an e-mail attachment to attach@cisco.com. Include your case number in the subject line of your message so that the information is attached to your case.

Note: Do not manually reload or power-cycle the router before you collect this information, unless required. This can cause you to lose important information that you need in order to determine the root cause of the problem.

