This document describes how to investigate the reason behind a Supervisor module or Distributed Forwarding Card (DFC) line card reboot associated with the %EARL-SP-2-PATCH_INVOCATION_LIMIT error.
This document is applicable to Catalyst 6500/Cisco 7600 platforms.
On Catalyst 6500/7600 platforms, most packets are forwarded entirely in hardware through a series of ASICs and the forwarding engine.
If an issue is detected between these components that might lead to invalid packet forwarding, Cisco IOS® software triggers the Encoded Address Recognition Logic (EARL) recovery mechanism, which applies a patch. The patch resets the corresponding elements (forwarding engine/ASICs) so that proper functionality of the device can be restored.
By design, a reboot of the module is triggered when 10 consecutive EARL recovery patch attempts within 30 seconds do not fix the issue. Enter the show platform software earl reset config command from the switch processor (SP) in order to verify these settings:
6500-sp#show platform software earl reset config
EBUS Out of seq. : Enabled
Earl freeze check. : Enabled
EARL Patch invocation limit per every 30 secs : 10
Upon reaching EARL patch invocation limit : Crash
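The limit reported in this output can be pictured as a sliding-window counter. The Python sketch below is only an illustration under stated assumptions, not the Cisco IOS implementation: the class name `PatchLimiter`, the method `request_patch`, and the exact window handling are hypothetical. Only the default values (10 invocations, 30 seconds, crash on limit) come from the output above.

```python
from collections import deque


class PatchLimiter:
    """Hypothetical model of an invocation limit over a time window."""

    def __init__(self, limit=10, window_secs=30.0):
        self.limit = limit              # "EARL Patch invocation limit"
        self.window_secs = window_secs  # "per every 30 secs"
        self.applied = deque()          # times of patches still in the window

    def request_patch(self, now):
        """Return "patch" if the patch may run, or "crash" at the limit."""
        # Forget patches that are older than the window.
        while self.applied and now - self.applied[0] > self.window_secs:
            self.applied.popleft()
        if len(self.applied) >= self.limit:
            return "crash"  # "Upon reaching EARL patch invocation limit : Crash"
        self.applied.append(now)
        return "patch"


if __name__ == "__main__":
    limiter = PatchLimiter()
    # Hypothetical request times, one second apart.
    for second in range(11):
        print(second, limiter.request_patch(float(second)))
```

In this model the eleventh request inside one window is the one that triggers the reset, while requests spaced far enough apart never reach the limit.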
When a module reboots unexpectedly, a crashinfo file should be generated and stored on the local flash file system.
This error can be generated by the Supervisor module:
%EARL-SP-2-PATCH_INVOCATION_LIMIT: 10 Recovery patch invocations in the last 30 secs
have been attempted. Max limit reached
%Software-forced reload
or by the DFC line card:
%EARL-DFC9-2-PATCH_INVOCATION_LIMIT: 10 Recovery patch invocations in the last 30 secs
have been attempted. Max limit reached
%Software-forced reload
This message is shown in the crashinfo file. It indicates that the module rebooted because the EARL recovery patch was applied 10 times within 30 seconds with no success. The module reset is triggered in order to restore its proper functionality.
In order to identify what triggered the excessive patch invocations, you need to investigate the crashinfo file.
In this example, you can see how many times, when, and why the patch was requested:
Num. of times patch applied : 10
Num. of times patch requested : 11 <<<<<<<
AclDeny detection: (Total=12 Failed=1)
Time Reason InProgress Data
---------------------------------+----------------------+----------+------------
Jan 21 2014,05:52:57.281 GMT Earl Patch Limit Reach 0100 0
Jan 21 2014,05:52:57.281 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:56.905 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:54.677 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:53.625 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:52.773 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:51.661 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:51.257 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:50.321 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:48.709 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:47.933 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:38.509 GMT Tycho L2 mode L3 rst 0000 CAFE000C
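When a crashinfo file contains many entries, the table above can be summarized with a short script. The Python sketch below assumes the row layout shown in this example (timestamp, time zone, reason, InProgress, Data). The function names `parse_patch_log` and `summarize` are hypothetical, and the layout might differ between software releases.

```python
import re
from collections import Counter
from datetime import datetime

# Timestamp, time zone, then the rest of the row (assumed layout).
ROW = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{4},\d{2}:\d{2}:\d{2}\.\d+)\s+(\S+)\s+(.+)$")

SAMPLE = """\
Jan 21 2014,05:52:57.281 GMT Earl Patch Limit Reach 0100 0
Jan 21 2014,05:52:57.281 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:56.905 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:54.677 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:53.625 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:52.773 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:51.661 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:51.257 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:50.321 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:48.709 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:47.933 GMT Tycho L2 mode L3 rst 0000 CAFE000C
Jan 21 2014,05:52:38.509 GMT Tycho L2 mode L3 rst 0000 CAFE000C
"""


def parse_patch_log(text):
    """Return one dict per table row; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if not match:
            continue
        stamp, _tz, rest = match.groups()
        parts = rest.rsplit(None, 2)  # reason may contain spaces
        if len(parts) != 3:
            continue
        reason, in_progress, data = parts
        rows.append({
            "time": datetime.strptime(stamp, "%b %d %Y,%H:%M:%S.%f"),
            "reason": reason,
            "in_progress": in_progress,
            "data": data,
        })
    return rows


def summarize(rows):
    """Count patch requests per Data value and measure their time span."""
    requests = [r for r in rows if r["reason"] != "Earl Patch Limit Reach"]
    by_data = Counter(r["data"] for r in requests)
    times = [r["time"] for r in requests]
    span = (max(times) - min(times)).total_seconds() if times else 0.0
    return by_data, span


if __name__ == "__main__":
    counts, span_secs = summarize(parse_patch_log(SAMPLE))
    print(dict(counts), span_secs)
```

Grouping by the Data column makes it easy to see whether every request shares one cause, as in this example, or whether several different triggers are involved.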
When "CAFE000C" is shown in the "Data" column, check the "show earl status" output that is also available in the crashinfo file:
--------- show earl status --------
Adj. table interface block : Total interrupts - 11
AT_SEQ_ERR_INT : 0
AT_FOVR_INT : 0
AT_FUDR_INT : 0
AT_IB_ADJ_INT : 0
AT_BZONE_INT : 0
AT_CORR_ECC_ERR_INT : 0
AT_UNCORR_ECC_ERR_INT : 11 <<<<<<<
This means the EARL patch ran in an attempt to recover from the AT_UNCORR_ECC_ERR_INT error. This is an adjacency Error Correcting Code (ECC) error that indicates a hardware problem.
The next step is to reseat the module in the slot. If the errors are still present after the reseat, the module should be replaced.
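The counter check described above can also be scripted. The Python sketch below assumes the `NAME : value` layout shown in the "show earl status" excerpt and tolerates trailing annotations; the function names `parse_counters` and `nonzero_counters` are hypothetical.

```python
import re

# Counter lines look like "AT_UNCORR_ECC_ERR_INT : 11" (assumed layout).
COUNTER = re.compile(r"^\s*([A-Z][A-Z0-9_]*)\s*:\s*(\d+)")

SAMPLE_STATUS = """\
--------- show earl status --------
Adj. table interface block : Total interrupts - 11
AT_SEQ_ERR_INT : 0
AT_FOVR_INT : 0
AT_FUDR_INT : 0
AT_IB_ADJ_INT : 0
AT_BZONE_INT : 0
AT_CORR_ECC_ERR_INT : 0
AT_UNCORR_ECC_ERR_INT : 11 <<<<<<<
"""


def parse_counters(text):
    """Map each counter name to its integer value; other lines are ignored."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def nonzero_counters(text):
    """Return only the counters that recorded at least one interrupt."""
    return {name: value for name, value in parse_counters(text).items() if value > 0}


if __name__ == "__main__":
    print(nonzero_counters(SAMPLE_STATUS))
```

A non-zero AT_UNCORR_ECC_ERR_INT value in the result corresponds to the uncorrectable ECC condition discussed above.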
Enter this command in order to verify the current status of the EARL mechanism on the Supervisor module:
# remote command switch show platform hardware earl status
In the case of a DFC line card issue, enter this command:
# remote command module [slot number] show platform hardware earl status
Sample output with the relevant section is shown next. Notice that the AT_UNCORR_ECC_ERR_INT counter has a non-zero value, which justifies module replacement:
6500# remote command switch show platform hardware earl status
<snip>
Adj. table interface block : Total interrupts - 2
AT_SEQ_ERR_INT : 0
AT_FOVR_INT : 0
AT_FUDR_INT : 0
AT_IB_ADJ_INT : 0
AT_BZONE_INT : 0
AT_CORR_ECC_ERR_INT : 0
AT_UNCORR_ECC_ERR_INT : 2
AT_ECC_ERR_DATA_CAPT : 1
If a different value is shown in the Data column in the crashinfo file, open a Cisco Technical Assistance Center (TAC) case and upload the show tech output along with the relevant crashinfo file(s).
Field Notice 63743 might be applicable if the %EARL-xxx-2-PATCH_INVOCATION_LIMIT error is reported.
Revision | Publish Date | Comments |
---|---|---|
1.0 | 02-Mar-2015 | Initial Release |