Cisco 12000 Series Internet Router Parity Error Fault Tree

August 28, 2015 - Machine Translation

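Introduction

This document explains the steps to troubleshoot and isolate the failing part or component of a Cisco 12000 Series Internet Router after you encounter any of a variety of parity error messages.

Note: This document does not cover the causes of parity errors. If you are interested in parity errors (more precisely, single event upsets, or SEUs) and their possible causes, we recommend that you read the document linked from Increasing Network Availability.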

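Before You Begin

Conventions

For more information on document conventions, refer to the Cisco Technical Tips Conventions.

Prerequisites

Before you proceed with this document, we recommend that you read the following documents: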

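Components Used

The information in this document is based on the following software and hardware versions.

  • Cisco 12000 Series Internet Router

  • All versions of Cisco IOS® Software

The information in this document was created from devices in a specific lab environment. All of the devices used in this document started with a cleared (default) configuration. If you are working in a live network, make sure that you understand the potential impact of any command before you use it.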

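Overview

Most Cisco 12000 Series Internet Router route processors and line cards include Error Correction Code (ECC) functionality. There are, however, some existing line cards in the field without ECC functionality. ECC covers only the RAM or synchronous dynamic RAM (SDRAM) memory on the card. The rest of the card is not protected by ECC.

This is a comparison of ECC functionality for the line cards used with the Cisco 12000:

  • All Engine 2 and later cards have ECC functionality.

  • Engine 1 cards changed to ECC after FCS.

  • Engine 0 cards do not have ECC functionality.

  • Some cards can be upgraded to a similar product that integrates ECC functionality.

The table below lists the products that have ECC functionality: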

Non-ECC Product       ECC Product
GRP(=)                GRP-B(=)
GE-SX/LH-SC(=)        GE-GBIC-SC-B(=)
GE-GBIC-SC-A(=)       GE-GBIC-SC-B(=)
8FE-FX-SC(=)          8FE-FX-SC-B(=)
8FE-TX-RJ45(=)        8FE-TX-RJ45-B(=)
6DS3-SMB(=)           6DS3-SMB-B(=)
12DS3-SMB(=)          12DS3-SMB-B(=)
OC12/SRP-IR-SC(=)     OC12/SRP-IR-SC-B(=)
OC12/SRP-MM-SC(=)     OC12/SRP-MM-SC-B(=)
OC12/SRP-LR-SC(=)     OC12/SRP-LR-SC-B(=)

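Note: The -B suffix and ECC are independent of each other. -B means that the product is the second major orderable version of the board. Sometimes, this is the ECC version.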
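When auditing an inventory, the table above can be kept as a simple lookup. The following is a minimal Python sketch under stated assumptions: the function name `ecc_replacement` is hypothetical, only a subset of the table rows is included, and the part numbers are written without the trailing (=).

```python
# Subset of the non-ECC -> ECC product table (illustrative, not exhaustive).
ECC_REPLACEMENT = {
    "GRP": "GRP-B",
    "GE-SX/LH-SC": "GE-GBIC-SC-B",
    "GE-GBIC-SC-A": "GE-GBIC-SC-B",
    "8FE-FX-SC": "8FE-FX-SC-B",
    "6DS3-SMB": "6DS3-SMB-B",
    "OC12/SRP-IR-SC": "OC12/SRP-IR-SC-B",
    "OC12/SRP-LR-SC": "OC12/SRP-LR-SC-B",
}


def ecc_replacement(part_number):
    """Return the ECC product listed for a non-ECC part, or None if not listed."""
    key = part_number.strip().upper()
    if key.endswith("(=)"):
        # Drop the spare-part marker before the lookup.
        key = key[:-3].strip()
    return ECC_REPLACEMENT.get(key)


if __name__ == "__main__":
    print(ecc_replacement("GRP (=)"))
```

A part that is not in the lookup returns None, which only means that this sketch has no entry for it, not that the card lacks ECC.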

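Cisco offers a Technology Migration Program (TMP) that allows you to upgrade a non-ECC board to a new ECC board. A credit is applied to the purchase of the new ECC board in exchange for the non-ECC board.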

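Gigabit Route Processor (GRP) Parity Error Tree Analysis

The flowchart below helps you determine which component of the Cisco 12000 Series Internet Router is responsible for parity/Error Correction Code (ECC) error messages on the Gigabit Route Processor (GRP).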

/image/gif/paws/29320/12000a_parity_error_fault_tree.gif

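Note: During a parity/ECC error event, capture and record the show tech-support output and the console logs, and collect all crashinfo files.

Line Card Parity Error Tree Analysis

The flowchart below helps you determine which component of a Cisco 12000 Series Internet Router line card is responsible for parity/Error Correction Code (ECC) error messages: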

/image/gif/paws/29320/12000b_parity_error_fault_tree.gif

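Note: Whenever a line card experiences a parity/ECC error event, collect as much information as possible (refer to Troubleshooting Line Card Crashes on the Cisco 12000 Series Internet Router for details).

The Cisco 12000 Series Internet Router recovers from parity errors in other line card memory (SDRAM and SRAM) without a failure.

Parity/ECC Errors in the Cisco 12000 Series Gigabit Route Processor

Data with bad parity can be reported by a number of parity-checking devices for any read or write operation in the Cisco 12000 Series Internet Router.

The GRP-B and the PRP use single-bit error correction and multi-bit error detection ECC for shared memory (SDRAM). Single-bit errors in SDRAM are corrected automatically, and the system continues to run as normal.

Single-Bit Errors (SBEs)

The PRP and the GRP-B have an enhanced dynamic RAM (DRAM) controller that supports ECC. Therefore, they can correct single-bit errors and report multi-bit errors. The correction of a single-bit error is reported as follows: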

%Tiger-3-SBE: Single bit error detected and corrected at <address>

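The error correction circuitry corrects SBEs, and the operation of the GRP-B or PRP is not affected. No action is required for single-bit errors unless they occur frequently. In that case, it is advisable to replace the processor board.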
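To judge whether single-bit errors occur "frequently", it can help to count them in a saved console log. The following Python sketch is one way to do that. It assumes the address is printed as a 0x-prefixed hexadecimal value, and the threshold of 5 is a hypothetical placeholder, not a Cisco recommendation.

```python
import re
from collections import Counter

# Matches the single-bit error message quoted above; the 0x-prefixed
# address format is an assumption.
SBE_RE = re.compile(
    r"%Tiger-3-SBE: Single bit error detected and corrected at (0x[0-9A-Fa-f]+)"
)


def count_sbe_by_address(log_lines):
    """Count corrected single-bit errors per reported address."""
    counts = Counter()
    for line in log_lines:
        match = SBE_RE.search(line)
        if match:
            counts[match.group(1).lower()] += 1
    return counts


def is_frequent(counts, threshold=5):
    """True if the total SBE count reaches the (hypothetical) threshold."""
    return sum(counts.values()) >= threshold
```

Repeated hits on the same address are usually more interesting than scattered single events, which is why the counts are kept per address.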

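Multi-Bit Errors (MBEs)

The detection of a multi-bit error is reported through a bus error exception or a CPU cache parity error exception.

Processor Memory Parity Errors (PMPEs)

A processor memory parity error message is reported if the CPU detects a parity error when it accesses the external cache of the processor (L3 on the GRP) through the SysAD bus, or one of the CPU internal caches (L1 or L2). Table 1 lists an example of the message that is printed for each type of cache parity error: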

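Table 1: Cache Parity Error Locations

Location of Parity Error    Error Message
L1 instruction cache        Error: primary, instr cache, fields: data
L1 data cache               Error: primary, data cache, fields: data
L2 instruction cache        Error: SysAD, instr cache, fields: data
L2 data cache               Error: SysAD, data cache, fields: data
L3 instruction cache        Error: SysAD, instr cache, fields: 1st dword
L3 data cache               Error: SysAD, data cache, fields: 1st dword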
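The mapping in Table 1 can be sketched as a small classifier. This is an illustrative Python sketch, not Cisco tooling: the function name `cache_error_location` is hypothetical, and it assumes the first line of the message starts with `Error:` followed by `primary` or `SysAD`, as in the table.

```python
def cache_error_location(first_line):
    """Map the first line of a cache parity error message to a Table 1 location."""
    text = first_line.strip().lower()
    if not text.startswith("error:"):
        return None

    # Instruction cache versus data cache.
    if "instr cache" in text:
        kind = "instruction cache"
    elif "data cache" in text:
        kind = "data cache"
    else:
        return None

    # "primary" is L1; "SysAD" is L2, or L3 when the fields name "1st dword".
    if "primary" in text:
        level = "L1"
    elif "sysad" in text:
        level = "L3" if "1st dword" in text else "L2"
    else:
        return None

    return f"{level} {kind}"
```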

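Example:

The first line of the error message indicates the location of the parity error, and can be any of the locations listed in Table 1. In this example, the location is the L3 instruction cache.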

Error: SysAD, instr cache, fields: data, 1st dword
Physical addr(21:3) 0x000000,
virtual addr 0x6040BF60, vAddr(14:12) 0x3000
virtual address corresponds to main:text, cache word 0  
           Low Data     High Data  Par  Low Data     High Data  Par
L1 Data:   0:0xAE620068 0x8C830000 0x00 1:0x50400001 0xAC600004 0x01          
           2:0xAC800000 0x00000000 0x02 3:0x1600000B 0x00000000 0x01           
           Low Data     High Data  Par  Low Data     High Data  Par
DRAM Data: 0:0xAE620068 0x8C830000 0x00 1:0x50400001 0xAC600004 0x01           
           2:0xAC800000 0x00000000 0x02 3:0x1600000B 0x00000000 0x01

The output of show version should look similar to this:

...System was restarted by processor memory parity error at PC 0x602310D0, 
address 0x0 at 03:18:21 GMT Sun Oct 27 2002 ...

In the show context output, you can see that the system was restarted by a cache parity exception:

Router#show context slot 11
CRASH INFO: Slot 11, Index 1, Crash at 19:08:07 CST Thu Nov 14 2002

VERSION:
GS Software (GSR-P-M), Version 12.0(22)S1, EARLY DEPLOYMENT RELEASE SOFTWARE (fc1)
TAC Support: http://www.cisco.com/tac
Compiled Mon 16-Sep-02 17:36 by nmasa
Card Type: Route Processor, S/N

LC uptime was 0 minutes.
System exception: sig=20, code=0xE42F3E4B, context=0x52CF3D44
System restarted by a Cache Parity Exception
STACK TRACE:
-Traceback= 5020453C 500E5E24 5010E6DC 5015F89C 501E9F6C 501E9F58
...

Replace the GRP or PRP after a second failure.

%GRP-3-PARITYERR Error Message

The following message can appear in the console output:

SEC 7: %GRP-3-PARITYERR: Parity error detected in the fabric buffers. Data (8)

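This message means that a parity error was detected by the fabric interface hardware on the GRP. The hexadecimal number indicates the error interrupt vector. This usually indicates a hardware fault on the GRP that reports the error (in this case, slot 7). Replace the faulty GRP on the second occurrence of a similar problem.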
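To pull the reporting slot and the interrupt vector out of a saved log line, a regular expression is enough. The Python sketch below is a hedged example: the function name is hypothetical, and it assumes the line carries an optional `SEC n:` or `SLOT n:` prefix and a hexadecimal value in parentheses after `Data`, as in the sample above.

```python
import re

PARITYERR_RE = re.compile(
    r"(?:(?:SEC|SLOT) (?P<slot>\d+):\s*)?"
    r"%GRP-3-PARITYERR: .*?Data \((?P<vector>[0-9A-Fa-f]+)\)"
)


def parse_grp_parityerr(line):
    """Return {'slot': int or None, 'vector': int}, or None if no match."""
    match = PARITYERR_RE.search(line)
    if match is None:
        return None
    slot = match.group("slot")
    return {
        "slot": int(slot) if slot is not None else None,
        # The value in parentheses is read as hexadecimal.
        "vector": int(match.group("vector"), 16),
    }
```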

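%PRP-3-SBE_DATA: Bad data [hex] [hex] ECC rec [hex] calc [hex]

This error message appears when the router receives data with bad parity.

Data with bad parity is reported by a number of parity-checking devices for any read or write operation performed in the Cisco 12000 Series Internet Router.

The PRP uses single-bit error correction and multi-bit error detection ECC for shared memory (SDRAM). Single-bit errors in SDRAM are corrected automatically, and the system continues to run as normal.

The error correction circuitry (ECC) corrects single-bit errors (SBEs), and the operation of the PRP is not affected. No action is required for single-bit errors unless they occur frequently.

If the errors occur frequently, it is advisable to replace the processor board.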

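Parity/ECC Errors in Cisco 12000 Series Line Cards

SDRAM ECC Errors

  • SDRAM Single-Bit Error Correction Code (ECC) Errors

    A single-bit error is one incorrect bit of data in a word read from memory. For SBEs, the error can be corrected without interruption to operation.

    The single-bit error is detected, and the corrected data is delivered. For example, a single-bit error is reported as follows on Engine 4/4+: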

    SLOT 6:Jul 19 07:37:34: %TX192-3-SDRAM_SBE: Error=0x2 - DIMM1 Syndrome=0x7600 
    Addr=0xBEA09 Data bit80
    -Traceback= 401C8C9C 401C9508 401CDE08 401CDE40 4007F674 4009ED0C 4009ECF8

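    The error correction circuitry corrects SBEs, and the operation of the line card is not affected. No action is required for single-bit errors unless they occur frequently. In that case, it is advisable to replace the line card.

  • SDRAM Multi-Bit ECC Errors

    A multi-bit error occurs when more than one bit is incorrect in the same word. For MBEs, the error is detected and the line card crashes. Occurrences of SBEs and MBEs are very rare.

    This is an example of the messages printed to the console in response to a multi-bit ECC error in SDRAM: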

    SLOT 5:Jul 25 16:58:51: %MCC192-3-SDRAM_SBE: Error=0x808 - DIMM0 
    Syndrome=0x31000000 Addr=0x81034 Data bit120
    -Traceback= 401C8C9C 401C9508 40450018 400BF7D4
    SLOT 5:Jul 25 16:58:51: %MCC192-3-SDRAM_MBE: Error=0x808 - DIMM0 
    Syndrome=0x18000000 Addr=0x80834
    -Traceback= 401C8D88 401C9508 40450018 400BF7D4

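    MBEs cannot be corrected by ECC, and they cause the line card to fail. The line card is then reloaded and brought back to normal operation by the route processor.

    Field diagnostics can be used to check line card memory for MBEs. Field diagnostics detect MBEs as memory errors. Below is an example of a board that fails field diagnostics because of a multi-bit error in the TX SDRAM: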

    FDIAG_STAT_IN_PROGRESS(5): test #12 TX SDRAM Marching Pattern
    FD 5> RIM:
    FD 5> TX Registers
    FD 5> INT_CAUSE_REG = 0x00000680
    FD 5> Unexpected L3FE Interrupt occured.
    FD 5> ERROR: TX BMA Asic Interrupt Occured
    FD 5> *** 0-INT: External Interrupt ***
    FDIAG_STAT_DONE_FAIL(5) test_num 12, error_code 1
    Field Diagnostic: ****TEST FAILURE**** slot 5: last test run 12,
    TX SDRAM Marching Pattern, error 1
    Field Diag eeprom values: run 5 fail mode 1 (TEST FAILURE) slot 5
    last test failed was 12, error code 1

    If you have a QOC48 or an OC192 line card, refer to this Field Notice: QOC48/OC192 SBEs/MBEs. Otherwise, you should replace the line card after a second failure.
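The SDRAM_SBE and SDRAM_MBE samples above share one field layout (Error, DIMM, Syndrome, Addr), so they can be parsed with a single pattern. The Python sketch below is illustrative only. The function name and the `correctable` flag are hypothetical, and the pattern is derived from the sample lines in this document rather than from a formal message specification.

```python
import re

SDRAM_RE = re.compile(
    r"%(?P<facility>\w+)-3-SDRAM_(?P<kind>SBE|MBE): "
    r"Error=(?P<error>0x[0-9A-Fa-f]+) - (?P<dimm>DIMM\d+) "
    r"Syndrome=(?P<syndrome>0x[0-9A-Fa-f]+) Addr=(?P<addr>0x[0-9A-Fa-f]+)"
)


def parse_sdram_ecc(message):
    """Parse one SDRAM ECC message, which may be wrapped over several lines."""
    # Collapse line wraps and repeated spaces before matching.
    text = " ".join(message.split())
    match = SDRAM_RE.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # SBEs are corrected by ECC; MBEs are only detected.
    fields["correctable"] = fields["kind"] == "SBE"
    return fields
```

Grouping the parsed results by `dimm` and `addr` gives a quick view of whether errors cluster on one module.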

Cache Parity Exceptions

Check the value of the sig= field in the show context slot [slot#] output:

Router#show context slot 4
       CRASH INFO: Slot 4, Index 1, Crash at 04:28:56 EDT Tue Apr 20 1999
       
VERSION:
GS Software (GLC1-LC-M), Version 11.2(15)GS1a, EARLY DEPLOYMENT RELEASE
  SOFTWARE (fc1)
Compiled Mon 28-Dec-98 14:53 by tamb
Card Type: 1 Port Packet Over SONET OC-12c/STM-4c, S/N CAB020500AL
System exception: SIG=20, code=0xA414EF5A, 
context=0x40337424
System restarted by a Cache Parity Exception
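When many crash records need to be checked, the sig= value and the restart reason can be extracted from saved show context output. The following is a minimal Python sketch. The function names are hypothetical, and the patterns are based only on the sample outputs in this document, where sig=20 appears together with a cache parity exception.

```python
import re

SIG_RE = re.compile(r"\bsig=(\d+)", re.IGNORECASE)
REASON_RE = re.compile(r"System restarted by (?:an? )?(.+)", re.IGNORECASE)


def summarize_show_context(output):
    """Return the sig value and restart reason found in show context output."""
    sig = SIG_RE.search(output)
    reason = REASON_RE.search(output)
    return {
        "sig": int(sig.group(1)) if sig else None,
        "reason": reason.group(1).strip() if reason else None,
    }


def is_cache_parity_exception(output):
    """True if the restart reason mentions a cache parity exception."""
    reason = summarize_show_context(output)["reason"]
    return reason is not None and "cache parity exception" in reason.lower()
```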

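Some cards based on the Engine 1 forwarding engine are susceptible to internal cache corruption problems when they operate under very specific voltage and temperature conditions.

The Cache Error Recovery Feature (CERF) is a software feature that detects and corrects cache parity errors from the external CPU cache on Engine 1 line cards. It flushes the erroneous cache line and refreshes it from DRAM. This feature adds intelligence to the CPU cache management algorithm that lets the CPU recover from cache memory parity errors, which prevents line card crashes without a performance impact.

Note: CERF is turned on by default. The activity of this software Error Correction Code (ECC) can be monitored with the show controller cerf command. To turn the feature off, use the no service cerf global configuration command.

Refer to the Field Notice: Cache Parity Errors on 1GE Cards in the GSR for additional information.

To determine which forwarding engine a line card is based on, refer to How can I determine what engine card is running in the box? in the Cisco 12000 Series Internet Router: Frequently Asked Questions document.

If the line card is based on Engine 1, the workaround is to upgrade the Cisco IOS software to a release that contains the Cache Error Recovery Feature (CERF). This feature was first introduced in Cisco IOS Software Release 12.0(21)S3. If the card still fails with a cache parity exception, the line card needs to be replaced.

If the line card is based on another engine type, you should replace the line card on the second occurrence of a similar failure.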
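The guidance above for cache parity exceptions can be summarized as a small decision function. This Python sketch is a reading aid, not an official procedure. The function name and parameters are hypothetical, and the "monitor" result for a first failure on a non-Engine 1 card is an assumption, since the text only states what to do on the second occurrence.

```python
def cache_parity_action(engine, has_cerf, failure_count):
    """Suggest the next step after a cache parity exception on a line card.

    engine        -- forwarding engine number of the line card (0, 1, 2, ...)
    has_cerf      -- True if the running Cisco IOS release contains CERF
    failure_count -- number of similar failures seen so far on this card
    """
    if engine == 1:
        if not has_cerf:
            # CERF was first introduced in Release 12.0(21)S3.
            return "upgrade Cisco IOS to a release that contains CERF"
        # Still failing with CERF present: replace the card.
        return "replace line card"
    if failure_count >= 2:
        return "replace line card"
    # Assumption: a first failure on other engines is only monitored.
    return "monitor"
```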

Error Messages for Engine 0 Based Line Cards

You can find the following messages in the console logs:

SLOT 2:Oct 23 17:07:45.531 EST: %LC-3-L3FEERRS: L3FE DRAM error 12 
address 41E9B9A0
SLOT 2:Oct 23 17:07:45.531 EST: %LC-3-L3FEERR: L3FE error: rxbma 0 addr 0 
txbma 0 addr 0 dram 12 addr 41E9B9A0 io 0 addr 0
SLOT 2:Oct 23 17:07:45.531 EST: %GSR-3-INTPROC: Process Traceback= 40080BAC
	-Traceback= 40357084 40495D30 40496EE0 400CCF98

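This message reports a CPU DRAM write parity error. L3FE stands for Layer 3 Forwarding Engine. The line card should be replaced the second time a similar problem occurs.

Error Messages for Engine 1 Based Line Cards

These are some of the error messages that you can encounter:

  • In the logs of a one-port Gigabit line card: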

    SLOT 5: %LCGE-3-INTR: TX GigaTranslator external interface parity error
    

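    For newer boards, a fix is the replacement of the TX GigaTranslator ASIC with a field-programmable gate array (FPGA). The board should be replaced the second time a similar problem occurs.

  • In the console output: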

    SLOT 6: %LC-3-ECC: Salsa ECC: About to handle ECC single bit error,
    ECC status = 2 DRAM error status = = 21
    SLOT 6: %LC-3-L3FEERR: L3FE error: rxbma 0 addr 0 txbma 0 addr 0 dram 21 
    addr 200020 io 0 addr 0
    SLOT 6: %LC-3-ECC: Salsa ECC: Addresses: Salsa returned =429BFDE8 correcting 
    on = 429BFDE8
    SLOT 6: %MEM_ECC-3-SBE: Single bit error detected and corrected at 0x429BFDE8
    SLOT 6: %MEM_ECC-3-SYNDROME_SBE: 8-bit Syndrome for the detected Single-bit error: 
    0x8A
    SLOT 4: %MEM_ECC-3-SBE_HARD: Single bit *hard* error detected at 0x6299FB60
    SLOT 1:Jun 10 05:29:47.690 EDT: %LC-3-ECC: Salsa ECC:  About to handle ECC single bit error,ECC status = 0 DRAM error status =12
    SLOT 6:Sep 26 15:18:01: %LC-3-SWECC: L2 event cleared: EPC = 0x40631CCC, CERR = 0xE40BB933, SysAD Addr = 1, total = 1
    SLOT 0:Dec  7 13:48:11.480: %LC-3-SWECC_DATA: L2 event cleared: EPC = 0x400A8040, CERR = 0xA01DCE58, l1v = 0x41E3C20441E3C1C5, dv =0x41E3C1C441E3C204, SysAD Addr = 0, total = 1
    

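    These messages can be broken down into the following parts:

    • %LC-3-ECC: Salsa ECC - There is an error in the L3FE ASIC of the line card.

    • %LC-3-L3FEERR - There is an error in the L3FE ASIC register information of the line card.

    • %MEM_ECC-3-SBE - A single-bit correctable error was detected on a read from DRAM. The show memory ecc command can be used to dump the single-bit errors logged so far. This is the same as the %MEM_ECC-3-SBE_LIMIT error message.

    • %MEM_ECC-3-SYNDROME_SBE - The 8-bit syndrome for the detected single-bit error. This value does not indicate the exact positions of the bits in error, but it can be used to approximate their positions. This is the same as the %MEM_ECC-3-SYNDROME_SBE_LIMIT error message.

      Basically, the line card reported a single-bit error and automatically corrected it. No action is required on your part unless this occurs frequently. In that case, it is advisable to replace the line card.

    • %LC-3-SWECC_DATA - Indicates that a cache event was corrected by the software Error Correction Code (SWECC) on the LC in slot 0.

  • Another message that you might encounter is: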

    SLOT 4: %MEM_ECC-3-SBE_HARD: Single bit *hard* error detected at 
    0x6299FB60 

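    This message means that a single-bit uncorrectable error [hard error] was detected on a CPU read from DRAM. The show memory ecc command dumps the single-bit errors logged so far and indicates the address locations of the detected hard errors.

    If there are many occurrences of these errors, monitor the system with the show memory ecc command and replace the DRAM.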
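To separate corrected single-bit errors from hard errors in a saved log, the two %MEM_ECC messages above can be matched with one pattern. The Python sketch below is a hedged illustration: the function name `classify_mem_ecc` is hypothetical, and the pattern follows the sample lines in this document.

```python
import re

MEM_ECC_RE = re.compile(
    r"%MEM_ECC-3-(?P<mnemonic>SBE_HARD|SBE): Single bit (?:\*hard\* )?"
    r"error detected(?: and corrected)? at (?P<addr>0x[0-9A-Fa-f]+)"
)


def classify_mem_ecc(log_text):
    """Return sorted unique addresses of corrected and hard single-bit errors."""
    # Collapse line wraps so a wrapped address still matches.
    text = " ".join(log_text.split())
    corrected, hard = set(), set()
    for match in MEM_ECC_RE.finditer(text):
        if match.group("mnemonic") == "SBE_HARD":
            hard.add(match.group("addr"))
        else:
            corrected.add(match.group("addr"))
    return {"corrected": sorted(corrected), "hard": sorted(hard)}
```

Any address that shows up under "hard" is a candidate to compare against the show memory ecc output on the router.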

Error Messages for Engine 2 Based Line Cards

You can find the following errors in the console output:

SLOT 6: %LC-6-PSAECC: An TLU SDRAM ECC correctable error occurred 
address 19C49FD
SLOT 2:035610: Feb 26 13:09:13.628 UTC: %LC-6-PSAECC: An PLU SDRAM ECC correctable error occurred address 1956059

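This means that the SDRAM protected by the Packet Switching ASIC (PSA) ECC identified a correctable single-bit error. No action is required on your part unless these messages appear frequently. In that case, it is advisable to replace the line card.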
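To see how often these correctable errors appear per slot and per memory (TLU or PLU), a saved log can be tallied. The following Python sketch assumes the message layout of the two samples above. The function name `count_psa_ecc` is hypothetical.

```python
import re
from collections import Counter

PSAECC_RE = re.compile(
    # The tempered [^%] run keeps the match inside one "SLOT n:" message.
    r"SLOT (?P<slot>\d+):(?:(?!SLOT \d+:)[^%])*"
    r"%LC-6-PSAECC: An (?P<mem>\w+) SDRAM ECC correctable error occurred "
    r"address (?P<addr>[0-9A-Fa-f]+)\b"
)


def count_psa_ecc(log_text):
    """Count PSAECC correctable errors keyed by (slot, memory name)."""
    # Collapse line wraps so a wrapped "address ..." still matches.
    text = " ".join(log_text.split())
    counts = Counter()
    for match in PSAECC_RE.finditer(text):
        counts[(int(match.group("slot")), match.group("mem"))] += 1
    return counts
```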

Error Messages for Engine 3 Based Line Cards

You can see these errors in the console output:

SLOT 6:00:03:53: %PM622-3-SAR_SRAM_PARITY_ERR: (6/0): Parity error in Reassembly SAR SRAM address: 80000000.Resetting the port
SLOT 3:00:00:53: %PM622-3-SAR_MULTIBIT_ECC_ERR: (3/0): Multi-bit ECC Uncorrectable error in SAR SDRAM address: 80000000. Resseting the port.
SLOT 4:00:00:53: %PM622-3-SAR_SINGLE_BIT_ECC_ERR: (3/0): ECC corrected an error in SAR SDRAM address: 800000.
SLOT 0:Jun 25 20:45:53 KST: %EE48-6-ALPHAECC: RX ALPHA: An PLU SDRAM ECC correctable error occured address 1000C254
SLOT 0:Jun 25 20:45:53 KST: %EE48-6-ALPHAECC2: RX ALPHA: An PLU SDRAM ECC multibit error occured at address 1000E254
SLOT 5:Nov 17 09:46:30.171: %EE48-6-ALPHA_PARITY: TX ALPHA: Transient SRAM64 parity corrected error 3E Data  0 100000 Parity bits  0
SLOT 10:Feb 21 16:55:36: %EE48-3-ALPHA_SRAM64_ERR: TX ALPHA: ALPHA_PST_RANGE_ERR error 11003F Data  0 0 Parity bits  0
SLOT 4:Jan 15 06:30:00.942 UTC: %EE48-2-GULF_TX_SRAM_ERROR: ASIC GULF: TX SRAM uncorrectable error detected. Details=0x0000
SLOT 0:Mar 16 19:50:22.464 cst: %EE48-4-QM_ZBT_PARITY: ToFab Address 0xB95E Data 0x1
SLOT 5:May 17 06:17:35.507: %EE48-4-QM_NON_ZBT_PARITY: ToFab Error 0x10000028
SLOT 5:May 17 06:17:53.883: %EE48-4-QM_ZBT_PARITY_TRANSIENT: FrFab Address 0x0 Data 0x7E
SLOT 5:May 17 06:17:53.883: %EE48-4-GULF_RX_TB_PARITY_ERROR: ASIC GULF: RX telecom bus parity error on port 0
SLOT 1:Dec 13 00:27:42: %EE48-3-SRAM_PARITY: SRAM parity: Unable to find shadow 281B9EB4
SLOT 0:Aug  4 08:55:37: %EE48-3-QM_PARITY: FrFab Address 0x1859E Data 0x10
SLOT 0:Aug  4 08:55:37: %EE48-3-QM_ERROR: FrFab error register 0x80000.
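Each of the messages above follows the usual Cisco IOS %FACILITY-SEVERITY-MNEMONIC header, where the severity digit runs from 0 (emergencies) to 7 (debugging). The Python sketch below splits a log line into slot, facility, severity, and mnemonic so that a long log can be sorted by severity. The function name is hypothetical, and tolerating a stray space or a missing hyphen after the severity digit is an assumption made to cope with copy-and-paste damage in logs.

```python
import re

SEVERITY_NAMES = [
    "emergencies", "alerts", "critical", "errors",
    "warnings", "notifications", "informational", "debugging",
]

MSG_RE = re.compile(
    r"^(?:SLOT (?P<slot>\d+):)?.*?%\s*(?P<facility>[A-Z0-9_]+)"
    r"-(?P<severity>[0-7])-?\s*(?P<mnemonic>[A-Z0-9_]+):"
)


def parse_ios_message(line):
    """Split a console log line into slot, facility, severity and mnemonic."""
    match = MSG_RE.match(line.strip())
    if match is None:
        return None
    severity = int(match.group("severity"))
    slot = match.group("slot")
    return {
        "slot": int(slot) if slot is not None else None,
        "facility": match.group("facility"),
        "severity": severity,
        "severity_name": SEVERITY_NAMES[severity],
        "mnemonic": match.group("mnemonic"),
    }
```

For example, sorting the parsed records by `severity` puts a severity 2 message such as GULF_TX_SRAM_ERROR ahead of the severity 6 correctable-error reports.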

Error Messages for Engine 4/4+ Based Line Cards

  • You can encounter the following messages on Engine 4/4+ based line cards:

    SLOT 4: %RX192-3-HINTR: status = 0x4000000, mask = 0x3FFFFFFF - 
    Parity error on rx_pbc_mem.
    -Traceback= 401C37C0 403D8814 400BE1EC
    SLOT 4: %LC-3-ERR_INTR: Error interrupt occurred
    -Traceback= 400CE028 400C8DF0 40010A24
    

    SLOT 3: %RX192-3-HINTR: status = 0x4000000, mask = 0x3FFFFFFF - 
    Parity error on rx_pbc_mem.
    -Traceback= 406012E0 406972A0 400C555C
    %FIB-3-FIBDISABLE: Fatal error, slot 3: IPC failure
    

    SLOT 13:Dec  5 07:30:15.272 cst: %HERA-6-PAM_ACL_SBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C
    SLOT 2:00:03:41: %MCC192-6-RED_PARAM1_SBE: Parameter 1 - Single Bit Error detected and corrected 
    Syndrome = 0x7, Address = 0x43, samebit No, diffbit No
    SLOT 2:00:03:41: %MCC192-6-RED_PARAM2_SBE: Parameter 1 - Single Bit Error detected and corrected
    Syndrome = 0x7, Address = 0x43, samebit No, diffbit No
    SLOT 5:Apr 26 11:56:08.160: %MCC192-3-SDRAM_MBE: Error=0x200 - DIMM1 Syndrome=0x3000 Addr=0x811C3
    SLOT 10:Mar  6 05:05:26.965: %RX192-3-ADJ_MEM_MBE: phy addr 0x7905E648, offset 0xBCC9, old ecc 0x0, new ecc 0x0, bit -1, value 0x0 - MBE on Adjacency Memory..
    SLOT 13:Dec  5 07:30:15.272 cst: %HERA-6-PAM_ACL_MBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C
    SLOT 2:00:03:41: %MCC192-6-RED_PARAM1_MBE: Parameter 1 - Single Bit Error detected and corrected
    Syndrome = 0x7, Address = 0x43, samebit No, diffbit No
    SLOT 2:00:03:41: %MCC192-3-RED: Error=0x80000 - RED PARAM 1 ECC SBE Error.
    -Traceback= 405AF5E0 405B1CEC 406DFF7C 406E057C 400FC7E
    SLOT 2:00:03:41: %MCC192-6-RED_PARAM2_MBE: Parameter 1 - Single Bit Error detected and corrected
    Syndrome = 0x7, Address = 0x43, samebit No, diffbit No
    Sep  8 14:32:09 jst: %MEM_ECC-3-SYNDROME_SBE_LIMIT: 
    8-bit Syndrome for the detected Single-bit error: 0xD5
    

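    Symptoms of this problem include:

    • Cisco Express Forwarding is disabled on this line card

    • The associated ports stay up/up

    • The line card might reset automatically

    If the line card does not reset, the workaround is to execute the microcode reload <slot> command.

    This message does not always indicate a hardware problem with the RX192 module. Some Cisco IOS software bugs can generate this error message as a side effect. If this message appears only once, continue to monitor the board. If the problem persists, the card resets automatically. If this message continues to appear, contact your Cisco technical support representative for assistance.

  • SBE events on E4/E4+ can be checked with the show controllers mcc192 ecc command: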

    LC-Slot4#show controllers mcc192 ecc 
    MCC192 SDRAM ECC Counters
            SBE = 0x0,              MBE = 0x0
    TX192 SDRAM ECC Counters
            SBE = 0x0,              MBE = 0x0

    This reports on both RX and TX memory.
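If the show controllers mcc192 ecc output is collected periodically, the counters can be converted to integers and compared between runs. The following Python sketch parses the output format shown above. The function name `parse_ecc_counters` is hypothetical, and the parser assumes each block starts with a "<name> SDRAM ECC Counters" line.

```python
import re

HEADER_RE = re.compile(r"\s*(\w+) SDRAM ECC Counters")
COUNTER_RE = re.compile(r"(SBE|MBE)\s*=\s*(0x[0-9A-Fa-f]+)")


def parse_ecc_counters(output):
    """Return e.g. {'MCC192': {'SBE': 0, 'MBE': 0}, 'TX192': {...}}."""
    counters = {}
    block = None
    for line in output.splitlines():
        header = HEADER_RE.match(line)
        if header:
            # Start a new counter block (MCC192 or TX192 in the sample).
            block = header.group(1)
            counters[block] = {}
            continue
        if block is not None:
            for name, value in COUNTER_RE.findall(line):
                counters[block][name] = int(value, 16)
    return counters
```

A rising SBE value between two samples indicates that new single-bit errors were corrected in the interval.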

Error Messages for Engine 5/5+ Based Line Cards

You can see these errors in the console output:

SLOT 1:Jun 26 20:45:53 KST: %EE192-6-WAHOOECC: RX WAHOO: An PLU SDRAM ECC correctable error occured address 20000254
SLOT 9:Sep 2 21:27:49.680 GMT+8: %MCC192-3-PKTMEM_SBE: Single bit error detected and corrected
SLOT 14:Jul 18 07:19:24.637:  RX_XBMA: 1-bit CPUIM_ECCERR1 error 0x2
SLOT 15:Jan  4 16:53:16.591:  TX_XBMA: (1) QSRAM qinfo SBE detected. info: 0x82605455
SLOT 12:Dec 12 22:34:15: %EE192-4-BM_ERRSSS: FrFab BM BADDR ECC ERR info single bit error(s) corrected, error 8250F63E count:  2
SLOT 1:Nov 22 13:40:02 JST: %EE192-3-QM_ERROR: RX_XBMA OQLLM error error register 0x1
-Traceback= 40AE71AC 406078C4 405F5EC0
SLOT 7:001113: Oct 24 10:50:28.520 BST: %EE192-3-WAHOOERRS: RX WAHOO: WAHOO_CSRAM_CNTRL_INT PIPE0 error 8
SLOT 6:Oct  4 16:48:00.487: %EE192-3-WAHOOERRSSS: RX WAHOO: WAHOO_FFCRAM_CNTRL_INT PIPE0 error 4  addr 3FBFAB8  agent 94
SLOT 7:001114: Oct 24 10:50:28.520 BST: %EE192-3-WAHOOERRSSSS: RX WAHOO: WAHOO_PPC_INT PIPE1 error pl_ctl 4000226 pl_aa_avl F9F7B pl_aa_end 7FF9 pl_aa_fatal 4800000
SLOT 6:Oct  4 16:48:00.487: %EE192-3-WAHOOERRS: RX WAHOO WAHOO_NFC_SRAM_MULTI_ECC_ERR multi-bit CSSRAM error 
SLOT 6:Oct  4 16:48:00.487: %EE192-3-WAHOOERRS: WAHOO_CTCAM_CNTRL_INT multi-bit CSRAM error
SLOT 6:Oct  4 16:48:00.487: %EE192-3-WAHOOERRS: WAHOO_FFCRAM_CNTRL_INT MBE
SLOT 6:Oct  4 16:48:00.487: %EE192-3-WAHOOERRS: FSRAM not OK WAHOO_FSRAM_CNTRL_INT ECC_1_BIT_EE | ECC_UNCORR_EE
SLOT 6:Oct  4 16:48:00.487: %EE192-3-WAHOOERRS: WAHOO_CTCAM_CNTRL_INT multi-bit CSRAM error
SLOT 1:00:01:14: WEEKLY_THROTTLE_SOCKEYE_SBE: SOCKEYE SBE: addr: 0xC2A007C0, synd: 0xC4
SLOT 1:00:01:14: WEEKLY_THROTTLE_CBSRAM_SBE_TX+i: CBSRAM SBE TX: 1-bit CBSRAM error.
SLOT 1:00:01:14: WEEKLY_THROTTLE_CBSRAM_SBE_RX+i: CBSRAM SBE RX: 1-bit CBSRAM error.
SLOT 1:00:01:14: WEEKLY_THROTTLE_CSSRAM_SBE_TX+i: CSSRAM SBE TX: 1-bit CSSRAM error.
SLOT 1:00:01:14: WEEKLY_THROTTLE_CSSRAM_SBE_RX+i: CSSRAM SBE RX: 1-bit CSSRAM error.
SLOT 1:00:01:14: WEEKLY_THROTTLE_CSRAM_SBE_TX+i: CSRAM SBE TX: 1-bit CSRAM error.
SLOT 1:00:01:14: WEEKLY_THROTTLE_CSRAM_SBE_RX+i: CSRAM SBE RX: 1-bit CSRAM error.
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FW_TCAM_PRTY_TX+throttle_i: TX FTCAM PRTY error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FW_TCAM_PRTY_RX+throttle_i: RX FTCAM PRTY error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_CL_TCAM_PRTY_TX+throttle_i: TX CLTCAM PRTY error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_CL_TCAM_PRTY_RX+throttle_i: RX CLTCAM PRTY error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_NF_TCAM_PRTY_TX+throttle_i: TX NFTCAM PRTY error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_NF_TCAM_PRTY_RX+throttle_i: RX NFTCAM PRTY error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_TCAM_PRTY_VMR: TCAM PRTY VMR error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_TCAM_PRTY_NO-VMR: TCAM PRTY NO-VMR error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FCRAM_SBE_TX: FCRAM SBE TX error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FCRAM_SBE_RX: FCRAM SBE TX error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FCRAM_PER_CHIP_SBE_TX: FCRAM CHIP SBE error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FCRAM_PER_CHIP_SBE_RX: FCRAM CHIP SBE error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FSRAM_SBE_TX: FSRAM SBE TX error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FSRAM_SBE_RX: FSRAM SBE RX error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FSRAM_MBE_TX: FSRAM MBE RX error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_W_FSRAM_MBE_RX: FSRAM MBE RX error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_ISERR_TX: ISERR TX error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_ISERR_RX: ISERR RX error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_FCRAM_SBE_TX: FCRAM SBE TX error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_BM_FCRAM_SBE_RX: FCRAM SBE RX error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_LINK_SBE_TX: QSRAM LINK SBE TX error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_LINK_SBE_RX: QSRAM LINK SBE RX error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_QEINFO_SBE_TX: QSRAM queue info sbe tx error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_QEINFO_SBE_TX: QSRAM queue info sbe rx error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_BADDR_SBE_TX: qsram bad addr sbe tx error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_QSRAM_BADDR_SBE_RX: qsram bad addr sbe rx error, status = 0x3
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_OQLLM_SBE_TX: oqllm sbe tx error, status = 0x2
SLOT 1:00:01:14: WEEKLY_THROTTLE_QM_OQLLM_SBE_RX: oqllm sbe rx error status = 0x3

Error Messages for Engine 6 Based Line Cards

You can see these errors in the console output:

SLOT 0:Jan 14 08:53:44.581 GMT: %FIA-3-RAMECCERR: To Fabric ECC error was detected Single Bit Error RAM2 status = 0x8000  
Syndrome = 0x0 addr = 0x0
SLOT 6:Apr 29 09:36:12: %E6LC-4-ECC_THRESHOLD: HERMES VID SBE exceeded threshold, possible memory failure
SLOT 4:*Mar 13 23:38:19.295: %E6_RX192-3-MTRIE_SBE: Head1 Syndrome=0x94 Addr=0xFFF2B 
-Traceback= 40544830 40546A90 40688C94 400EDC18
SLOT 7:*Mar 4 1234:19.295: %E6_RX192-3-ADJ_SBE: Syndrome=0x59 Addr=0xFFF2B
-Traceback= 40000830 40036A90 40555D44 400ddd23
SLOT 14:Dec  9 20:02:29: %E6_RX192-6-PBC_SBE: Single bit error detected and corrected RLDRAM 
Syndrome=0x61 Addr=0xF855
Dec  9 20:02:33: %GRP-4-RSTSLOT: Resetting the card in the slot: 14,Event: linecard error report
SLOT 4:06:21:43: %E6_RX192-3-ACL_SBE: ACTION MEM Syndrome=0x7 Addr=0x0
-Traceback= 40549740 4054A7E0 4068D814 400EE018
SLOT 6:Mar 28 03:30:19: %RX192-3-HINTR: status = 0x1000000000000, mask = 0x7FFFFF0FA320F - L3X SBE error.
-Traceback= 405816DC 406A1010 406A1650 400F70E8
SLOT 6:Mar 28 03:30:19: %E6_RX192-6-VID_SBE: Single bit error detected and corrected VID memory Syndrome=0x19 Addr=0xE51B
SLOT 6:Nov 27 23:32:36: %HERA-3-PKTMEM_SBE: Single bit error detected and corrected Error=0x80 – 
Syndrome=0x5100000000000000 Addr=0x894620 Data bit116
SLOT 7:Oct 2 23:32:36: %HERA-6-MCD_SBE: Single bit error detected and corrected Error=0x50 – 
Syndrome=0x3100000000000000 Addr=0x331110 Data bit216
SLOT 1:Jun 22 03:32:36: %HERA-6-MRW_SBE: Single bit error detected and corrected Error=0x50 – 
Syndrome=0x3100000000000000 Addr=0x331110 Data bit216
SLOT 12:May 24 03:03:36: %HERA-6-UPF_SBE: Single bit error detected and corrected Error=0x60 – 
Syndrome=0x4100000000000000 Addr=0x451140 Data bit216
SLOT 13:Dec  5 07:30:15.272 cst: %HERA-6-PAM_ACL_SBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C
SLOT 9:May  5 18:52:14: %HERA-6-QM_FBF_SBE: Free Block FIFO - Single Bit Error detected and corrected 
Syndrom = 0x10, Addr = 0x778, samebit Yes, diffbit No
SLOT 9:May  5 18:52:14: %HERA-3-QM: Error=0x40 - FBF RAM ECC SBE.
-Traceback= 405AD4CC 405AF5D0 405F2E80 406DCDB8 406DD434 400FC500
SLOT 3:Aug 16 00:45:14: %MCC192-6-RED_AQD_SBE: Average Queue Depth - Single Bit Error detected and corrected 
Syndrome = 0x7, Address = 0x89, samebit No, diffbit No
SLOT 2:Jan 23 06:29:56 KST: %MCC192-6-RED_STAT_SBE: Statistics - Single Bit Error detected and corrected 
Syndrome = 0x38, Address = 0xFF, samebit No, diffbit No
SLOT 4:*Mar 13 23:38:19.295: %E6_RX192-3-MTRIE_MBE: Single bit error detected and corrected Head1 
Syndrome=0x94 Addr=0xFFF2B
SLOT 7:*Mar 4 1234:19.295: %E6_RX192-3-ADJ_MBE: Syndrome=0x59 Addr=0xFFF2B
-Traceback= 40000830 40036A90 40555D44 400ddd23
00:00:18: %E6_RX192-3-PBC_MBE: ADJ OBANK LO Syndrome=0xE5 Addr=0x142
-Traceback= 405BF8B0 405C0F08 406E8D78 406E93B8 400FCCE0
SLOT 6:Mar 28 03:30:19: %E6_RX192-6-VID_MBE: Single bit error detected and corrected VID memory Syndrome=0x19 Addr=0xE51B
SLOT 0:Apr 18 06:44:53.751 GMT: %HERA-3-PKTMEM_MBE: Error=0x1010 - Syndrome=0x9900000000
SLOT 7:Oct 2 23:32:36: %HERA-6-MCD_MBE: Single bit error detected and corrected Error=0x50 – 
Syndrome=0x3100000000000000 Addr=0x331110 Data bit216
SLOT 1:Jun 22 03:32:36: %HERA-6-MRW_MBE: Single bit error detected and corrected Error=0x50 - Syndrome=0x3100000000000000 Addr=0x331110 Data bit216
SLOT 13:Dec  5 07:30:15.272 cst: %HERA-6-PAM_ACL_MBE: PKT CNT MEM Syndrome=0x8 Addr=0x523C
SLOT 9:May  5 18:52:14: %HERA-6-QM_FBF_MBE: Free Block FIFO - Single Bit Error detected and corrected 
Syndrome = 0x10, Addr = 0x778, samebit Yes, diffbit No
SLOT 3:Aug 16 00:45:14: %MCC192-6-RED_AQD_MBE: Average Queue Depth - Single Bit Error detected and corrected 
Syndrome = 0x7, Address = 0x89, samebit No, diffbit No
SLOT 2:Jan 23 06:29:56 KST: %MCC192-6-RED_STAT_MBE: Statistics - Single Bit Error detected and corrected 
Syndrome = 0x38, Address = 0xFF, samebit No, diffbit No
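In the samples above, the mnemonic suffix (_SBE or _MBE) is the most consistent marker of the error type, since the free text of several _MBE samples repeats the single-bit wording. The Python sketch below tallies a log by mnemonic suffix. Treating the suffix as authoritative is an assumption, and the function name is hypothetical.

```python
import re
from collections import Counter

# Header of an IOS message; tolerates a stray space before the mnemonic.
MNEMONIC_RE = re.compile(r"%\s*[A-Z0-9_]+-[0-7]-\s*([A-Z0-9_]+):")


def tally_by_ecc_kind(log_lines):
    """Count messages whose mnemonic ends in _SBE, _MBE, or neither."""
    tally = Counter()
    for line in log_lines:
        match = MNEMONIC_RE.search(line)
        if match is None:
            # Continuation lines (Syndrome=..., -Traceback=...) are skipped.
            continue
        mnemonic = match.group(1)
        if mnemonic.endswith("_SBE"):
            tally["SBE"] += 1
        elif mnemonic.endswith("_MBE"):
            tally["MBE"] += 1
        else:
            tally["other"] += 1
    return tally
```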

SPA Error Messages

You can see these errors in the console output:

SLOT 7:Jan 4 02:04:00.487: %SPA_CHOC_DSX-3-UNCOR_PARITY_ERR:  SPA4/0: CHOC SPA parity error(s) encountered
SLOT 7:Jan 4 02:04:00.487: %MCT1E1-3-UNCOR_PARITY_ERR:  SPA5/0: T1E1 SPA parity error(s) encountered
SLOT 3: 00:33:48: %MCT1E1-3-UNCOR_MEM_ERR: SPA3/0: 1 uncorrectable HDLC SRAM memory error(s) encountered.
SLOT 1:Oct  3 14:42:45.727: %SPA_PLIM-4-SBE_ECC: SPA-4XT3/E3[1/2] reports 2 SBE occurrence at 1 addresses
SLOT 1: Jul 22 05:26:29.613 UTC: %SPA_DATABUS-3-SPI4_SINGLE_DIP4_PARITY: SIP Sbslt 0 Ingress Sink - A single DIP4 parity error has occurred on the data bus.
SLOT 4: Dec  2 22:44:05: %SPA_DATABUS-3-SPI4_SINGLE_DIP2_PARITY: SIP Sbslt 0 Egress Source - A single DIP 2 parity error on the FIFO status bus has occurred.
SLOT 1:Oct  3 14:42:45.727: %SPA_PLIM-4-SBE_OVERFLOW: SPA-4XT3/E3[1/2] reports SBE table (2 elements) overflows
SLOT 1:Oct  3 14:42:45.727: %SPA_PLUGIN-3-SPI4_SETCB: SPA-4XT3/E3[1/2] : IPC SPI4 set callback failed(status 2).

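Parity Errors in Cisco 12000 Series Switch Fabric Cards

All parity error messages related to the switch fabric cards are covered in detail in Hardware Troubleshooting for the Cisco 12000 Series Internet Router. These messages include (non-exhaustive list):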

%FABRIC-3-PARITYERR: To Fabric parity error was detected. Grant parity error 
Data = 0x2.

SLOT 1:%FABRIC-3-PARITYERR: To Fabric parity error was detected. 
Grant parity error Data = 0x1

Document ID: 29320