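Introduction
This document describes an issue with single point of failure (SPOF) alarms for link aggregation (LAG) ports on the ASR 5000 after a port bounce. Because no actual fault exists, these false alarms can cause tickets to be opened unnecessarily.
Affected Products
Any ASR 5000 that uses LAG ports is affected.
Symptoms
On the ASR 5000 platform, SPOF alarms are triggered unnecessarily for LAG-based 10 Gigabit Ethernet line cards (XGLCs). Whenever a LAG port goes down (trap PortDown), the CardSPOFClear trap is triggered, and when the port comes back up (trap PortUp), the CardSPOFAlarm trap is triggered. A port can bounce for many reasons, including a PSC migration, an npumgr restart, a hardware failure, a chassis reload, or an externally caused link problem. The following snippet shows the individual SPOF traps for a bounce of port 19/1. A LAG switchover typically produces these traps for every port that bounces in the process.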
Tue Jan 21 07:35:55 2014 Internal trap notification 1024 (PortDown) card 19 port 1 port type 10G Ethernet
Tue Jan 21 07:35:55 2014 Internal trap notification 1503 (EntStateOperDisabled) Port(19/1) Admin state:"Locked", Alarm severity:"Major"
Tue Jan 21 07:35:55 2014 Internal trap notification 93 (CardStandby) card 19 type 10 Gig Ethernet Line Card
Tue Jan 21 07:35:55 2014 Internal trap notification 140 (CardSPOFClear) card 19 type 10 Gig Ethernet Line Card
Tue Jan 21 07:40:36 2014 Internal trap notification 1025 (PortUp) card 19 port 1 port type 10G Ethernet
Tue Jan 21 07:40:51 2014 Internal trap notification 139 (CardSPOFAlarm) card 19 type 10 Gig Ethernet Line Card
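The trap sequence above can be checked mechanically. The following Python sketch is not a Cisco tool. It assumes the trap log lines have exactly the layout shown in the snippet, and it flags every CardSPOFAlarm that follows a PortUp on the same card within a short window. The function names and the 60-second default window are hypothetical choices made for illustration.

```python
import re
from datetime import datetime

# Sample copied from the trap log in this document.
SAMPLE_LOG = """
Tue Jan 21 07:35:55 2014 Internal trap notification 1024 (PortDown) card 19 port 1 port type 10G Ethernet
Tue Jan 21 07:35:55 2014 Internal trap notification 1503 (EntStateOperDisabled) Port(19/1) Admin state:"Locked", Alarm severity:"Major"
Tue Jan 21 07:35:55 2014 Internal trap notification 93 (CardStandby) card 19 type 10 Gig Ethernet Line Card
Tue Jan 21 07:35:55 2014 Internal trap notification 140 (CardSPOFClear) card 19 type 10 Gig Ethernet Line Card
Tue Jan 21 07:40:36 2014 Internal trap notification 1025 (PortUp) card 19 port 1 port type 10G Ethernet
Tue Jan 21 07:40:51 2014 Internal trap notification 139 (CardSPOFAlarm) card 19 type 10 Gig Ethernet Line Card
"""

# Assumed line layout: "<timestamp> Internal trap notification <n> (<TrapName>) <details>"
TRAP_RE = re.compile(
    r"^(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \d{4}) "
    r"Internal trap notification \d+ \((?P<name>\w+)\) (?P<rest>.*)$"
)
# The card number appears either as "card 19" or inside "Port(19/1)".
CARD_RE = re.compile(r"card (\d+)|Port\((\d+)/\d+\)")


def parse_traps(text):
    """Return a list of (timestamp, trap_name, card) tuples in log order."""
    events = []
    for line in text.splitlines():
        m = TRAP_RE.match(line.strip())
        if not m:
            continue  # skip blank lines and anything that is not a trap line
        c = CARD_RE.search(m["rest"])
        card = int(c.group(1) or c.group(2)) if c else None
        ts = datetime.strptime(m["ts"], "%a %b %d %H:%M:%S %Y")
        events.append((ts, m["name"], card))
    return events


def spof_alarms_after_port_up(events, window_seconds=60):
    """Return (card, port_up_time, alarm_time) for each CardSPOFAlarm that
    follows a PortUp on the same card within window_seconds (hypothetical default)."""
    flagged = []
    last_port_up = {}
    for ts, name, card in events:
        if name == "PortUp":
            last_port_up[card] = ts
        elif name == "CardSPOFAlarm" and card in last_port_up:
            delta = (ts - last_port_up[card]).total_seconds()
            if 0 <= delta <= window_seconds:
                flagged.append((card, last_port_up[card], ts))
    return flagged


if __name__ == "__main__":
    for card, up, alarm in spof_alarms_after_port_up(parse_traps(SAMPLE_LOG)):
        print(f"card {card}: CardSPOFAlarm at {alarm} follows PortUp at {up}")
```

A flagged entry only means that the alarm matches the port-bounce pattern. Whether the card is actually LAG-configured still has to be confirmed from the configuration.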
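Beginning with v15.0, deployed in January 2015, the alarm mechanism is also notified in addition to the SNMP traps. Here is the matching alarm for the example: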
******** show alarm outstanding verbose *******
Severity Object     Timestamp                          Alarm ID
-------- ---------- ---------------------------------- ---------------------
Alarm Details
--------------------------------------------------------------------------------
Minor    Card 19    Tuesday January 21 07:40:51        5769809167128920064
10 Gig Ethernet Line Card in slot 19 is a single point of failure. 10 Gig Ethernet Line Card is needed in slot 20.
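Solution
For the reasons explained in the Root Cause Analysis section, SPOF alarms for LAG-configured cards should simply be ignored and cleared. Use the clear alarm command either to clear all outstanding alarms (which also clears non-SPOF alarms, if that is desired) or to clear only a specific SPOF alarm by specifying the alarm ID reported by show alarm outstanding [verbose]. For the example above: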
clear alarm id 5769809167128920064
or
clear alarm all
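Note: The alarm remains outstanding indefinitely unless another port bounce occurs, in which case a new alarm (as indicated by its timestamp) replaces the existing one.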
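When several alarms are outstanding, the per-ID form of the command avoids clearing unrelated alarms. The following Python sketch shows one way to build those commands from captured show alarm outstanding verbose output. It assumes the layout shown in this document: one summary line that ends with the alarm ID, followed directly by the detail line, with the detail text in English. The function name is hypothetical, and the script only generates command strings. It does not connect to the chassis.

```python
import re

# Sample modeled on the alarm output in this document.
SAMPLE_ALARMS = """
Severity Object     Timestamp                          Alarm ID
-------- ---------- ---------------------------------- ---------------------
Alarm Details
--------------------------------------------------------------------------------
Minor    Card 19    Tuesday January 21 07:40:51        5769809167128920064
10 Gig Ethernet Line Card in slot 19 is a single point of failure. 10 Gig Ethernet Line Card is needed in slot 20.
"""

# Assumed summary line: <severity> <object> <weekday month day time> <numeric alarm ID>
SUMMARY_RE = re.compile(
    r"^(?P<severity>\w+)\s+(?P<object>.+?)\s+"
    r"(?P<timestamp>\w+day\s.+?)\s+(?P<alarm_id>\d{10,})$"
)


def spof_clear_commands(output):
    """Return one 'clear alarm id <id>' string per outstanding SPOF alarm."""
    lines = [line.strip() for line in output.splitlines()]
    commands = []
    for i, line in enumerate(lines):
        m = SUMMARY_RE.match(line)
        if not m:
            continue
        # Assumption: the detail text is on the line right after the summary line.
        details = lines[i + 1] if i + 1 < len(lines) else ""
        if "single point of failure" in details.lower():
            commands.append(f"clear alarm id {m['alarm_id']}")
    return commands


if __name__ == "__main__":
    for command in spof_clear_commands(SAMPLE_ALARMS):
        print(command)
```

Review the generated commands before you paste them into a CLI session, because the filter relies only on the wording of the detail line.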
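Root Cause Analysis
By design, redundancy for LAG is provided by the LAG itself rather than at the card level, so all LAG-configured cards are always in the Active operational state and none of them is in Standby. Consequently, the configuration of LAG-configured cards does not specify any redundancy.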
show port info
...
Card 23:                                         card 26:
Card Type         : 10 Gig Ethernet Line Card    Card Type         : 10 Gig Ethernet Line Card
Operational State : Active                       Operational State : Active
Redundant With    : None                         Redundant With    : None
******** show card table all ********
Slot       Card Type                                Oper State    SPOF Attach
---------- ---------------------------------------- ------------- ---- ------
19: LC     10 Gig Ethernet Line Card                Active        Yes  3
20: LC     10 Gig Ethernet Line Card                Active        Yes  4
21: LC     1000 Ethernet Line Card                  Active        No   5
22: LC     1000 Ethernet Line Card                  Active        No   6
23: LC     10 Gig Ethernet Line Card                Active        Yes  7
24: SPIO   Switch Processor I/O Card                Active        No   8
25: SPIO   Switch Processor I/O Card                Active        No   8
26: LC     10 Gig Ethernet Line Card                Active        Yes  10
27: LC     10 Gig Ethernet Line Card                Active        Yes  11
28: LC     10 Gig Ethernet Line Card                Active        Yes  12
29: LC     10 Gig Ethernet Line Card                Active        Yes  13
30: LC     10 Gig Ethernet Line Card                Active        Yes  14
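In contrast, the configuration of non-LAG cards does specify redundancy. The following example is a configuration without any LAG ports. In this case a SPOF alarm is significant and should be investigated. The card table after the configuration shows the active/standby XGLC pairs.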
card 19
  redundant with 20
#exit
card 23
  redundant with 26
#exit
card 27
  redundant with 28
#exit
card 29
  redundant with 30
#exit
[local]ASR5000> show card table all
Slot        Card Type                              Oper State    SPOF Attach
----------- -------------------------------------- ------------- ---- ------
...
19: LC      10 Gig Ethernet Line Card              Active        No   3
20: LC      10 Gig Ethernet Line Card              Standby       -    4
21: LC      1000 Ethernet Line Card                Active        No   5
22: LC      1000 Ethernet Line Card                Active        No   6
23: LC      10 Gig Ethernet Line Card              Active        No   7
24: SPIO    Switch Processor I/O Card              Active        No   8
25: SPIO    Switch Processor I/O Card              Active        No   8
26: LC      10 Gig Ethernet Line Card              Standby       -    10
27: LC      10 Gig Ethernet Line Card              Active        No   11
28: LC      10 Gig Ethernet Line Card              Standby       -    12
29: LC      10 Gig Ethernet Line Card              Active        No   13
30: LC      10 Gig Ethernet Line Card              Standby       -    14
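The distinction between the two card tables can be turned into a simple triage step. The following Python sketch parses show card table all output and separates slots that report SPOF Yes into those that are expected (LAG-configured cards) and those that should be investigated. It assumes the row layout shown in this document. The lag_slots argument is a hypothetical, operator-supplied set of slot numbers, because the card table alone does not say which cards carry LAG ports.

```python
import re

# Rows copied from the LAG-configured card table in this document.
SAMPLE_TABLE = """
Slot       Card Type                                Oper State    SPOF Attach
---------- ---------------------------------------- ------------- ---- ------
19: LC     10 Gig Ethernet Line Card                Active        Yes  3
20: LC     10 Gig Ethernet Line Card                Active        Yes  4
21: LC     1000 Ethernet Line Card                  Active        No   5
22: LC     1000 Ethernet Line Card                  Active        No   6
23: LC     10 Gig Ethernet Line Card                Active        Yes  7
"""

# Assumed row layout: "<slot>: <kind> <card type> <oper state> <Yes|No|-> <attach>"
ROW_RE = re.compile(
    r"^(?P<slot>\d+):\s+(?P<kind>\w+)\s+(?P<card_type>.+?)\s+"
    r"(?P<oper>\S+)\s+(?P<spof>Yes|No|-)\s+(?P<attach>\d+)$"
)


def parse_card_table(output):
    """Return one dict per card row; header and separator lines are skipped."""
    rows = []
    for line in output.splitlines():
        m = ROW_RE.match(line.strip())
        if m:
            rows.append({
                "slot": int(m["slot"]),
                "card_type": m["card_type"],
                "oper": m["oper"],
                "spof": m["spof"],
            })
    return rows


def triage_spof(rows, lag_slots):
    """Split SPOF=Yes slots into (expected on LAG cards, needs investigation)."""
    spof_slots = [r["slot"] for r in rows if r["spof"] == "Yes"]
    expected = [s for s in spof_slots if s in lag_slots]
    investigate = [s for s in spof_slots if s not in lag_slots]
    return expected, investigate


if __name__ == "__main__":
    rows = parse_card_table(SAMPLE_TABLE)
    expected, investigate = triage_spof(rows, lag_slots={19, 20, 23})
    print("expected on LAG cards:", expected)
    print("investigate:", investigate)
```

Because the regular expression matches on runs of whitespace, it accepts both aligned output and output whose column spacing has been collapsed.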