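This document explains how to troubleshoot Nexus 9000 TCAM when ports go down due to UDLD errors.
It covers current and common concepts, troubleshooting methods, and error messages.
The purpose of this document is to help users understand how to troubleshoot TCAM when ports go down due to UDLD errors.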
Prerequisites
Knowledge of Cisco NX-OS commands
NX-OS TCAM configuration
Topology
The problem can be seen in a simple topology
(N9k-1) Eth2/1-2 ---------------------- (N9k-2) Eth2/1-2
1.1.1.1/24                              1.1.1.2/24
Troubleshoot
The following protocols fail to work at the control plane:
ARP resolution fails
Ports on the Nexus 9000 report UDLD errors on modules 1 & 2.
N9K-1(config-if)#
2018 Oct 20 07:23:23 N9K-1 %ETHPORT-5-IF_ADMIN_UP: Interface port-channel100 is admin up .
2018 Oct 20 07:23:23 N9K-1 %ETHPORT-5-IF_DOWN_PORT_CHANNEL_MEMBERS_DOWN: Interface port-channel100 is down (No operational members)
2018 Oct 20 07:23:23 N9K-1 last message repeated 1 time
2018 Oct 20 07:23:23 N9K-1 %ETHPORT-5-IF_DOWN_ERROR_DISABLED: Interface Ethernet2/2 is down (Error disabled. Reason:UDLD empty echo)
2018 Oct 20 07:23:23 N9K-1 last message repeated 1 time
2018 Oct 20 07:23:23 N9K-1 %ETHPORT-5-IF_DOWN_ERROR_DISABLED: Interface Ethernet2/1 is down (Error disabled. Reason:UDLD empty echo)
2018 Oct 20 07:23:25 N9K-1 last message repeated 1 time
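When many ports are involved, it can help to pull the affected interfaces out of the syslog programmatically. The following is a minimal Python sketch, assuming the log has been saved as plain text with one message per line; the function name `udld_err_disabled` and the sample text are illustrative and not part of any Cisco tooling.

```python
import re

# Matches the ETHPORT err-disable message format shown in the log above.
ERR_DISABLED = re.compile(
    r"%ETHPORT-5-IF_DOWN_ERROR_DISABLED: Interface (\S+) is down "
    r"\(Error disabled\. Reason:(.*?)\)"
)


def udld_err_disabled(log_text):
    """Return the sorted, de-duplicated interfaces err-disabled for a UDLD reason."""
    hits = set()
    for iface, reason in ERR_DISABLED.findall(log_text):
        if reason.strip().startswith("UDLD"):
            hits.add(iface)
    return sorted(hits)


# Sample text copied from the log excerpt in this document.
SAMPLE_LOG = """\
2018 Oct 20 07:23:23 N9K-1 %ETHPORT-5-IF_ADMIN_UP: Interface port-channel100 is admin up .
2018 Oct 20 07:23:23 N9K-1 %ETHPORT-5-IF_DOWN_ERROR_DISABLED: Interface Ethernet2/2 is down (Error disabled. Reason:UDLD empty echo)
2018 Oct 20 07:23:23 N9K-1 %ETHPORT-5-IF_DOWN_ERROR_DISABLED: Interface Ethernet2/1 is down (Error disabled. Reason:UDLD empty echo)
"""

print(udld_err_disabled(SAMPLE_LOG))
```

Messages with a reason other than UDLD are ignored, so the result only lists ports relevant to this problem.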
The line cards in modules 1 & 2 of the chassis fail the L2ACLRedirect diagnostic test.
'show module'
Mod  Online Diag Status
---  ------------------
1    Fail    <-- cleared the module 1 and 2 error [show logging nvram]
2    Fail    <-- module 2 reloaded
3    Pass

Module 1 and 2:
11) L2ACLRedirect-----------------> E
12) BootupPortLoopback: U
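Another way to hit this state is to move a SUP/LC from a T2 ASIC-based chassis to a Tahoe-based chassis.
Note: For more information about troubleshooting the ASIC, contact Cisco TAC.
CSCvc36411: Upgrade from T2 to Tahoe-based line card/FM can cause diagnostic failures and TCAM issues
Analysis
This problem is seen when the TCAM values are set to 0 on N9K-2.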
N9K-2# sh hardware access-list tcam region
NAT ACL[nat] size = 0
Ingress PACL [ing-ifacl] size = 0
VACL [vacl] size = 0
Ingress RACL [ing-racl] size = 0
Ingress RBACL [ing-rbacl] size = 0
Ingress L2 QOS [ing-l2-qos] size = 0
Ingress L3/VLAN QOS [ing-l3-vlan-qos] size = 0
Ingress SUP [ing-sup] size = 0
Ingress L2 SPAN filter [ing-l2-span-filter] size =
Ingress L3 SPAN filter [ing-l3-span-filter] size = 0
Ingress FSTAT [ing-fstat] size = 0
span [span] size = 0
Egress RACL [egr-racl] size = 0
Egress SUP [egr-sup] size = 0
Ingress Redirect [ing-redirect] size = 0
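The region listing above can be checked automatically for zero-sized regions. The sketch below assumes the command output has been captured as text in the `Description [name] size = N` form shown here. The helper names and the choice of regions to check (`ing-sup` and `ing-redirect`, both restored later in this document) are illustrative assumptions rather than an authoritative list of critical regions.

```python
import re

# Captures the bracketed region name and its size; the size may be blank,
# as it is for ing-l2-span-filter in the output above.
REGION = re.compile(r"\[([\w-]+)\][ \t]+size[ \t]*=[ \t]*(\d*)")


def parse_tcam_regions(output):
    """Map region name -> size as int, or None when no size was printed."""
    regions = {}
    for name, size in REGION.findall(output):
        regions[name] = int(size) if size else None
    return regions


def zero_sized(regions, names):
    """Return the regions from `names` whose carved size is exactly 0."""
    return [name for name in names if regions.get(name) == 0]


# A few lines copied from the output in this document.
SAMPLE = """\
Ingress RACL [ing-racl] size = 0
Ingress SUP [ing-sup] size = 0
Ingress L2 SPAN filter [ing-l2-span-filter] size =
Ingress Redirect [ing-redirect] size = 0
"""

regions = parse_tcam_regions(SAMPLE)
print(zero_sized(regions, ["ing-sup", "ing-redirect"]))
```

A missing size is kept as `None` instead of being treated as 0, so an incomplete capture is not mistaken for a zero-sized region.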
To isolate the problem further, remove UDLD. Connectivity still fails.
The ARP request leaves N9K-2
N9K-2# ethanalyzer local interface inband
Capturing on inband
2018-10-23 10:46:47.282551 1.1.1.1 -> 1.1.1.2 ICMP Echo (ping) request
2018-10-23 10:46:47.286072 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
2018-10-23 10:46:49.284704 1.1.1.1 -> 1.1.1.2 ICMP Echo (ping) request
2018-10-23 10:46:51.286150 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
2018-10-23 10:46:51.286802 1.1.1.1 -> 1.1.1.2 ICMP Echo (ping) request
2018-10-23 10:46:53.288989 1.1.1.1 -> 1.1.1.2 ICMP Echo (ping) request
2018-10-23 10:46:55.289920 1.1.1.1 -> 1.1.1.2 ICMP Echo (ping) request
2018-10-23 10:46:57.292070 1.1.1.1 -> 1.1.1.2 ICMP Echo (ping) request
2018-10-23 10:46:59.292568 1.1.1.1 -> 1.1.1.2 ICMP Echo (ping) request
2018-10-23 10:46:59.292818 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
10 packets captured
N9K-1# ethanalyzer local interface inband
Capturing on inband
2018-10-23 04:02:40.568119 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
2018-10-23 04:02:40.568558 cc:46:d6:af:ff:bf -> b0:aa:77:30:75:bf ARP 1.1.1.1 is at cc:46:d6:af:ff:bf
2018-10-23 04:02:48.574800 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
2018-10-23 04:02:48.575230 cc:46:d6:af:ff:bf -> b0:aa:77:30:75:bf ARP 1.1.1.1 is at cc:46:d6:af:ff:bf    <-- arp reply packet sent by agg1.
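The two inband captures differ in one key way: N9K-1 sees both the ARP request and its own reply, while the N9K-2 capture shows only the requests. A small Python sketch, assuming the ethanalyzer text has been saved with one packet per line, can make that comparison explicit. The function name `arp_summary` and the sample strings are illustrative.

```python
import re

# Summary-line formats as they appear in the ethanalyzer output above.
REQUEST = re.compile(r"ARP Who has (\S+)\? Tell (\S+)")
REPLY = re.compile(r"ARP (\S+) is at ([0-9a-f:]{17})")


def arp_summary(capture):
    """List the IPs that were requested and those that never got a reply."""
    requested = {target for target, _sender in REQUEST.findall(capture)}
    answered = {ip for ip, _mac in REPLY.findall(capture)}
    return {
        "requested": sorted(requested),
        "unanswered": sorted(requested - answered),
    }


# Lines copied from the N9K-2 and N9K-1 captures in this document.
N9K2_CAPTURE = """\
2018-10-23 10:46:47.286072 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
2018-10-23 10:46:51.286150 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
"""
N9K1_CAPTURE = """\
2018-10-23 04:02:40.568119 b0:aa:77:30:75:bf -> ff:ff:ff:ff:ff:ff ARP Who has 1.1.1.1? Tell 1.1.1.2
2018-10-23 04:02:40.568558 cc:46:d6:af:ff:bf -> b0:aa:77:30:75:bf ARP 1.1.1.1 is at cc:46:d6:af:ff:bf
"""

print("N9K-2:", arp_summary(N9K2_CAPTURE))
print("N9K-1:", arp_summary(N9K1_CAPTURE))
```

An address listed as unanswered on N9K-2 but answered on N9K-1 indicates that the reply is sent by the peer but does not reach the N9K-2 supervisor.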
The ELAM capture on N9K-2 shows the ARP reply from N9K-1
Note: Contact Cisco TAC to validate the ELAM capture
module-2(TAH-elam-insel6)# report
Initting block addresses
SUGARBOWL ELAM REPORT SUMMARY
slot - 2, asic - 1, slice - 0
============================
Incoming Interface: Eth2/2
Src Idx : 0x42, Src BD : 4489
Outgoing Interface Info: dmod 0, dpid 0
Dst Idx : 0x0, Dst BD : 4489
Packet Type: ARP
Dst MAC address: B0:AA:77:30:75:BF
Src MAC address: CC:46:D6:AF:FF:BF
Target Hardware address: B0:AA:77:30:75:BF    <-- Arp packet captured on Linecard
Sender Hardware address: CC:46:D6:AF:FF:BF
Target Protocol address: 1.1.1.2
Sender Protocol address: 1.1.1.1
ARP opcode: 2
Drop Info:
module-2(TAH-elam-insel6)#
However, ping still fails
N9K-2# ping 1.1.1.1
PING 1.1.1.1 (1.1.1.1): 56 data bytes
36 bytes from 1.1.1.2: Destination Host Unreachable
Request 0 timed out
36 bytes from 1.1.1.2: Destination Host Unreachable
Request 1 timed out
36 bytes from 1.1.1.2: Destination Host Unreachable
Request 2 timed out
36 bytes from 1.1.1.2: Destination Host Unreachable
Request 3 timed out
36 bytes from 1.1.1.2: Destination Host Unreachable
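N9K-2# show ip arp | inc 1.1.1.1    <-- ARP entry is not populated
To isolate the ARP problem, add a static ARP entry and disable UDLD.
With the static ARP entry in place, ping from 1.1.1.2 to 1.1.1.1 starts to work, but it fails again if UDLD is enabled.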
N9K-2(config)# ping 1.1.1.2
PING 1.1.1.2 (1.1.1.2): 56 data bytes
64 bytes from 1.1.1.2: icmp_seq=0 ttl=255 time=0.32 ms
64 bytes from 1.1.1.2: icmp_seq=1 ttl=255 time=0.285 ms
64 bytes from 1.1.1.2: icmp_seq=2 ttl=255 time=0.282 ms
64 bytes from 1.1.1.2: icmp_seq=3 ttl=255 time=0.284 ms
64 bytes from 1.1.1.2: icmp_seq=4 ttl=255 time=0.291 ms
Although ping works, the UDLD error is still seen on the interface when UDLD is enabled.
There are no CoPP drops, as shown below
N9K-2# show hardware internal cpu-mac inband active-fm traffic-to-sup Active FM Module for traffic to sup: 0x00000016———————————————————————————Module 22. N9K-2# show policy-map interface control-plane module 22 | inc dropp dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes; dropped 0 bytes;
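Scanning a long list of per-class drop counters by eye is error prone. The following Python sketch, assuming the filtered CoPP output has been saved as text, totals the `dropped N bytes` counters so that a non-zero value stands out. The function name `total_copp_drops` is illustrative.

```python
import re

# Matches each per-class counter in the filtered CoPP output above.
DROPPED = re.compile(r"dropped\s+(\d+)\s+bytes")


def total_copp_drops(output):
    """Sum the byte counts of every 'dropped N bytes' counter in the text."""
    return sum(int(count) for count in DROPPED.findall(output))


# Three counters in the same form as the output in this document.
SAMPLE = "dropped 0 bytes;\ndropped 0 bytes;\ndropped 0 bytes;\n"

print(total_copp_drops(SAMPLE))
```

A total of zero matches what the output above shows and rules out CoPP policing as the cause of the lost control traffic.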
The active FM for traffic to the Sup is module 22. To verify, run the command below.
module-30# show mvdxn internal port-status
Switch type: Marvell 98DXN41 - 4 port switch
Port Descr                    Enable Status ANeg Speed Mode InByte     OutByte    InPkts     OutPkts
---- ------------------------ ------ ------ ---- ----- ---- ---------- ---------- ---------- ----------
6    Local AXP CPU            Yes    UP     No   2     6    781502852  1006219901 6868852    3506128
7    This SC BCM EOBC switch  Yes    UP     No   2     6    654791960  430206276  1833465    3523170
8    Other SC BCM EOBC switch Yes    DOWN   No   2     6    72282      176        3          2
9    This SC EPC switch       Yes    UP     No   2     6    351355874  351309506  1672662    3345683

Switch type: Marvell 98DXN11 - 10 port switch
Port Descr                    Enable Status ANeg Speed Mode InByte     OutByte    InPkts     OutPkts
---- ------------------------ ------ ------ ---- ----- ---- ---------- ---------- ---------- ----------
0    FM6 EPC switch           Yes    DOWN   No   2     6    0          0          0          0
1    FM5 EPC switch           Yes    DOWN   No   2     6    0          0          0          0
2    SUP ALT EPC              Yes    DOWN   No   2     6    0          0          0          0
3    SUP PRI EPC              Yes    DOWN   No   2     6    0          0          0          0
4    FM4 EPC switch           Yes    DOWN   No   2     6    0          0          0          0
5    FM3 EPC switch           Yes    DOWN   No   2     6    0          0          0          0
6    FM2 EPC switch           Yes    DOWN   No   2     6    0          0          0          0
7    FM1 EPC switch           Yes    DOWN   No   2     6    0          0          0          0
8    Other SC EPC switch      Yes    UP     No   2     6    351356399  351310095  1672664    3345687
9    Local SC 4-port switch   Yes    UP     No   2     6    351310031  351356399  3345688    1672664

Rule Rule_name            Match_ctr            Pol_en Pol_idx inProfileBytes       outOfProfileBytes
---- -------------------- -------------------- ------ ------- -------------------- --------------------
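Setting the TCAM values to 0 causes all control traffic to be dropped on the line card.
After the TCAM values are changed back to the defaults, UDLD comes up and ARP is resolved.
Configuration added to N9K-2 to resolve the problem
A reload is required after the configuration change.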
N9K-2(config)# hardware access-list tcam region ing-sup 512
Warning: Please reload all linecards for the configuration to take effect
N9K-2(config)# hardware access-list tcam region ing-racl 1536
Warning: Please reload all linecards for the configuration to take effect
N9K-2(config)# hardware access-list tcam region ing-l2 ing-l2-qos ing-l2-span-filter
N9K-2(config)# hardware access-list tcam region ing-l2-qos 256
Warning: Please reload all linecards for the configuration to take effect
N9K-2(config)# hardware access-list tcam region ing-l3-vlan-qos 512
Warning: Please reload all linecards for the configuration to take effect
N9K-2(config)# hardware access-list tcam region ing-l2 ing-l2-qos ing-l2-span-filter
N9K-2(config)# hardware access-list tcam region ing-l2-span-filter 256
N9K-2(config)# hardware access-list tcam region ing-l3-span-filter 256
N9K-2(config)# hardware access-list tcam region span 512
Warning: Please reload all linecards for the configuration to take effect
N9K-2(config)# hardware access-list tcam region egr-racl 1792
Warning: Please reload all linecards for the configuration to take effect
N9K-2(config)# show run | grep tcam
hardware access-list tcam region ing-redirect 0
N9K-2(config)# hardware access-list tcam region ing-redirect 256
Warning: Please reload all linecards for the configuration to take effect
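If the same restoration has to be repeated, the configuration lines can be generated from a table of region sizes. The Python sketch below uses the sizes entered in the session above. They are the values this switch was restored to, and whether they suit another platform or line card is an assumption that must be verified first. The names `RESTORED_SIZES` and `tcam_config_lines` are illustrative.

```python
# Region sizes copied from the session above. They are not verified
# defaults for every platform or line card.
RESTORED_SIZES = {
    "ing-sup": 512,
    "ing-racl": 1536,
    "ing-l2-qos": 256,
    "ing-l3-vlan-qos": 512,
    "ing-l2-span-filter": 256,
    "ing-l3-span-filter": 256,
    "span": 512,
    "egr-racl": 1792,
    "ing-redirect": 256,
}


def tcam_config_lines(sizes):
    """Build one 'hardware access-list tcam region' line per region."""
    lines = []
    for region, size in sizes.items():
        if size < 0:
            raise ValueError(f"negative size for region {region}: {size}")
        lines.append(f"hardware access-list tcam region {region} {size}")
    return lines


for line in tcam_config_lines(RESTORED_SIZES):
    print(line)
```

The generated lines still have to be entered in configuration mode, and the line cards reloaded, as the warnings in the session indicate.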
Useful commands
show hardware access-list tcam region
show run | inc tcam    <-- no output means the TCAM regions are set to their defaults
Useful links