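This article describes how to troubleshoot basic optimization.
Basic WAAS optimization consists of TCP flow optimization (TFO), data redundancy elimination (DRE), and persistent Lempel-Ziv (LZ) compression.
The number of TCP connections, their status, and their disposition can indicate the health of the WAAS system in a specific location. A healthy system shows a large number of connections, with a significant percentage of them closed normally. The show statistics tfo detail command reports the volume, status, and disposition of connections between a particular WAAS device and other devices in the network.
You can view global TFO statistics by using the show statistics tfo detail command as follows: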
WAE# show statistics tfo detail
Total number of connections : 2852
No. of active connections : 3 <-----Active connections
No. of pending (to be accepted) connections : 0
No. of bypass connections : 711
No. of normal closed conns : 2702
No. of reset connections : 147
Socket write failure : 0
Socket read failure : 0
WAN socket close while waiting to write : 0
AO socket close while waiting to write : 2
WAN socket error close while waiting to read : 0
AO socket error close while waiting to read : 64
DRE decode failure : 0
DRE encode failure : 0
Connection init failure : 0
WAN socket unexpected close while waiting to read : 32
Exceeded maximum number of supported connections : 0
Buffer allocation or manipulation failed : 0
Peer received reset from end host : 49
DRE connection state out of sync : 0
Memory allocation failed for buffer heads : 0
Unoptimized packet received on optimized side : 0
Data buffer usages:
Used size: 0 B, B-size: 0 B, B-num: 0
Cloned size: 0 B, B-size: 0 B, B-num: 0
Buffer Control:
Encode size: 0 B, slow: 0, stop: 0
Decode size: 0 B, slow: 0, stop: 0
Scheduler:
Queue Size: IO: 0, Semi-IO: 0, Non-IO: 0
Total Jobs: IO: 1151608, Semi-IO: 5511278, Non-IO: 3690931
Policy Engine Statistics
-------------------------
Session timeouts: 0, Total timeouts: 0
Last keepalive received 00.5 Secs ago
Last registration occurred 15:00:17:46.0 Days:Hours:Mins:Secs ago
Hits: 7766, Update Released: 1088
Active Connections: 3, Completed Connections: 7183
Drops: 0
Rejected Connection Counts Due To: (Total: 0)
Not Registered : 0, Keepalive Timeout : 0
No License : 0, Load Level : 0
Connection Limit : 0, Rate Limit : 0 <-----Connection limit overload
Minimum TFO : 0, Resource Manager : 0
Global Config : 0, TFO Overload : 0
Server-Side : 0, DM Deny : 0
No DM Accept : 0
. . .
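The No. of active connections field reports the number of connections that are currently being optimized.
In the Policy Engine Statistics section of the output, the Rejected Connection Counts section shows the various reasons why connections were rejected. The Connection Limit counter reports the number of times that a connection was rejected because the maximum number of optimized connections was exceeded. If you see a high number here, look for overload conditions. For more information, see the article "Troubleshooting Overload Conditions".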
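The health indicators described above can be pulled out of saved command output with a short script. The following is a minimal Python sketch, not a Cisco tool: the helper names counter and tfo_summary are hypothetical, and the script assumes that each counter appears as "name : integer", as it does in the sample output.

```python
import re

# Trimmed copy of the sample output above, with the annotations removed.
SAMPLE = """\
Total number of connections : 2852
No. of active connections : 3
No. of bypass connections : 711
No. of normal closed conns : 2702
No. of reset connections : 147
Connection Limit : 0, Rate Limit : 0
"""


def counter(text, name):
    """Return the integer that follows 'name :' in the command output."""
    match = re.search(re.escape(name) + r"\s*:\s*(\d+)", text)
    if match is None:
        raise KeyError(name)
    return int(match.group(1))


def tfo_summary(text):
    """Summarize connection health from 'show statistics tfo detail' text."""
    total = counter(text, "Total number of connections")
    closed = counter(text, "No. of normal closed conns")
    resets = counter(text, "No. of reset connections")
    return {
        "normal_close_pct": 100.0 * closed / total if total else 0.0,
        "reset_pct": 100.0 * resets / total if total else 0.0,
        "connection_limit_rejects": counter(text, "Connection Limit"),
    }


summary = tfo_summary(SAMPLE)
print(summary)
```

For the sample, most connections closed normally and no connection was rejected because of the connection limit. What counts as a "high" reject count is left to the operator; the script only reports the numbers.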
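Additionally, TFO optimization for connections that are pushed down from other AOs because they cannot optimize the traffic is handled by the generic AO, which is covered in the article "Troubleshooting the Generic AO".
You can view TFO connection statistics by using the show statistics connection command. For details on using this command, see the section "Checking the Optimized TCP Connections" in the Troubleshooting Overload Conditions article.
When application acceleration is expected but not observed, verify that the appropriate optimizations are being applied to the traffic flow and that the DRE cache is appropriately reducing the size of the optimized traffic.
Policy engine maps for DRE and LZ optimization include the following:
Various conditions can cause DRE and/or LZ not to be applied to a connection, even though it is configured:
NOTE: In all of the above conditions, the show statistics connection command reports acceleration of "TDL" for connections where this was the negotiated policy. The amount of DRE or LZ bypass traffic tells you whether the DRE or LZ optimizations were actually applied. Use the show statistics connection conn-id command, as described later, and look at the DRE encode numbers to see whether the DRE or LZ ratio is near 0% and most of the traffic is bypassed. The first three conditions are reported by the "Encode bypass due to" field. The last three conditions result from the traffic data pattern and are accounted for in the reported DRE and LZ ratios.
To determine which basic optimizations were configured, negotiated with the peer, and applied for a specific connection, view its statistics by using the show statistics connection conn-id command. First, determine the connection ID for the connection by using the show statistics connection command as follows: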
WAE#show stat conn
Current Active Optimized Flows: 1
Current Active Optimized TCP Plus Flows: 0
Current Active Optimized TCP Only Flows: 1
Current Active Optimized TCP Preposition Flows: 0
Current Active Auto-Discovery Flows: 0
Current Reserved Flows: 10
Current Active Pass-Through Flows: 0
Historical Flows: 375
D:DRE,L:LZ,T:TCP Optimization RR:Total Reduction Ratio
A:AOIM,C:CIFS,E:EPM,G:GENERIC,H:HTTP,M:MAPI,N:NFS,S:SSL,V:VIDEO
ConnID Source IP:Port Dest IP:Port PeerID Accel RR
343 10.10.10.10:3300 10.10.100.100:80 00:14:5e:84:24:5f T 00.0% <------
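When many flows are listed, the connection table can be extracted from captured output with a script. The following is a minimal Python sketch under stated assumptions: the function name optimized_flows is hypothetical, only IPv4 endpoints are handled, and the row layout is assumed to match the sample above.

```python
import re

# The table portion of the sample output above, without the annotation arrow.
SAMPLE = """\
ConnID Source IP:Port Dest IP:Port PeerID Accel RR
343 10.10.10.10:3300 10.10.100.100:80 00:14:5e:84:24:5f T 00.0%
"""

ROW = re.compile(
    r"(\d+)\s+"                                    # ConnID
    r"(\d+\.\d+\.\d+\.\d+:\d+)\s+"                 # source IP:port
    r"(\d+\.\d+\.\d+\.\d+:\d+)\s+"                 # destination IP:port
    r"((?:[0-9a-fA-F]{2}:){5}[0-9a-fA-F]{2})\s+"   # peer ID
    r"(\S+)\s+"                                    # Accel flags (T, D, L, ...)
    r"(\d+(?:\.\d+)?)%"                            # total reduction ratio
)


def optimized_flows(text):
    """Return one dict per connection row found in the output text."""
    flows = []
    for m in ROW.finditer(text):
        flows.append({
            "conn_id": int(m.group(1)),
            "source": m.group(2),
            "dest": m.group(3),
            "peer_id": m.group(4),
            "accel": m.group(5),
            "rr_pct": float(m.group(6)),
        })
    return flows


flows = optimized_flows(SAMPLE)
print(flows[0]["conn_id"], flows[0]["accel"], flows[0]["rr_pct"])
```

The conn_id value is what the show statistics connection conn-id command expects. An accel value of T with a 0% reduction ratio, as in the sample, means that only TCP optimization is in effect for that flow.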
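The connection ID for each connection is listed in the table at the end of the show statistics connection output. To view the statistics for a specific connection, use the show statistics connection conn-id command as follows: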
WAE# sh stat connection conn-id 343
Connection Id: 343
Peer Id: 00:14:5e:84:24:5f
Connection Type: EXTERNAL CLIENT
Start Time: Tue Jul 14 16:00:30 2009
Source IP Address: 10.10.10.10
Source Port Number: 3300
Destination IP Address: 10.10.100.100
Destination Port Number: 80
Application Name: Web <-----Application name
Classifier Name: HTTP <-----Classifier name
Map Name: basic
Directed Mode: FALSE
Preposition Flow: FALSE
Policy Details:
Configured: TCP_OPTIMIZE + DRE + LZ <-----Configured policy
Derived: TCP_OPTIMIZE + DRE + LZ
Peer: TCP_OPTIMIZE + DRE + LZ
Negotiated: TCP_OPTIMIZE + DRE + LZ <-----Policy negotiated with peer
Applied: TCP_OPTIMIZE + DRE + LZ <-----Applied policy
. . .
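The Application Name and Classifier Name fields tell you the application and classifier applied to this connection.
The optimization policies are listed in the Policy Details section. If the Configured and Applied policies do not match, you configured one policy for this type of connection but a different policy was applied. This can happen when the peer is down, misconfigured, or overloaded. Check the peer WAE and its configuration.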
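The comparison between the Configured and Applied policies can be automated for saved output. The following is a minimal Python sketch: policy_sets and missing_optimizations are hypothetical helper names, and the DEGRADED text is a made-up variation of the sample used only to show a mismatch.

```python
import re

# Policy Details portion of the sample output above.
SAMPLE = """\
Peer Id: 00:14:5e:84:24:5f
Policy Details:
Configured: TCP_OPTIMIZE + DRE + LZ <-----Configured policy
Derived: TCP_OPTIMIZE + DRE + LZ
Peer: TCP_OPTIMIZE + DRE + LZ
Negotiated: TCP_OPTIMIZE + DRE + LZ <-----Policy negotiated with peer
Applied: TCP_OPTIMIZE + DRE + LZ <-----Applied policy
"""

# Hypothetical mismatch: only TCP optimization ends up applied.
DEGRADED = SAMPLE.replace("Applied: TCP_OPTIMIZE + DRE + LZ",
                          "Applied: TCP_OPTIMIZE")

POLICY_LINE = re.compile(
    r"\s*(Configured|Derived|Peer|Negotiated|Applied):\s*(.+)")


def policy_sets(text):
    """Map each policy stage to the set of optimizations listed for it."""
    policies = {}
    for line in text.splitlines():
        match = POLICY_LINE.match(line)
        if match:
            value = match.group(2).split("<-")[0]  # drop annotation arrows
            policies[match.group(1)] = {
                part.strip() for part in value.split("+") if part.strip()
            }
    return policies


def missing_optimizations(text):
    """Return optimizations that were configured but not applied."""
    policies = policy_sets(text)
    return sorted(policies["Configured"] - policies["Applied"])


print(missing_optimizations(SAMPLE))
print(missing_optimizations(DEGRADED))
```

An empty result means the applied policy covers everything that was configured. Any names returned are the optimizations to investigate on the peer WAE.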
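The following section of output shows statistics related to DRE encoding and decoding, including the number of messages that had DRE applied, had LZ applied, or bypassed DRE and LZ: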
. . .
DRE: 353
Conn-ID: 353 10.10.10.10:3304 -- 10.10.100.100:139 Peer No: 0 Status: Active
------------------------------------------------------------------------------
Open at 07/14/2009 16:04:30, Still active
Encode:
Overall: msg: 178, in: 36520 B, out: 8142 B, ratio: 77.71% <-----Overall compression
DRE: msg: 1, in: 356 B, out: 379 B, ratio: 0.00% <-----DRE compression ratio
DRE Bypass: msg: 178, in: 36164 B <-----DRE bypass
LZ: msg: 178, in: 37869 B, out: 8142 B, ratio: 78.50% <-----LZ compression ratio
LZ Bypass: msg: 0, in: 0 B <-----LZ bypass
Avg latency: 0.335 ms Delayed msg: 0 <-----Avg latency
Encode th-put: 598 KB/s <-----In 4.3.3 and earlier only
Message size distribution:
0-1K=0% 1K-5K=0% 5K-15K=0% 15K-25K=0% 25K-40K=0% >40K=0% <-----In 4.3.3 and earlier only
Decode:
Overall: msg: 14448, in: 5511 KB, out: 420 MB, ratio: 98.72% <-----Overall compression
DRE: msg: 14372, in: 5344 KB, out: 419 MB, ratio: 98.76% <-----DRE compression ratio
DRE Bypass: msg: 14548, in: 882 KB <-----DRE bypass
LZ: msg: 14369, in: 4891 KB, out: 5691 KB, ratio: 14.07% <-----LZ compression ratio
LZ Bypass: msg: 79, in: 620 KB <-----LZ bypass
Avg latency: 4.291 ms <-----Avg latency
Decode th-put: 6946 KB/s <-----In 4.3.3 and earlier only
Message size distribution:
0-1K=4% 1K-5K=12% 5K-15K=18% 15K-25K=9% 25K-40K=13% >40K=40% <-----Output from here in 4.3.3 and earlier only
. . .
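The following statistics are highlighted in the above example for both encoding and decoding:
If you see a large amount of bypass traffic, the DRE compression ratio will be smaller than expected. This can be due to encrypted traffic, small messages, or otherwise incompressible data. Consider contacting TAC for further troubleshooting help.
If you see a large amount of LZ bypass traffic, this can be due to a large amount of encrypted traffic, which is generally not compressible.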
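As a worked example, the Encode figures in the sample output can be reproduced with simple arithmetic. The Python sketch below assumes that the reported ratio is (in - out) / in, floored at 0%, which is inferred from the sample numbers and not taken from a documented formula. The function names are hypothetical.

```python
def reduction_ratio(bytes_in, bytes_out):
    """Percentage saved, floored at 0 when the output is larger than the input."""
    if bytes_in <= 0:
        return 0.0
    return max(0.0, 100.0 * (bytes_in - bytes_out) / bytes_in)


def bypass_share(processed_in, bypass_in):
    """Percentage of input bytes that skipped the optimization."""
    total = processed_in + bypass_in
    return 100.0 * bypass_in / total if total else 0.0


# Encode figures from the sample output above (bytes).
overall = reduction_ratio(36520, 8142)   # Overall line
dre = reduction_ratio(356, 379)          # DRE line: output grew, so 0%
lz = reduction_ratio(37869, 8142)        # LZ line
dre_bypass = bypass_share(356, 36164)    # DRE in versus DRE Bypass in

print(f"overall {overall:.2f}%  dre {dre:.2f}%  lz {lz:.2f}%")
print(f"DRE bypass share {dre_bypass:.2f}%")
```

In the sample, about 99% of the encoded bytes bypassed DRE, so almost all of the overall reduction on the encode side comes from LZ.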
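The average latency numbers can be useful for debugging a throughput issue. Depending on the platform, the Encode and Decode average latencies are typically in the single digits of milliseconds. If users experience low throughput and one or both of these numbers are higher, there is an issue with encoding or decoding, generally on the side where the latency is higher.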
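The latency check can be scripted against saved connection statistics. The following is a minimal Python sketch: avg_latencies and slow_sides are hypothetical names, the 10 ms limit is one reading of "single digits", and the SLOW text is a made-up variation of the sample.

```python
import re

# Encode and Decode latency lines from the sample output above.
SAMPLE = """\
Encode:
Avg latency: 0.335 ms Delayed msg: 0
Decode:
Avg latency: 4.291 ms
"""

# Hypothetical output with a slow decode side, for illustration only.
SLOW = SAMPLE.replace("4.291 ms", "14.200 ms")


def avg_latencies(text):
    """Return {'encode': ms, 'decode': ms} from the statistics text."""
    result = {}
    section = None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped in ("Encode:", "Decode:"):
            section = stripped[:-1].lower()
        match = re.search(r"Avg latency:\s*([\d.]+)\s*ms", line)
        if match and section:
            result[section] = float(match.group(1))
    return result


def slow_sides(text, limit_ms=10.0):
    """List the sides whose average latency is at or above limit_ms."""
    return sorted(side for side, ms in avg_latencies(text).items()
                  if ms >= limit_ms)


print(avg_latencies(SAMPLE))
print(slow_sides(SLOW))
```

The limit_ms value is an assumption and should be adjusted for the platform in use.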
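It can also be useful to look at DRE statistics such as the age of the oldest usable data, the cache size, the percentage of cache used, and the hash table RAM used. To do so, use the show statistics dre detail command as follows: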
WAE# sh stat dre detail
Cache:
Status: Usable, Oldest Data (age): 10h <-----Cache age
Total usable disk size: 311295 MB, Used: 0.32% <-----Percent cache used
Hash table RAM size: 1204 MB, Used: 0.00% <-----Output from here is in 4.3.3 and earlier only
. . .
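If you are not seeing significant DRE compression, the DRE cache might not be populated with enough data yet. A short cache age combined with less than 100% of the cache in use indicates this situation. The compression ratio should improve as the cache fills with more data. If 100% of the cache is in use and the cache age is short, the WAE may be undersized and unable to handle the traffic volume.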
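The cache reasoning above can be expressed as a small decision helper. The Python sketch below is illustrative only: the function names are hypothetical, the age format is assumed to look like "10h" or "14d23h" as in the sample outputs, and the 24-hour default for a "short" cache age is an arbitrary placeholder because the text does not define one.

```python
import re


def parse_age_hours(age):
    """Convert an age string such as '10h' or '14d23h' to hours."""
    match = re.fullmatch(r"(?:(\d+)d)?(?:(\d+)h)?", age)
    if not match or not any(match.groups()):
        raise ValueError(f"unrecognized age: {age!r}")
    days, hours = (int(g) if g else 0 for g in match.groups())
    return days * 24 + hours


def cache_assessment(used_pct, age_hours, short_age_hours=24):
    """Classify the DRE cache state; short_age_hours is a hypothetical cutoff."""
    if age_hours >= short_age_hours:
        return "cache age is not short"
    if used_pct < 100.0:
        return "cache still filling"
    return "WAE may be undersized"


# Values from the sample output: Oldest Data (age) 10h, Used 0.32%.
print(cache_assessment(0.32, parse_age_hours("10h")))
```

With the sample values the helper reports a cache that is still filling, which matches the first case described above.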
If you are not seeing significant DRE compression, look at the Nack/R-tx counters in the following section of the command output:
Connection details:
Chunks: encoded 398832, decoded 269475, anchor(forced) 43917(9407) <-----In 4.3.3 and earlier only
Total number of processed messges: 28229 <-----In 4.3.3 and earlier only
num_used_block per msg: 0.053597 <-----In 4.3.3 and earlier only
Ack: msg 18088, size 92509 B <-----In 4.3.3 and earlier only
Encode bypass due to: <-----Encode bypass reasons
remote cache initialization: messages: 1, size: 120 B
last partial chunk: chunks: 482, size: 97011 B
skipped frame header: messages: 5692, size: 703 KB
Nacks: total 0 <-----Nacks
R-tx: total 0 <-----Retransmits
Encode LZ latency: 0.133 ms per msg
Decode LZ latency: 0.096 ms per msg
. . .
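The Nacks and R-tx counters should generally be low relative to the traffic volume, for example, about 1 per 100 MB of original (unoptimized) traffic. If you see significantly higher counts, there may be a DRE cache synchronization problem. Use the clear cache dre command to clear the DRE cache on all devices, or contact TAC.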
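The "about 1 per 100 MB" guideline can be turned into a quick calculation. The Python sketch below uses hypothetical names, treats a megabyte as 1024 x 1024 bytes, and uses an arbitrary factor of 10 to stand in for "significantly higher", which the text does not quantify.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def events_per_100mb(count, original_bytes):
    """Nack or R-tx events per 100 MB of original (unoptimized) traffic."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return count / (original_bytes / (100 * MB))


def looks_high(count, original_bytes, baseline=1.0, factor=10.0):
    """True when the rate exceeds baseline * factor (factor is hypothetical)."""
    return events_per_100mb(count, original_bytes) > baseline * factor


# Hypothetical figures: 5 Nacks over 500 MB of original traffic.
print(events_per_100mb(5, 500 * MB))
print(looks_high(200, 500 * MB))
```

The original traffic volume must be taken from elsewhere in the statistics, because the Nacks and R-tx lines report only totals.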
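The encode bypass reason counters report the number of bytes bypassed for various reasons. This can help you determine what is causing bypass traffic, other than an unoptimizable data pattern.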
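To see which bypass reason dominates, the "Encode bypass due to" lines can be parsed and ranked by size. The following is a minimal Python sketch: bypass_reasons is a hypothetical name, the line layout is assumed to match the sample output, and KB, MB, and GB are assumed to be powers of 1024.

```python
import re

# "Encode bypass due to" section from the sample output above.
SAMPLE = """\
Encode bypass due to:
remote cache initialization: messages: 1, size: 120 B
last partial chunk: chunks: 482, size: 97011 B
skipped frame header: messages: 5692, size: 703 KB
Nacks: total 0
"""

UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}  # assumption

REASON = re.compile(
    r"\s*(.+?):\s*(?:messages|chunks):\s*(\d+),\s*size:\s*(\d+)\s*(B|KB|MB|GB)\b"
)


def bypass_reasons(text):
    """Return (reason, count, bytes) tuples, largest byte total first."""
    rows = []
    for line in text.splitlines():
        match = REASON.match(line)
        if match:
            size = int(match.group(3)) * UNITS[match.group(4)]
            rows.append((match.group(1), int(match.group(2)), size))
    return sorted(rows, key=lambda row: row[2], reverse=True)


for reason, count, size in bypass_reasons(SAMPLE):
    print(f"{reason}: {count} items, {size} bytes")
```

In the sample, skipped frame headers account for most of the bypassed bytes, followed by partial chunks at the end of messages.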
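Sometimes it is helpful to identify the connected and active peer WAEs and to look at peer statistics. You can do this by using the show statistics peer dre command as follows: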
WAE# sh stat peer dre
Current number of connected peers: 1
Current number of active peers: 1
Current number of degrade peers: 0
Maximum number of connected peers: 1
Maximum number of active peers: 1
Maximum number of degraded peers: 0
Active peer details:
Peer-No : 0 Context: 65027
Peer-ID : 00:14:5e:95:4a:b5
Hostname: wae7.example.com <-----Peer hostname
------------------------------------------------------------------------------
Cache: Used disk: 544 MB, Age: 14d23h <-----Peer cache details in 4.3.3 and earlier only
Cache: Used disk: 544 MB <-----Peer cache details in 4.4.1 and later only
Peer version: 0.4 <-----
Ack-queue size: 38867 KB |
Buffer surge control: |<---In 4.3.3 and earlier only
Delay: avg-size 0 B, conn: 0, flush: 0 |
Agg-ft: avg-size 20902 B, conn: 388, flush: 0 |
remote low-buff: 0, received flush: 0 <-----
Connections: Total (cumulative): 3226861, Active: 597
Concurrent Connections (Last 2 min): max 593, avg 575
. . .
Additional output from this command shows encode and decode statistics similar to those shown for an individual connection.