"Datacenter troubleshooting guide" – a blog by Gilles Dufour.
Day 7 - Understanding me-stats (continued)
Let me start by wishing you all a Happy New Year!
In my previous post I talked about the undocumented but very useful ME-STATS.
Today I'm going to continue the discussion and look at stats for the ICM module.
After the RX module has buffered the packet and Fastpath has decided that it does not match an existing connection, the packet is sent to the Inbound Connection Manager (ICM). ICM performs the L3 rule selection and decides where the packet goes next: to the LB module for load balancing, to the TCP module if we need to spoof the connection, or to the OCM module if the packet should be NATed.
Since the number of stats for this command is huge, I removed some of them from the list below.
switch/Admin# show np 1 me-stats "-sicm -v"
ICM Statistics (Current)
Errors: 0 0
Frames Received: 695245 5
Drop [unknown msg]: 0 0
IPCP Received: 5 0
Embryonic Hit Received: 0 0
Close Receive: 1362835 4
Close Drop unknown msg: 0 0
Close Errors: 0 0
Close Connection timeout: 3241 1
Close IPCP send stat: 0 0
Close IPCP recv stat: 0 0
Encaps Miss Success stat: 0 0
Encaps Miss Error stat: 0 0
Close No interface on connection: 0 0
Close connection [Interface down]: 0 0
Reuse link update conn not on reuse erro 0 0
Reuse conn remove not on head error: 0 0
Drop [Next-Hop queue full]: 0 0
Close Error not in hash: 0 0
Invalid reap messages: 0 0
If lookup error: 5 0
UDP Chaser sent, conn miss: 0 0
UDP Chaser sent, partial conn: 0 0
(Context ALL Statistics)
Transmit -> fastpath: 13696 3
Transmit -> TCP: 94582 0
Transmit -> OCM: 1096 0
Send -> LB_L4: 580982 0
Send -> Other IXP: 202 0
Drop [redundant]: 0 0
Drop [ACL deny]: 4772 2
Drop [Connection RL]: 0 0
Drop [CP Connection RL]: 0 0
Drop [Proxy RL]: 0 0
Drop [SSL RL]: 0 0
Drop [Connection Rate RL]: 0 0
Drop [Inspect Rate RL]: 0 0
Drop [IF FT Standby]: 0 0
Drop [ICMP Hard Error]: 0 0
Drop [ICMP Redirect]: 0 0
Drop [ICMP Error IP Mismatch]: 0 0
Connection [Inserts]: 688907 2
Connection [Deletes]: 784574 4
Connection [Modifies]: 0 0
Proxy [Inserts]: 0 0
Proxy [Deletes]: 772272 0
IPCP Sent: 5 0
CP Init Received: 5925 1
Invalid conn miss TCP flags: 0 0
RPF check Error: 0 0
Route lookup Error: 0 0
MAC Lookup Error: 4 0
My mac check Error: 108 0
Bridged - My mac Error 0 0
BVI invalid/down Error 0 0
Classify Error: 0 0
Transmit Encap Miss Msg stat: 181 0
Drop [Encap Miss Msg stat]: 0 0
Close Connection with invalid proxy: 0 0
Pinhole deletes: 0 0
Tracker Unlinks : 0 0
Connection Reuse Add Errors: 0 0
Connections Removed From Reuse Pools: 0 0
Connections Added To Reuse Pools: 0 0
Replicate Connection encap lookup error: 0 0
Replicate Connection MAC lookup error: 0 0
Replicate connection sent: 0 0
Replicate connection msg to other ixp: 0 0
Replicate connection recv L4: 0 0
Replicate connection recv LB: 0 0
Replicate connection recv buddy: 0 0
Drop [Replicate conn buddy - no control 0 0
Close IPCP errors: 0 0
Close connection tracker not found error 0 0
As you can see, there is a counter for each Send/Transmit destination. TCP and LB_L4 are considered "slow path" since we do not yet know the final destination of the packet.
Just like in Fastpath, there is a "Drop [Next-Hop queue full]" counter, which indicates that another destination is too slow processing its input queue, preventing ICM from transmitting new packets. Those packets get dropped.
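If you collect this output regularly, it can help to turn it into something you can filter. Here is a minimal Python sketch that parses the two numeric columns of each counter line and lists the "Drop" counters that are non-zero. The function names are mine and purely illustrative, and the sketch keeps both columns without interpreting them.

```python
import re

# "<counter name> <value> <value>" at the end of a line. The colon after the
# name is optional because some long names are truncated in the output.
LINE_RE = re.compile(r"^(?P<name>.+?):?\s+(?P<first>\d+)\s+(?P<second>\d+)\s*$")


def parse_me_stats(text):
    """Return {counter name: (first value, second value)} for counter lines."""
    counters = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            name = match.group("name").strip()
            counters[name] = (int(match.group("first")), int(match.group("second")))
    return counters


def nonzero_drops(counters):
    """Names of 'Drop ...' counters whose first value is non-zero."""
    return sorted(
        name
        for name, (first, _second) in counters.items()
        if name.startswith("Drop") and first > 0
    )


# A few lines copied from the output above.
sample = """\
ICM Statistics (Current)
Errors: 0 0
Frames Received: 695245 5
Drop [Next-Hop queue full]: 0 0
Drop [redundant]: 0 0
Drop [ACL deny]: 4772 2
Reuse link update conn not on reuse erro 0 0
Tracker Unlinks : 0 0
"""

stats = parse_me_stats(sample)
print(nonzero_drops(stats))
```

Headings such as "ICM Statistics (Current)" do not end with two numbers, so they are skipped.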
The two encaps counters ("Encaps Miss Success stat" and "Encaps Miss Error stat") are related to an interesting behavior of the ACE platform.
Like its predecessor, the CSM (Content Switching Module), ACE uses "encap ids" internally to reference mac-addresses.
A connection entry therefore stores encap ids, each of which references a specific mac-address.
You can see the mapping between an encap id and a mac-address by doing a 'show arp'.
switch/Admin# show arp
IP ADDRESS MAC-ADDRESS Interface Type Encap NextArp(s) Status
10.86.213.206 00.07.4f.ce.d6.00 vlan10 LEARNED 30 8999 sec up
10.86.213.250 00.c0.9f.4f.fe.d1 vlan10 LEARNED 19 8995 sec up
18.104.22.168 00.0b.fc.fe.1b.64 vlan10 NAT LOCAL _ up
10.86.213.1 00.00.0c.07.ac.00 vlan10 GATEWAY 20 297 sec up
10.86.213.2 00.11.5d.e1.2f.fc vlan10 LEARNED 22 8995 sec up
10.86.213.16 00.0a.8a.7d.5f.38 vlan10 LEARNED 43 9631 sec up
10.86.213.38 00.09.b22.214.171.124 vlan10 LEARNED 11 8994 sec up
10.86.213.40 00.30.f2.75.f3.f1 vlan10 INTERFACE LOCAL _ up
10.86.213.53 00.0b.fc.fe.1b.64 vlan10 VSERVER LOCAL _ up
10.86.213.54 00.0b.fc.fe.1b.64 vlan10 NAT LOCAL _ up
10.10.20.1 00.0b.fc.fe.1b.64 vlan30 NAT LOCAL _ up
10.1.1.1 00.0b.fc.fe.1b.64 vlan30 NAT LOCAL _ up
192.168.30.10 00.1b.24.65.af.66 vlan30 LEARNED 39 9028 sec up
192.168.30.11 00.1b.24.4d.eb.a6 vlan30 LEARNED 37 9024 sec up
192.168.30.17 00.e0.81.22.78.ed vlan30 STATIC 9 _ up
In the output above, we can see that mac-address 00.07.4f.ce.d6.00 is mapped to encap id 30.
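To look up encap ids without reading the table by eye, a small Python sketch like the one below can build the mapping from the 'show arp' output. It assumes the column layout shown above, where the fifth field is either a numeric encap id or the word LOCAL. The function name is hypothetical.

```python
def encap_ids_by_mac(show_arp_output):
    """Map mac-address -> encap id for entries that have a numeric encap id."""
    mapping = {}
    for line in show_arp_output.splitlines():
        fields = line.split()
        # Fields: IP, MAC, interface, type, encap, ...
        # Entries owned by ACE itself show LOCAL instead of a number, and the
        # header line has no number in that position, so both are skipped.
        if len(fields) >= 5 and fields[4].isdigit():
            mapping[fields[1]] = int(fields[4])
    return mapping


# A few lines copied from the output above.
sample_arp = """\
IP ADDRESS MAC-ADDRESS Interface Type Encap NextArp(s) Status
10.86.213.206 00.07.4f.ce.d6.00 vlan10 LEARNED 30 8999 sec up
10.86.213.1 00.00.0c.07.ac.00 vlan10 GATEWAY 20 297 sec up
10.86.213.53 00.0b.fc.fe.1b.64 vlan10 VSERVER LOCAL _ up
192.168.30.17 00.e0.81.22.78.ed vlan30 STATIC 9 _ up
"""

encaps = encap_ids_by_mac(sample_arp)
print(encaps["00.07.4f.ce.d6.00"])
```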
When a packet comes to ICM from an unknown mac-address, an internal (IPCP) message is sent to the Control Plane, which triggers an ARP request to populate the ARP table and obtain an encap id for the new mac-address.
When this process succeeds, we increment "Encaps Miss Success stat". When it fails, the packet is dropped and "Encaps Miss Error stat" is incremented.
One reason for an encap miss error is reaching the mac-miss rate limit.
You can check your current rate and the limit with the following command.
switch/Admin# show resource usage | i mac
mac-miss rate 1 5 0 700 0
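If you want to watch this line from a script, here is a small Python sketch. Be careful: the output above has no header, so the column names I use (current, peak, min, max, denied) are my assumption about what the five numbers mean, and the 80% threshold is an arbitrary example value.

```python
from typing import NamedTuple


class MacMissUsage(NamedTuple):
    # ASSUMPTION: column meanings are not shown in the output above.
    current: int
    peak: int
    minimum: int
    maximum: int
    denied: int


def parse_mac_miss(line):
    """Parse a 'mac-miss rate <5 numbers>' line into a MacMissUsage."""
    fields = line.split()
    if len(fields) != 7 or fields[:2] != ["mac-miss", "rate"]:
        raise ValueError(f"unexpected line: {line!r}")
    return MacMissUsage(*(int(value) for value in fields[2:]))


def near_limit(usage, threshold=0.8):
    """True if anything was denied or the peak came close to the maximum."""
    return usage.denied > 0 or usage.peak >= threshold * usage.maximum


usage = parse_mac_miss("mac-miss rate 1 5 0 700 0")
print(usage, near_limit(usage))
```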
Now, let's examine the DROP counters.
We have a series of RL (Resource Limit) counters, one for each resource that we monitor. This includes "concurrent connections", "SSL connections", "proxy connections", "CP concurrent connections", ...
We also drop new connections when we are in standby mode ("Drop [IF FT Standby]") or when traffic matches a deny ACL ("Drop [ACL deny]").
A more interesting counter is the "Drop [redundant]" one.
It has nothing to do with fault tolerance and redundancy.
This counter actually means we received a new connection request for a connection that is already being processed.
This typically happens with very fast UDP traffic.
The first UDP packet requires ICM to create a new connection so that further UDP packets for that same flow can be fast switched.
If the next packet comes in before ICM is done processing the first packet, it is dropped and this counter is incremented.
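The timing is easier to see with a toy model. The Python sketch below is not how ACE is implemented. It only mimics the behavior described above, with a made-up fixed setup time per new flow and hypothetical counter names.

```python
def classify_packets(packets, setup_time):
    """Toy model of the redundant-drop race.

    packets: iterable of (arrival_time, flow_id), sorted by arrival time.
    setup_time: made-up time needed to create a new connection entry.
    """
    ready_at = {}  # flow_id -> time at which its connection entry exists
    counters = {"icm_setup": 0, "drop_redundant": 0, "fast_switched": 0}
    for arrival, flow in packets:
        if flow not in ready_at:
            # First packet of the flow: connection setup starts now.
            ready_at[flow] = arrival + setup_time
            counters["icm_setup"] += 1
        elif arrival < ready_at[flow]:
            # Same flow, but the first packet is still being processed.
            counters["drop_redundant"] += 1
        else:
            # The connection entry exists, so the packet is fast switched.
            counters["fast_switched"] += 1
    return counters


# Flow A sends a second packet at t=1, before its setup completes at t=3.
packets = [(0, "A"), (1, "A"), (2, "B"), (5, "A"), (10, "B")]
result = classify_packets(packets, setup_time=3)
print(result)
```

In this example only the packet at t=1 counts as a redundant drop. The later packets of both flows arrive after setup has finished.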
For very fast UDP traffic, if you have problems with redundant drops, you should consider enabling UDP Booster. I'll cover this feature in a future post.
The next module is TCP, which I'll cover in my next post.
I hope you'll find this information useful.