Nexus 7K latency issue - please, any help would be appreciated

bob brennan
Level 1

We have been troubleshooting slow performance on our N7K for almost 2 months with Cisco TAC and NetApp. The case has been elevated to level 1, and we are no closer to an answer on why certain applications, like NetApp SnapMirror replication, run much slower when the source is on an M1-series card and the destination is on an F1-series card (or vice versa). For some reason we are seeing latency in the 150 ms range. We have checked QoS, CoPP, counters, drops, and errors and found nothing. We have replaced the F1- and M1-series cards and still have no resolution. If we connect both source and destination to an M-series card, we don't have this issue. Below is the last set of tests and the responses we received from the NetApp and Cisco engineers. Cisco has created a lab to try to duplicate the issue, but so far it has not worked.

Has anyone run across a similar issue with the Nexus 7K?

NetApp

Hello all,

Here are some initial numbers. I have average values as well as a specific example.

I will continue to analyze for more details and behavior tomorrow, but thought this would be a good starting point.

High-level numbers

Source-Side 5k SPAN avg. RTT: 145 ms [1]

Destination-Side 5k SPAN avg. RTT: 0.1 ms [2]

NetappDR4 avg. RTT: 0.0 ms [3]

[1] Theoretically, this number should be the same as or smaller than the Netapp5i capture average. Therefore, this number can be viewed as the true average Round-Trip Time (RTT) seen by the source for this replication stream.

[2] This number is an RTT value but can be considered the full Service Response Time (SRT) of the destination NetApp storage controller, NetappDR4. This is, on average, how long NetappDR4 took to acknowledge packets it received.

I use TCPTrace to calculate the average RTTs, but it only resolves down to hundreds of microseconds (100 us). I don't know whether it rounds up or not, so erring on the side of caution, this is somewhere between 50 us and 199 us RTT from NetappDR4's NIC to the application and back down again.

[3] This is NetappDR4's SRT minus NIC and NIC driver processing time, which is somewhere between 1 us and 99 us.
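
To make the resolution caveat above concrete, here is a small illustrative calculation in Python. It only restates the reasoning in note [2]: the tool reports in 100 us steps and its rounding behavior is unknown, so the reported 0.1 ms only pins the true value to a 50-199 us window.

# Illustrative only: bounds on the destination-side SRT given TCPTrace's
# 100 us reporting granularity and unknown rounding behavior.
reported_us = 100      # TCPTrace reports 0.1 ms for the destination-side SPAN
resolution_us = 100    # TCPTrace only resolves RTT in 100 us steps

# Case 1: rounds to the nearest 100 us -> true value in [50, 149] us
round_low = reported_us - resolution_us // 2
round_high = reported_us + resolution_us // 2 - 1
# Case 2: truncates -> true value in [100, 199] us
trunc_low = reported_us
trunc_high = reported_us + resolution_us - 1

# Rounding behavior unknown, so take the union of both cases.
print(f"true SRT is somewhere in [{min(round_low, trunc_low)}, {max(round_high, trunc_high)}] us")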

Specific example

Sequence number: 3085847658 (next sequence number: 3085849106)

RTT on Netapp5i: 144ms

RTT on Source-Side 5k SPAN: 143ms

RTT (really this is SRT) on Destination-Side 5k SPAN: 100us (100 microseconds on the nose)

RTT (SRT minus NIC/NIC Driver processing time) on NetappDR4: 85us

The difference between the Destination-Side SPAN and NetappDR4’s capture is 15us for this packet. This represents the time it took NetappDR4’s NIC and NIC Driver to process the packet both ways.
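
For clarity, here is how the deltas in this example work out, as a quick arithmetic sketch in Python using only the numbers quoted above:

# Deltas for sequence number 3085847658 (all values in microseconds).
rtt_netapp5i_us  = 144_000   # RTT seen on the source controller Netapp5i
rtt_src_span_us  = 143_000   # RTT seen on the source-side 5k SPAN
srt_dst_span_us  = 100       # "RTT" (really SRT) on the destination-side 5k SPAN
srt_netappdr4_us = 85        # SRT minus NIC/driver time on NetappDR4

# Time spent in NetappDR4's NIC and NIC driver (both directions):
nic_time_us = srt_dst_span_us - srt_netappdr4_us      # 15 us

# Time roughly attributable to the source host/NIC side:
src_side_us = rtt_netapp5i_us - rtt_src_span_us       # 1,000 us

# Time unaccounted for between the two SPAN points, i.e. the network path
# that includes the Nexus 7k:
network_path_us = rtt_src_span_us - srt_dst_span_us   # 142,900 us

print(f"NIC + driver processing on NetappDR4: {nic_time_us} us")
print(f"path between the two SPANs (incl. the 7k): {network_path_us / 1000:.1f} ms")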

This initial data seems to support the conclusion that the high round-trip times / latency are not due to NetappDR4.

Cisco

Hi Bob,

We are still evaluating possible reasons why the packets would be buffered on the 7K switch.

We are also putting a lab together to recreate the issue and test possible workarounds that would make a significant impact without any loss of functionality. This will take us a few days.

The working theory at this point is that the M1 line card is buffering the packets on ingress. To explain this further, we will touch on a few important aspects of the Nexus platform.

In a 7K system, a unicast packet is considered a credited packet, while a multicast packet is uncredited.

On the Nexus 7K, a unicast (credited) packet is moved from ingress to egress through the switching fabric only if the egress port has sufficient buffering available.

The Arbiter ASIC is responsible for tracking available buffers. Based on the free egress buffers, the Arbiter accepts or denies the request.

A multicast packet is sent straight through without any arbitration.

All of this is done at extremely high switching rates, with dedicated ASICs built for these functions.
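
A rough way to picture the arbitration just described is the toy Python sketch below. It is illustrative only: the class names, the buffer size, and the method request_to_send are invented for the example and are not NX-OS or ASIC internals.

# Toy model of the credited (unicast) vs. uncredited (multicast) decision.
class EgressPort:
    def __init__(self, free_cells):
        self.free_cells = free_cells      # available egress buffer space

class Arbiter:
    """Grants fabric access for unicast only if the egress port has room."""
    def request_to_send(self, packet_cells, egress, is_unicast):
        if not is_unicast:
            return True                   # multicast: sent straight through, no arbitration
        if egress.free_cells >= packet_cells:
            egress.free_cells -= packet_cells
            return True                   # credit granted: packet crosses the fabric
        return False                      # denied: ingress must keep buffering the packet

arbiter = Arbiter()
shallow_egress = EgressPort(free_cells=2)     # shallow egress buffer (made-up size)

for i in range(4):                            # back-to-back unicast packets, 1 cell each
    granted = arbiter.request_to_send(1, shallow_egress, is_unicast=True)
    print(f"packet {i}: {'forwarded' if granted else 'held on ingress'}")

With only two free cells on the egress, the third and fourth requests are denied, so the ingress has to hold those packets, which is the M1-to-F1 situation described next.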

The M1 module has deeper ingress and egress buffers. Because of this, in an M1-to-M1 flow the arbitration logic keeps the stream moving smoothly from ingress to egress, since buffer space is readily available on the egress.

The F1 line card is a low-latency card and has shallow egress buffers. In an M1-to-F1 flow, the ingress module (M1) will have to hold packets longer while the egress buffers are freed up. The slow dequeuing of packets on the egress is potentially causing the delay.
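
Under that theory, the extra delay is essentially queueing delay: the data sitting in buffers ahead of a packet divided by the rate at which the egress actually drains. The numbers in this back-of-the-envelope Python sketch are made up purely to show the shape of the relationship; they are not measured values from this case.

# Rough illustration: queueing delay grows linearly with the amount of data
# held upstream of a slow-draining egress port. Values are invented.
def queueing_delay_ms(queued_bytes, drain_rate_bps):
    return (queued_bytes * 8) / drain_rate_bps * 1000

for queued_mb in (1, 16, 64):
    delay = queueing_delay_ms(queued_mb * 1_000_000, 10_000_000_000)  # 10 Gb/s drain
    print(f"{queued_mb} MB queued ahead -> about {delay:.1f} ms of added latency")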

The lab test will help us verify this further and validate our theory. We will let you know our findings.
