Bandwidth Utilization using a sniffer in 6513

Jul 17th, 2008

We currently have about 8 VLANs on our 6513 server farm switch. I'm seeing a lot of retransmissions from server to server and I wanted to see if our network is being overutilized. If I SPAN a single VLAN, I don't see much effect on the bandwidth utilization. But if I SPAN all the VLANs on the 6513, I see the bandwidth go up to 100%. I'm not sure if this is the correct way to measure it accurately. Please advise.

The number of VLANs to SPAN depends on whether your servers are in the same VLAN and whether they are all connected directly to the 6513.

If the servers are not all directly connected to the 6513 but are in the same VLAN, then you only need to SPAN that particular VLAN.

If the servers are in one VLAN and are connected to the 6513 through various switches, and you have a different infrastructure VLAN, then you may want to SPAN those two VLANs to be sure you don't miss the retransmissions you are trying to see.

If you still don't see the traffic you are looking for, then you may need to SPAN all VLANs, but if you're seeing 100% utilization you may have a bigger problem...

sundar.palaniappan Thu, 07/17/2008 - 16:05

You'd first need to identify the VLAN where most of the traffic is coming from. On your layer 3 switch, check the VLAN interfaces and look at txload/rxload or the 5 minute input/output rate to find the busiest interface (VLAN). Then SPAN that VLAN to see which host(s) are sending a large volume of traffic and whether all of it is legit.
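As a rough illustration of that first step, the sketch below ranks VLAN interfaces by their 5 minute input/output rates. It assumes you have saved the interface output to text; the sample text, interface names, and rates are made up for illustration, and the function name `rank_vlans` is hypothetical.

```python
import re

# Illustrative sample only: made-up numbers in the general shape of
# "show interfaces" output for VLAN interfaces.
SAMPLE = """\
Vlan10 is up, line protocol is up
  reliability 255/255, txload 12/255, rxload 40/255
  5 minute input rate 156000000 bits/sec, 21000 packets/sec
  5 minute output rate 48000000 bits/sec, 9000 packets/sec
Vlan20 is up, line protocol is up
  reliability 255/255, txload 2/255, rxload 3/255
  5 minute input rate 9000000 bits/sec, 1500 packets/sec
  5 minute output rate 7000000 bits/sec, 1200 packets/sec
"""

def rank_vlans(text):
    """Return [(interface, input_bps, output_bps)], busiest interface first."""
    rates = {}
    current = None
    for line in text.splitlines():
        # A line such as "Vlan10 is up, ..." starts a new interface section.
        header = re.match(r"^(Vlan\d+) is ", line)
        if header:
            current = header.group(1)
            rates[current] = {"input": 0, "output": 0}
            continue
        rate = re.search(r"5 minute (input|output) rate (\d+) bits/sec", line)
        if rate and current:
            rates[current][rate.group(1)] = int(rate.group(2))
    ranked = [(name, r["input"], r["output"]) for name, r in rates.items()]
    # Sort by combined input + output rate, highest first.
    return sorted(ranked, key=lambda row: row[1] + row[2], reverse=True)

if __name__ == "__main__":
    for name, rx, tx in rank_vlans(SAMPLE):
        print(f"{name}: in {rx / 1e6:.1f} Mb/s, out {tx / 1e6:.1f} Mb/s")
```

The VLAN at the top of the list would be the first candidate to SPAN on its own.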



bauti1428 Thu, 07/17/2008 - 19:38

We are a big Citrix and EMC shop. We are separated from the Citrix admins and the EMC admins. 50% to 60% of the traffic that I see is SMB, which goes to the fileserver we use for profiles and users' home directories. All the servers communicating with the fileserver have too many retransmissions. I pointed this out to them, but again they said there is no problem and this is normal since all their profiles and user directories are located on the fileserver. I believe that our bottleneck is with EMC. Imagine all that traffic going over 1 fiber to EMC? I will try to attach some images tomorrow when I get to work. We have been playing this blame game with the Citrix, Microsoft, and EMC admins. They always say it's the network.

bauti1428 Thu, 07/17/2008 - 16:17

All the VLANs that I spanned are server VLANs and they are all on the 6513. I first spanned each VLAN individually, and each one was OK, not hitting 50% utilization. But when I tried spanning all the VLANs on the 6513, I saw spikes of 20, 80 and 100%. Most of the traffic was SMB, with a lot of retransmissions.
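One possible reading of these numbers is simple arithmetic on the capture port. The sketch below assumes the sniffer reports utilization relative to the speed of the single port it is attached to, and that every mirrored VLAN is delivered to that one port. The per-VLAN rates and the 1000 Mb/s port speed are hypothetical, not taken from this network.

```python
def span_port_utilization(vlan_rates_mbps, monitor_port_mbps=1000):
    """Return (offered_mbps, utilization, excess_mbps) for one monitor port.

    vlan_rates_mbps: mapping of VLAN name -> mirrored traffic rate in Mb/s.
    monitor_port_mbps: assumed line rate of the port the sniffer is on.
    """
    offered = sum(vlan_rates_mbps.values())
    # The port cannot carry more than its line rate, so cap at 100%.
    utilization = min(offered / monitor_port_mbps, 1.0)
    # Anything above line rate cannot reach the sniffer.
    excess = max(offered - monitor_port_mbps, 0)
    return offered, utilization, excess

# Hypothetical rates for 8 server VLANs, each well under 50% on its own.
EXAMPLE_RATES = {
    "vlan10": 300, "vlan20": 250, "vlan30": 200, "vlan40": 150,
    "vlan50": 100, "vlan60": 80, "vlan70": 60, "vlan80": 40,
}

if __name__ == "__main__":
    offered, util, excess = span_port_utilization(EXAMPLE_RATES)
    print(f"offered {offered} Mb/s, utilization {util:.0%}, excess {excess} Mb/s")
```

Under those assumptions, 100% on the sniffer would mean the capture port is full, which is not necessarily the same as any server link or uplink being full.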

bauti1428 Fri, 07/18/2008 - 05:37

I'm seeing 100% network utilization when I SPAN all the VLANs on the 6513. The VLANs are all server VLANs and are only enabled on the 6513.

sundar.palaniappan Fri, 07/18/2008 - 06:01

Do you have a baseline utilization that you can compare with what you are seeing now?

I am not able to make out much from the utilization chart that you posted. However, if you think the traffic level you see is way out of proportion, I recommend checking the traffic of the top talkers on the network to see if all the traffic they are sending out is legit.
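A minimal sketch of a top-talker summary, assuming you can export per-conversation byte counts from the sniffer as (source, destination, bytes) records. The host names and byte counts are hypothetical, as is the function name `top_talkers`.

```python
from collections import Counter

def top_talkers(flows, n=3):
    """Return [(host, bytes_sent, share_of_total)] for the n busiest senders.

    flows: iterable of (source, destination, byte_count) tuples.
    """
    totals = Counter()
    for src, _dst, nbytes in flows:
        totals[src] += nbytes
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    return [(host, sent, sent / grand_total)
            for host, sent in totals.most_common(n)]

# Hypothetical export from a capture.
EXAMPLE_FLOWS = [
    ("citrix01", "fileserver", 500),
    ("citrix02", "fileserver", 300),
    ("citrix01", "fileserver", 100),
    ("sql01", "backup", 100),
]

if __name__ == "__main__":
    for host, sent, share in top_talkers(EXAMPLE_FLOWS):
        print(f"{host}: {sent} bytes ({share:.0%} of total)")
```

Comparing this list against a baseline taken on a normal day makes it easier to say whether a host's share is unusual.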

Another thing you can try: if any new application was added to the network, disable it and check the utilization. I have seen applications that, when not fine-tuned, cause serious problems on the network.



bauti1428 Fri, 07/18/2008 - 07:09

It is all legit traffic: microsoft-ds, Citrix, SMB. Microsoft-ds is the number 1 top talker on our network and makes up 55 to 65% of the traffic. What I need to accomplish is to be able to tell them that the bottleneck is at EMC.

bauti1428 Fri, 07/18/2008 - 07:38

I'm not sure how I'm going to approach them. The last time I approached them with all the graphs and findings I had, they started emailing me about spanning-tree, duplex mismatches, and errors taken from the Cisco website. I thought it was funny.

sundar.palaniappan Fri, 07/18/2008 - 07:43

I hear you. Networks inherently have some minor things going on, and some folks will try to present those as the trigger or reason for something much more serious that's happening. It's funny, I just got done working on something similar yesterday. Anyway, all I can say is good luck with that.

jkeeffe Fri, 07/18/2008 - 10:39

Another possibility is that the servers are all connected to a line card where the ports share the same bus or ASIC. Until we moved to a 6748 line card we were experiencing lots of retransmissions on some of the ports. I think the 65xx line cards share 8 ports to one ASIC, and one server can choke out the other servers.
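To check whether that could apply, a quick sketch like the one below lists servers whose ports would land in the same group. It assumes groups of 8 contiguous ports (1-8, 9-16, and so on) on a single line card. The real port-to-ASIC mapping depends on the card model, so verify it against the card's documentation. The server names and port numbers are hypothetical.

```python
from collections import defaultdict

def shared_groups(server_ports, ports_per_group=8):
    """Return {group_index: [servers]} for groups holding more than one server.

    server_ports: mapping of server name -> port number on one line card.
    ports_per_group: assumed number of contiguous ports sharing one ASIC.
    """
    groups = defaultdict(list)
    for server, port in server_ports.items():
        # Ports 1-8 map to group 0, ports 9-16 to group 1, and so on.
        groups[(port - 1) // ports_per_group].append(server)
    return {g: sorted(names) for g, names in groups.items() if len(names) > 1}

# Hypothetical cabling on one line card.
EXAMPLE_PORTS = {
    "fileserver": 1, "citrix01": 3, "citrix02": 8, "backup": 9, "sql01": 25,
}

if __name__ == "__main__":
    for group, names in shared_groups(EXAMPLE_PORTS).items():
        first = group * 8 + 1
        print(f"ports {first}-{first + 7}: {', '.join(names)}")
```

If the busiest servers show up in one group, spreading them across groups is a cheap test before buying new hardware.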

bauti1428 Fri, 07/18/2008 - 12:25

Do you have 6748 line cards in place already? I believe this card is only compatible with Sup720s. I hope there is somebody out there who can tell us that getting a new Sup720 with those line cards fixed their problem. :-) I was also thinking about those issues with port ASICs and their shared buffers among the 8 ports.


