Performance issue on LAN (using GLBP and RSTP)

Unanswered Question
Nov 2nd, 2009

Hi all,


I am facing performance problems on our LAN.

We have 6 VLANs, one of which is the server VLAN; the others are user VLANs.

The problem is that when I ping from a user VLAN to the server VLAN, I see high latency (around 20 to 30 ms).

We are using two Catalyst 4506 switches at the core, connected to each other by a 2 Gbps EtherChannel, with dozens of L2 switches connected to both core switches.

We are also using GLBP for load balancing.

I think this is a simple configuration, but I am stuck, as I have found no way to get rid of the high latency.

It is also worth mentioning that sometimes the ping response is normal, but most of the time there is trouble.


Can somebody please help me?

Thanks

Joseph W. Doherty Mon, 11/02/2009 - 04:29
User Badges:
  • Super Bronze, 10000 points or more

Have you verified that your high ping times are not caused by the pinged device being slow to respond? (For instance, Cisco devices treat ping responses with low priority. On them you need to use something like an SLA responder to increase accuracy.)


Have you also verified that there is no intermittent congestion on any of your switch downlinks and/or on the 4500 ports? (The former can be more of an issue if you have servers on a downlinked switch with a highly oversubscribed uplink. As for the latter, the original 4500 chassis only provides 6 Gbps per slot; how this is allocated to card ports depends on the card. Some ports can be highly oversubscribed.)


[edit]

PS:

What's the latency between devices on the same VLAN whose traffic physically transits the same devices as the inter-VLAN traffic does?


PPS:

More information on the environment is often helpful. E.g. what sup and line cards are you using in the 4506? What are your L2 switches? Where are the various devices and hosts connected (especially servers and network devices)? Is routing running on both core 4506s? Any chance of unicast flooding? What hashing method is being used on the 4506 EtherChannel? Is it L2 or L3 or both? What links should be blocked by STP?

lgijssel Mon, 11/02/2009 - 04:39
User Badges:
  • Red, 2250 points or more

Have you verified your port settings?

Ports must be set to auto speed / auto duplex.

Server load balancing may also introduce this kind of problem.

If you use it, please revert to fail-over mode and test again.


regards,

Leo

imranfbhatti Mon, 11/02/2009 - 06:16


The same issue occurs within the same VLAN (though sometimes it does not); I am unable to find any pattern.


Supervisor V with 2 WS-X4448-GB-RJ45 line cards.

The Layer 2 switches include 2950, 2960 and 3548 switches.

We normally use Dell machines, with Dell servers and one IBM BladeCenter.


Routing is performed on both 4506 switches (as we are using GLBP).


One 4506 is the root bridge for 3 VLANs and the other 4506 is the root bridge for the remaining VLANs.


No hashing is configured for the EtherChannel; it is a Layer 2 EtherChannel only.


We have a core and access layer model, so on each access layer switch one of the 2 trunk ports is in the blocking state.


Giuseppe Larosa Mon, 11/02/2009 - 07:07
User Badges:
  • Super Silver, 17500 points or more
  • Hall of Fame,

    Founding Member

Hello Imran,

Check with

sh platform health


how much traffic is going on.


Be aware that these line cards have limited performance.

For example, the WS-X4548 has 6 ASICs, with each chip serving a group of 8 GE ports.


So you have an 8:1 ratio.


The WS-X4448, being older, cannot be better.
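
As a rough sanity check of the 8:1 figure, here is a minimal Python sketch. It assumes that each 8-port group of 1 Gbps ports shares about 1 Gbps toward the switching fabric (6 Gbps per slot split evenly over 6 ASICs). The function name and its parameters are made up for illustration and are not taken from any Cisco tool.

```python
def oversubscription_ratio(ports_per_group, port_gbps, group_uplink_gbps):
    """Return how many times the front-panel capacity exceeds the group's uplink."""
    return (ports_per_group * port_gbps) / group_uplink_gbps


# Assumed figures: 6 Gbps per slot shared evenly by 6 ASICs gives 1 Gbps per group.
slot_gbps = 6
asics_per_card = 6
group_uplink_gbps = slot_gbps / asics_per_card

ratio = oversubscription_ratio(ports_per_group=8, port_gbps=1,
                               group_uplink_gbps=group_uplink_gbps)
print(f"{ratio:.0f}:1")
```

If several busy hosts or uplinks sit in the same 8-port group, they compete for that shared capacity, which is one way transient output drops can appear even when average link utilisation looks low.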


Hope to help

Giuseppe


vdadlaney Mon, 11/02/2009 - 08:26

Hi, could you give some more details about the environment?


1. How many uplinks do you have from each access switch to the core?


2. Are you seeing any drops on the uplinks?


3. Do you have VLANs per access switch, or do you have them spanned across the switches?


4. What kind of physical topology is this (U-shaped, V-shaped, etc.)?


5. From a previous post I understand that the problem also occurs within the same VLAN. Are the machines that you are having problems with connected to the same switch or to different switches?


Thx

imranfbhatti Mon, 11/02/2009 - 10:52

Hi Vdadlaney

Thanks for your reply.


Let me answer them one by one.


1. We use 2 uplinks on each access layer switch, one connected to the first core switch and the second connected to the second core switch.


2. Sometimes I notice transmit discards (output drops) on the uplinks.

3. VLANs are spanned across the switches.


4. I do not understand what a U or V topology is.

We have two core switches, and each access layer switch is connected to them (one uplink to each core switch, as in point 1).


5. Actually, the problem is not only within the same VLAN but also inter-VLAN.

Within the same VLAN, the problem occurs when the hosts are connected to different switches.

Thanks


Joseph W. Doherty Mon, 11/02/2009 - 11:07
User Badges:
  • Super Bronze, 10000 points or more

"Some time i noticed that there is transmit discards ( Output drops on uplinks)


in case of same vlan problem occurs when connected to different switches."


From the information you have provided so far, you might have transient congestion on the uplinks, due to oversubscription of the link bandwidth and/or oversubscription of your 4448 ports. Further, the fact that you have spanned VLANs across switches and have two L3 switches running routing with GLBP again makes me wonder whether you have some occasional unicast flooding (see case #1 in http://www.cisco.com/en/US/products/hw/switches/ps700/products_tech_note09186a00801d0808.shtml).
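
To illustrate how asymmetric routing can lead to unicast flooding, here is a toy Python model of one core switch's MAC table for a single VLAN. It is only a sketch under stated assumptions: the class, the port names and the host name "B" are hypothetical, and the 300-second aging time is the commonly cited default rather than a value read from this network. The idea is that if host B's return traffic leaves through the other core switch, this switch never relearns B's MAC address, the entry ages out, and later frames to B are flooded.

```python
MAC_AGING_SECONDS = 300  # commonly cited default aging time; an assumption here


class VlanMacTable:
    """Toy per-VLAN MAC table of one switch (illustrative only)."""

    def __init__(self, ports, aging=MAC_AGING_SECONDS):
        self.ports = set(ports)
        self.aging = aging
        self.entries = {}  # mac -> (port, time the MAC was last seen as a source)

    def learn(self, mac, port, now):
        """Record that a frame sourced from `mac` arrived on `port` at time `now`."""
        self.entries[mac] = (port, now)

    def forward(self, dst_mac, in_port, now):
        """Return the ports a frame for `dst_mac` is sent out of."""
        entry = self.entries.get(dst_mac)
        if entry is not None and now - entry[1] <= self.aging:
            return [entry[0]]                      # known destination: one port
        return sorted(self.ports - {in_port})      # unknown or aged out: flood


# Hypothetical core1 view of the server VLAN; "svi" stands for the routed interface.
core1 = VlanMacTable(["acc1", "acc2", "acc3", "svi"])
core1.learn("B", "acc2", now=0)                 # B answers core1's ARP request once
early = core1.forward("B", "svi", now=10)       # entry still fresh
late = core1.forward("B", "svi", now=400)       # B's replies went via core2: aged out
print(early, late)
```

In the symmetric case, B's replies would pass through the same switch and refresh the entry before it expires, so the second lookup would still return a single port.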

vdadlaney Mon, 11/02/2009 - 12:34

Hi Imran,


As Joseph has suggested, this could most likely be an oversubscription issue.


1. How many switches are you spanning the VLAN across?


2. What does your show spanning-tree output look like for a VLAN on the switches?


Since this is most likely a V-shaped topology from an access switch's perspective, I have found that GLBP might not help when VLANs are spanned across switches: one of the uplinks will be blocking, so you will be utilizing only one of the uplinks and the traffic pattern will be quite awkward. Please feel free to correct me if the above is incorrect. Thx
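
The awkward traffic pattern can be sketched in a few lines of Python. This is a simplified model, not a description of the actual network: the host names, the switch labels "core1" and "core2", and both function names are hypothetical. It assumes GLBP hands out forwarders in round-robin order (its default load-balancing method) and that the access switch's uplink to core1 is forwarding while the uplink to core2 is blocked by spanning tree.

```python
from itertools import cycle


def glbp_round_robin(hosts, gateways):
    """Assign each host a forwarder in round-robin order."""
    forwarders = cycle(gateways)
    return {host: next(forwarders) for host in hosts}


def l2_path_to_gateway(gateway, forwarding_uplink):
    """List the switches a frame crosses from the access switch to its gateway."""
    path = ["access", forwarding_uplink]
    if gateway != forwarding_uplink:
        path.append(gateway)  # extra hop across the core-to-core EtherChannel
    return path


hosts = ["pc1", "pc2", "pc3", "pc4"]
assignment = glbp_round_robin(hosts, ["core1", "core2"])
paths = {host: l2_path_to_gateway(gw, forwarding_uplink="core1")
         for host, gw in assignment.items()}

for host, path in paths.items():
    print(host, " -> ".join(path))
```

In this model, half of the hosts reach their gateway only by crossing the core-to-core link, so the second uplink adds no capacity while the EtherChannel carries extra traffic.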

imranfbhatti Tue, 11/03/2009 - 00:55

Hi vdadlaney


I don't think this is an oversubscription issue, as I have not seen much traffic on the uplink/trunk ports.

Attached is the spanning-tree output from an access layer switch.

The sh platform health output is also attached; I don't think CPU load is the issue.



imranfbhatti Tue, 11/03/2009 - 01:01

Dear Joseph, in our case routing is performed on both 4506 switches.

On the 4506, how can we check for unicast flooding?


If unicast flooding exists, how can we avoid it?

vdadlaney Tue, 11/03/2009 - 08:14

Hi Imran,


Per your spanning-tree output it seems that VLAN4 is the only one with active ports. In that VLAN you have one of the uplinks in a blocking state. Can you provide an output of "show interface G0/1" and "show counters G0/1" or an equivalent command that gives the errors on that port. In addition in VLAN 4 you have a port in "shr" status. Can you confirm what is connected to that port? thx

imranfbhatti Tue, 11/03/2009 - 08:57

Dear vdadlaney

Thank you very much for your support.


Attached below is the output of sh int.


The port with shr status is down now, but I think it is a Linksys wireless access point (I am not sure).


I have not seen any errors on this port; rather, I observe some errors (output drops) on the core switch port that is connected to the above switch.


Today we got a lot of errors in SolarWinds,

and when I check the ports, the errors are output drops.

Thanks




Giuseppe Larosa Tue, 11/03/2009 - 10:15
User Badges:
  • Super Silver, 17500 points or more
  • Hall of Fame,

    Founding Member

Hello Imran,

I've looked at your attached files, sh platform health and the errors web page.


I think what we see confirms that you are facing some output drops caused by the performance limitations of the line cards in use.


platform health shows average BW usage for line card 2 at 72%.


The errors web page shows some non-zero tx errors on selected ports on module 2.


Note that ports in the same 8-port group are affected in the same way.


Also, from a practical point of view, if the ratio between output drops and sent packets on all ports is like the one we see on gi2/3:


53239845 packets output

Total output drops: 4168


this means a drop probability of

7.83e-5


I would say you are still fine and that these errors should not impact performance severely.
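
For anyone who wants to repeat the arithmetic on other ports, here is a minimal Python sketch using the two gi2/3 counters quoted in this post. The variable names are illustrative, and the ratio is taken as drops divided by packets output, which is an approximation that ignores the dropped packets themselves.

```python
packets_output = 53_239_845   # "packets output" counter on gi2/3
output_drops = 4_168          # "Total output drops" counter on gi2/3

# Approximate per-packet drop probability and its inverse.
drop_probability = output_drops / packets_output
packets_per_drop = packets_output / output_drops

print(f"drop probability: {drop_probability:.3e}")
print(f"about one drop every {packets_per_drop:,.0f} packets")
```

Keep in mind that these counters are totals since they were last cleared, so drops concentrated in short bursts can hurt more than the average suggests.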


There are other aspects also to be considered, like:

Data traffic may get dropped when Dynamic Buffer Limiting (DBL) is enabled on Catalyst 4500 switches. This problem may manifest as performance issues with TCP-based applications.


http://www.cisco.com/en/US/docs/switches/lan/catalyst4500/release/note/OL_5184.html


CSCsk07525


Unicast flooding can play a role if, as Joseph suggests, you see unexpected unicast traffic on a switch port.


Hope to help

Giuseppe


Joseph W. Doherty Tue, 11/03/2009 - 09:04
User Badges:
  • Super Bronze, 10000 points or more

"In case of 4506 how can we check unicast flooding? "


Sniff your LAN, looking for unicast traffic where you wouldn't expect it, i.e. on any port other than the ports of the two communicating hosts and the ports that interconnect the switches between them.
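
As a small illustration of that check, the Python sketch below filters a list of destination MAC addresses, as you might export them from a capture on a host port, and keeps only unicast frames not addressed to the hosts expected on that port. The function names and the sample addresses are hypothetical; the only protocol fact relied on is that a MAC address is unicast when the lowest bit of its first octet is 0.

```python
def is_unicast(mac):
    """A MAC address is unicast when the lowest bit of its first octet is 0."""
    first_octet = int(mac.split(":")[0], 16)
    return first_octet & 1 == 0


def unexpected_unicast(dst_macs, expected_macs):
    """Return unicast destinations that do not belong to the hosts on this port."""
    expected = {m.lower() for m in expected_macs}
    return [m for m in dst_macs if is_unicast(m) and m.lower() not in expected]


# Hypothetical capture on a host port whose only attached MAC is 00:11:22:33:44:55.
captured = [
    "ff:ff:ff:ff:ff:ff",   # broadcast, expected everywhere
    "01:00:5e:00:00:66",   # multicast, not a sign of unicast flooding
    "00:11:22:33:44:55",   # traffic for the local host
    "00:aa:bb:cc:dd:ee",   # unicast for another host: a flooding suspect
]
suspects = unexpected_unicast(captured, ["00:11:22:33:44:55"])
print(suspects)
```

A few such frames right after a topology change are normal; a steady stream of them is what points to a flooding problem.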


"If unicast flooding exists , how to avoid this ?"


Identify the cause, then fix or avoid the cause. For instance, if the issue is asymmetrical routing, ensure one L3 core switch is the gateway for all VLANs.

imranfbhatti Tue, 11/03/2009 - 09:13

"insure one L3 core switch is gateway for all VLANs."


So how do we ensure redundancy in this case?

Secondly, is HSRP better than GLBP in this case? Or should one L3 switch do the routing and also be the root bridge for all VLANs?


Joseph W. Doherty Tue, 11/03/2009 - 09:43
User Badges:
  • Super Bronze, 10000 points or more

"so how do we insure the redundancy in this case . "


As you've already guessed, HSRP rather than GLBP.


"secondly in this case HSRP is better than GLBP ?"


It could be, if in fact you do have asymmetrical routing and moving from GLBP to HSRP precludes it.


There may be other approaches, such as adjusting MAC timers, but unless you really need the performance of a 2nd gateway (do you?), I think the primary/secondary approach is a bit simpler.


"or one l3 switch for routing and same switch is the root bridge for all vlans ? "


It probably would make sense for the primary L3 switch to be the root bridge.


PS:

Note that I don't know whether this is your problem. You haven't posted enough detailed information, but with two core L3 switches and GLBP, it might be an issue (as might spanning VLANs across multiple switches; not doing so makes unicast flooding a non-issue, but that might be an even bigger topology change for you).
