
HSRP Problem

lejack99
Level 1

My core router keeps flapping some of its VLAN interfaces over to the standby router. Something must be preventing the HSRP hello messages from being sent by the active router or received by the standby router. I see a large number of input queue drops on that VLAN interface on both routers.

Please help!


subbarao.s
Level 1

Could you explain more about how the core switches are configured and interconnected? For example, the VLAN trunks (bandwidth), hardware model, etc.

Sorry for the lack of info.

There are 2 core routers in each of the two Cat6509 switches, which means 4 cores are participating in HSRP. Example below for VLAN 100:

CAT 6509

Core 1 : HSRP Priority 100

Core 2 : HSRP Priority 90

CAT 6509

Core 3 : HSRP Priority 80

Core 4 : HSRP Priority 70

The cores are trunked to the switches, and the switches are interconnected with GEC.

EIGRP errors show up as well, and some peers have restarted.

From the above, Core 1 will be active and Core 2 will be standby; Core 3 and Core 4 will just be in the listen state.
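As a rough illustration of the role assignment described here, the following Python sketch ranks the four cores by the priorities posted for VLAN 100. It is a simplified model, not HSRP itself: it assumes all routers are up, that preemption is enabled, and that no two priorities tie (real HSRP breaks ties on IP address). The function name `hsrp_roles` is made up for this example.

```python
def hsrp_roles(priorities):
    """Map each router to an HSRP role based on priority alone.

    Simplifying assumptions (not from the thread): every router is up,
    preemption is enabled, and priorities are unique.
    """
    ranked = sorted(priorities, key=priorities.get, reverse=True)
    roles = {}
    for rank, name in enumerate(ranked):
        if rank == 0:
            roles[name] = "active"
        elif rank == 1:
            roles[name] = "standby"
        else:
            roles[name] = "listen"
    return roles


# Priorities as posted for VLAN 100.
vlan100 = {"Core 1": 100, "Core 2": 90, "Core 3": 80, "Core 4": 70}
roles = hsrp_roles(vlan100)

# If Core 1 stops being heard, the remaining routers move up one step.
without_core1 = {k: v for k, v in vlan100.items() if k != "Core 1"}
roles_after_failure = hsrp_roles(without_core1)
```

Under these assumptions, losing hellos from Core 1 promotes Core 2 to active and Core 3 to standby, which matches the flapping pattern being asked about.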

Do you mean the flapping happens between Core 1 and Core 2? In the other post you stated that one of the interfaces in the GEC is over-utilized. If so, you could end up with 2 active core routers at the same time (Core 1 and Core 3). Is that the problem you are facing?

If both Core routers (Core1 and Core2) are in the same chassis, HSRP hellos need not traverse the GEC.

I guess some flooding is happening between the core switches, originated by one core switch and destined to the other. Since GEC load balancing is by default a hash on source + destination IP (if I remember correctly; please confirm it from the show command), this traffic always uses the same interface in the channel. You should also check for any Layer 2 loop.

Is it possible for you to disconnect the over-utilized interface from the channel? If some other interface then becomes over-utilized after the disconnect, it is certainly something to do with switch misconfiguration.
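To illustrate why a hash on source + destination IP keeps one conversation on one link, here is a minimal Python sketch. It is a simplified model and not the exact Catalyst algorithm: it XORs the two addresses, keeps the low three bits, and maps the result onto the number of links. The function names, IP addresses, and traffic rate are all hypothetical.

```python
import ipaddress


def pick_link(src_ip, dst_ip, num_links):
    """Pick a channel member from a src/dst IP pair (simplified XOR hash)."""
    src = int(ipaddress.ip_address(src_ip))
    dst = int(ipaddress.ip_address(dst_ip))
    # Low three bits of the XOR give 8 buckets, folded onto the links.
    return ((src ^ dst) & 0x7) % num_links


def link_load(flows, num_links):
    """Sum the offered load (Mbps) that each link is asked to carry."""
    load = [0.0] * num_links
    for src, dst, mbps in flows:
        load[pick_link(src, dst, num_links)] += mbps
    return load


# One hypothetical heavy flow offering 1200 Mbps to a 4-port channel.
heavy_flow = [("10.1.100.10", "10.1.200.20", 1200.0)]
load = link_load(heavy_flow, 4)
```

In this model the whole 1200 Mbps is offered to a single member link and the other three stay idle. The excess does not spill over to a second port, because the same address pair always hashes to the same link.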

Hello,

Did you check for a potential DoS attack? Flooding can be caused by that, assuming you have not modified the configuration or topology.

I am not sure whether 100% means the GEC is over-utilized, but it is definitely not a good number to see on the switch. Please advise on the meaning of the sh channel traffic output.

Thanks

You are right, the default frame distribution is S+D IP. Does that mean the channel is not distributing frames at all, even though I have a 4-port GEC? The show channel traffic output shows one port reaching 100% and the rest at 0% or 0.xx%.

Let's say there is 120% of unicast traffic going through the GEC. Will the 1st port carry 100% and the 2nd port carry the remaining 20% of the load?

How should I configure the GEC to distribute frames equally? Is there any good documentation on the web?

I appreciate your help.

Under normal conditions, the default load-balancing technique achieves optimum utilization in most environments. Here, however, only one interface in the channel is being used, so I suspect the traffic causing the over-utilization is not legitimate.

First of all, you have to sniff the traffic with SPAN/RSPAN; then we can come to some conclusion.

I don't think it is a good idea to change the channel load-balancing technique without a concrete analysis of your traffic patterns.

As I said in my previous post, the traffic need not be generated by a core switch. Even a DoS attack can get you into this kind of trouble. Sniffing the traffic is the only next step.

I put a sniffer on and found that one of the servers is flooding a huge amount of traffic to every port on the network. I think this is the cause of the problem, for the 2 reasons below.

1. A server on VLAN 100 is using a TCP connection to send traffic to clients on VLAN 100 and VLAN 200; however, I am receiving that traffic on other VLANs as well. There is some misconfiguration on the server.

2. TCP is supposed to be one-to-one communication, but it is being flooded like real broadcast traffic. The MSFC cannot handle that much traffic and keeps dropping packets from its input queue until HSRP fails over to the second MSFC. That is why I see the cores flip-flopping.
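The link between input queue drops and the flip-flopping can be sketched with the HSRP timers. The Python model below assumes the default timers (hello every 3 seconds, hold time 10 seconds), which the thread does not confirm for this network, and it only looks at the standby router's view. The function name and the delivery patterns are hypothetical.

```python
HELLO_INTERVAL = 3  # seconds, HSRP default (assumed for this network)
HOLD_TIME = 10      # seconds, HSRP default (assumed for this network)


def standby_takeovers(hello_delivered,
                      hello_interval=HELLO_INTERVAL,
                      hold_time=HOLD_TIME):
    """Return the times (s) at which the standby's hold timer expires.

    hello_delivered holds one boolean per hello interval: True if that
    hello survived the input queue, False if it was dropped.
    """
    takeovers = []
    last_heard = 0
    for i, delivered in enumerate(hello_delivered, start=1):
        now = i * hello_interval
        if delivered:
            if now - last_heard > hold_time:
                # The hold timer ran out before this hello arrived.
                takeovers.append(last_heard + hold_time)
            last_heard = now
    end = len(hello_delivered) * hello_interval
    if end - last_heard > hold_time:
        takeovers.append(last_heard + hold_time)
    return takeovers


brief_loss = standby_takeovers([True, False, False, True])
longer_loss = standby_takeovers([True, False, False, False, True])
```

With these timers, two consecutive lost hellos are tolerated, but three in a row let the hold timer expire and the standby takes over. Repeated bursts of drops would therefore show up as repeated failovers.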

I will talk to the developer for further investigation.