6500 Sup2/MSFC2 Freeze for no reason


Hi Forum,


I have a 6506-E unit with dual Sup2/MSFC2 running Native IOS 122-18.SXF8 in SSO mode. Every once in a while the unit freezes for a very short period (seconds) but leaves no record in the logs or CLI outputs.


We have decreased the interval of the SNMP status polls to 30-60 sec but still no luck. Are there any commands to help me troubleshoot this issue?


Have you seen this issue, or do you have any ideas about what can be done to narrow down its source?


I have other chassis with identical configuration behaving ok.


Many thanks


G.

Giuseppe Larosa Tue, 12/01/2009 - 03:32

Hello Goncalo,


What exactly do you mean by "the device freezes"?


If you mean that a telnet session freezes, it can be a sign of high CPU usage. Use:


sh proc cpu | inc util


sh proc cpu sorted 1min
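
Since the freezes last only seconds, a one-off snapshot can easily miss them. As a possible complement (a sketch, assuming the running release supports these commands), the CPU history graphs can show short spikes after the fact, and the same checks can be run against the switch processor (SP), whose CPU is separate from the MSFC's:

```text
! ASCII graphs of CPU load over the last 60 seconds, 60 minutes and 72 hours
show processes cpu history

! Run the same CPU checks on the switch processor (SP) from the RP prompt
remote command switch show processes cpu sorted 1min
remote command switch show processes cpu history
```

In the "CPU utilization for five seconds: X%/Y%" line, Y is the share spent at interrupt level, which usually points to traffic being punted to the CPU rather than to a busy process.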


Hope to help

Giuseppe

Hi Giuseppe,


Thanks for your response.


I have not seen anything like this in the past, at least not without it leaving a trace or an error message.


All system logs and CLI show commands look clean.


We use the 6500 with dual Sup as a L2 switch. We lost connections for a very brief period (<30sec) on traffic going via this switch.


We know it is the switch that stopped forwarding traffic, as we have multiple devices connected to it and all of them failed at the same moment.


The SNMP CPU/Mem utilisation stats look clean: very low utilisation.


Regards


Goncalo

Giuseppe Larosa Tue, 12/01/2009 - 06:35

Hello Goncalo,


>> We lost connections for a very brief period (<30sec)


My first idea is spanning tree: the length of the outage is similar to the STP timers.


What do you see in show spanning-tree detail? For each VLAN there should be a line reporting when the last topology change occurred.
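
To make that check quicker, the output can be filtered down to the relevant lines. This is only a sketch of one commonly used filter, not necessarily the exact command meant above:

```text
! Per VLAN: the STP mode, the number of topology changes, when the last
! one occurred, and the port it was received from
show spanning-tree detail | include ieee|occurr|from|is exec
```

With the default 802.1D timers (forward delay 15 sec), a port spends about 30 sec in listening and learning before it forwards again, which is why an outage of roughly that length points at STP. If the "last change occurred" time matches one of the freezes, the "from" port shows where to look next.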



Hope to help

Giuseppe

Sorry for the late update, I forgot to post an update on this thread.


We investigated several things and narrowed it down to the MSFC being over-utilised because it was processing too much "unwanted" multicast and broadcast traffic.


The way to work out what traffic is flooding your MSFC is to create a SPAN session from the SP/MSFC to a sniffer port.


See the procedure for Sup 720 below (on Sup2 or modules with DFC the procedure is a little different):


From RP (connect to SP):


1. remote login switch


Once you are on SP:


2. test monitor add 1 rp-inband both

3. test monitor del 1 rp-inband both   (presumably only once the capture is finished, to remove the inband source again)


Back on RP:


5. create a monitor session whose source is an admin-down/unused port, using the same monitor session number as the SP monitor

6. create a destination monitor session to the sniffer port with the same session number as the SP session
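

The RP-side steps above might look roughly like this. It is a minimal sketch only: the interface names are hypothetical placeholders (Gi1/1 standing in for the unused, admin-down port and Gi1/2 for the port the sniffer is attached to), and session number 1 is assumed to match the one used in the SP test monitor command:

```text
configure terminal
 ! Gi1/1 = hypothetical unused/admin-down port used as a dummy source
 monitor session 1 source interface GigabitEthernet1/1
 ! Gi1/2 = hypothetical port where the sniffer is connected
 monitor session 1 destination interface GigabitEthernet1/2
end

! Check that the session is defined as expected
show monitor session 1
```

Using a shut-down port as the source means the sniffer sees only the traffic added on the SP side, not regular port traffic.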


Once the session is capturing, launch the sniffer and you can see the top talkers (in my case multicast and broadcast). You will then need to filter any unwanted multicast/broadcast or do a little re-design as required.


Good luck and thanks for your feedback.


Goncalo
