c3750 not manageable

Answered Question
Nov 27th, 2009

Hi All!

I started experiencing a weird problem with one of my 3750s today. Connectivity to the management IP has become sporadic.

I can ping it and get maybe 2-3 responses, then a couple of minutes with no responses. However, the switch is still passing traffic normally, and users connected to it are not experiencing any symptoms.


The switch is running IOS 12.2(44)SE2 and is connected via an 802.1Q trunk to a 2801 with only 2 VLANs. There's really not a lot of traffic going through this switch. I can't connect to the switch from the directly connected router or from elsewhere on my network.


I sat with my finger on the "Get Tree" button in my SNMP MIB Browser, to try gathering some data when it became available to begin narrowing down where the issue could be. Here's what I've found / checked so far:

  • CPU (via CISCO-PROCESS-MIB):
    • around 5-10% for all 3 values (5 sec, 1 min, 5 min)
  • Memory:
    • There is plenty of memory free, nothing out of the ordinary
  • Syslog:
    • Log messages are not getting to the Syslog server, but the few I've pulled via SNMP look normal. Mostly just interface state changes, which is normal at this location.
  • Interfaces (via MIB-2 interfaces table):
    • No input or output errors on the uplink port or on any other port for that matter; it looks very clean.
  • ARP:
    • Verified that the ARP entry on the 2801 for the switch's IP is staying the same.
  • Other Observations:
    • I was able to connect for a few seconds and noticed NTP wasn't syncing.
    • When the switch is replying to Pings, response times are high (300 - 1900[?!]ms). I verified those high ping response times were not due to latency on the location's WAN connection (T1 is at low utilization)
    • Issue started today at some point. No changes were made to the switch or environment at that location.
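
For anyone wanting to put numbers on the "2-3 replies, then a couple of minutes of nothing" pattern described above, here is a minimal Python sketch. It assumes you have already collected one round-trip time per probe, with None for a probe that got no reply. The function name and the sample values are made up for illustration; they are not measurements from this switch.

```python
from typing import Optional, Sequence


def summarize_pings(rtts_ms: Sequence[Optional[float]]) -> dict:
    """Summarize a series of ping probes; None marks a probe with no reply."""
    replies = [r for r in rtts_ms if r is not None]

    # Track the longest run of consecutive unanswered probes.
    longest_gap = 0
    current_gap = 0
    for r in rtts_ms:
        current_gap = current_gap + 1 if r is None else 0
        longest_gap = max(longest_gap, current_gap)

    sent = len(rtts_ms)
    return {
        "sent": sent,
        "received": len(replies),
        "loss_pct": 100.0 * (sent - len(replies)) / sent if sent else 0.0,
        "longest_gap": longest_gap,
        "min_ms": min(replies) if replies else None,
        "max_ms": max(replies) if replies else None,
    }


# Hypothetical probe results (ms), not real data from the thread.
samples = [310.0, 1900.0, None, None, None, None, 450.0, None]
summary = summarize_pings(samples)
print(summary)
```

Comparing the longest gap and the RTT range before and after a reload would give a rough way to tell whether the reload actually changed anything.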


I'm not able to reboot the device at the moment. I hope to do that this weekend, but I was curious whether anyone has seen anything similar and found a fix other than rebooting. I'm not even sure rebooting will help. It's odd that I can't manage the switch while it is still passing voice and data traffic like it was yesterday. Thanks

Leo Laohoo Fri, 11/27/2009 - 13:55

Can you post the config of your router and the switch?  It sounds like the switch's default-gateway could be incorrect.

Elly Bornstein Fri, 11/27/2009 - 16:55

Everything you described sounds like a control plane issue, which is usually caused by high CPU or congestion.


1) manually look at the CPU usage and history on the box

2) do a 'show interface | i drops|line' and look for any output drops on uplinks

3) follow the high CPU troubleshooting guide for the 3750 to gather some command output off the console if you plan to open a TAC case, so they have something to go on.

4) really would need to examine a 'show tech' to figure this out and wouldn't hurt to do a SPAN session of the CPU.
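
If the output of the command in step 2 is saved to a text file, a short Python sketch like the one below could flag the ports with non-zero output drops. This is only a rough illustration: the function name is made up, and the sample text is invented to resemble the usual IOS layout rather than captured from a real switch, so the patterns may need adjusting for your platform and release.

```python
import re

# Matches e.g. "GigabitEthernet1/0/1 is up, line protocol is up (connected)"
LINE_RE = re.compile(r"^(\S+) is .*line protocol is")
# Matches the counter at the end of the "Input queue" line
DROPS_RE = re.compile(r"Total output drops: (\d+)")


def output_drops(text: str) -> dict:
    """Return {interface: drops} for interfaces with non-zero output drops."""
    drops = {}
    current = None
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            current = m.group(1)  # remember which interface we are in
            continue
        m = DROPS_RE.search(line)
        if m and current is not None:
            drops[current] = int(m.group(1))
    return {name: count for name, count in drops.items() if count > 0}


# Invented sample in the assumed output format; not real device output.
SAMPLE = """\
Vlan1 is up, line protocol is up
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
GigabitEthernet1/0/1 is up, line protocol is up (connected)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 1532
GigabitEthernet1/0/2 is down, line protocol is down (notconnect)
  Input queue: 0/75/0/0 (size/max/drops/flushes); Total output drops: 0
"""

flagged = output_drops(SAMPLE)
print(flagged)
```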


All this stuff is in the high CPU guide for 3750s:

http://www.cisco.com/en/US/products/hw/switches/ps5023/products_tech_note09186a00807213f5.shtml


HTH,


Elly

Correct Answer
glen.grant Fri, 11/27/2009 - 16:56

We saw something kind of similar in a 6-switch stack. But our problem appeared to be that the stack master had a slow memory leak, and occasionally we would not be able to manage the stack for a while, then it would come back. Switch and port utilization, once we got in, was low across all 6 switches. I'm speculating in our case: this box is used as a jump box into the rest of the net, and maybe it was not releasing memory resources once we got out. We were also getting low free memory warnings from an SNMP monitoring tool. We reloaded the entire 6-switch stack and so far we have not seen a recurrence. I think we are at 12.2.35SE something.
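
A slow leak like the one described above tends to show up as a steady downward trend in free memory long before a low-memory warning fires. Below is a minimal Python sketch that fits a least-squares line to (time, free bytes) samples. It assumes your poller already records free memory, for example from ciscoMemoryPoolFree in CISCO-MEMORY-POOL-MIB. The function name and the sample numbers are hypothetical.

```python
from typing import Sequence, Tuple


def free_memory_slope(samples: Sequence[Tuple[float, float]]) -> float:
    """Least-squares slope of free memory over time, in bytes per second.

    samples: (timestamp_seconds, free_bytes) pairs.
    A clearly negative slope sustained over days hints at a leak.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    num = sum((t - mean_t) * (f - mean_f) for t, f in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("all samples share the same timestamp")
    return num / den


# Hypothetical hourly samples: free memory shrinking by 360,000 bytes per hour.
history = [
    (0, 40_000_000),
    (3600, 39_640_000),
    (7200, 39_280_000),
    (10800, 38_920_000),
]
slope = free_memory_slope(history)
print(f"free memory trend: {slope:.1f} bytes/sec")
```

A flat or noisy slope would point away from a leak and back toward the control-plane checks suggested earlier in the thread.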
