Hi everyone, we've run into an issue with our SG300 switches recently. Since upgrading to firmware 18.104.22.168, I've noticed that they stop responding to ARP requests for minutes at a time, which causes them to be flagged as down in our monitoring system. If you connect to the management interface from another PC that already has the proper ARP table entry, the management interface works fine.
Here's what I've tried in troubleshooting:
-Checked CPU utilization. It doesn't seem to be that, since it never goes above 30%.
-Checked the logs. Nothing in there, even with debugging on.
-Factory reset and reconfigured. Same issue.
We have more than 5 of these switches so we know it's not just one bad one. It seems to be a firmware bug, honestly. Has anyone else run into this?
Hi Ruben, there were previous caveats on other releases where certain conditions would cause the management interface to hang. One of the supposedly fixed bugs was with IPv6 SNTP (CSCuh50141), which caused the hang. I'd recommend you open a ticket with the SBSC for further investigation to see if there are any known active bugs on it. In the meantime, the time settings would be a good place to start, since there's at least 1 historical bug for it.
Please mark answered for helpful posts
I would agree with Tom that a call to support may be your best option.
One thing I would check first though is did you also upgrade the boot code? I have seen some very odd problems, especially with the various tables on the switch, when the boot code is not updated along with the firmware.
Double-check that, and then, as Tom said, go ahead and factory default and reconfigure the switch. It will be one of the first things we do if you decide to call in anyway.
Hope that helps, but if you do need any assistance don't hesitate to give us a call.
Adding a static ARP entry for the switch's MAC seemed to help quite a bit. I tested on 2 switches. One had only 3% ping drops and the other 0% over about an hour. Quite a bit better than the previous 30% loss. So it definitely seems that the switch is just not answering ARP requests, rather than the management interface becoming totally unresponsive.
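For anyone who wants to repeat the before/after comparison, here is a minimal sketch of pulling the loss percentage out of a ping summary. The function name and the sample summary line are made up for illustration, and it assumes the usual Linux ("N% packet loss") or Windows ("(N% loss)") summary wording, so adjust the pattern if your ping prints something different.

```python
import re

# Matches the loss figure in the summary line of common ping
# implementations: "3% packet loss" (Linux) or "(3% loss)" (Windows).
# This pattern is an assumption about the output format.
_LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)%\s*(?:packet\s+)?loss")


def packet_loss_percent(ping_output: str) -> float:
    """Return the packet-loss percentage reported in ping's summary."""
    match = _LOSS_RE.search(ping_output)
    if match is None:
        raise ValueError("no packet-loss summary found in ping output")
    return float(match.group(1))


if __name__ == "__main__":
    # Illustrative summary line, not real output from these switches.
    sample = "3600 packets transmitted, 3492 received, 3% packet loss, time 3599012ms"
    print(packet_loss_percent(sample))
```

Running the same check with and without the static ARP entry on the monitoring host should show whether the drops are really down to unanswered ARP requests.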
I opened a support case, but it was closed since the devices we have are no longer covered by the 1 year of tech support. I called in and was told the same thing.
It seems to me that potential firmware bugs should be investigated regardless of support contract status, since one of the features these switches are marketed with is free firmware updates. And the thing is that I'm trying to help Cisco out here more than I am trying to have Cisco help me, especially considering this bug seems like it would affect many customers if it is in fact firmware related. The switches do not answer ARP requests for about 5 minutes at a time. That would lead a lot of customers to believe their switches are defective when they try to get to the web interface or use a monitoring program, and would probably lead to an RMA.
So I found out the issue after toiling away one long night.
Apparently the issue was that the management software on the switches didn't like NetBIOS packets querying for WPAD (Web Proxy Auto-Discovery). Kaspersky AV was causing the machines to send these out pretty often. So I edited the policy that gets pushed to all of our workstations and told Kaspersky not to use a proxy at all, and that solved the issue.
Maybe it's just all NetBIOS broadcasts that the management software doesn't like. I didn't get a chance to test it out.
Really weird, but just in case anyone else runs into this - that was the fix.
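If you want to check whether the same kind of traffic is on your network, NetBIOS name queries go out over UDP port 137, and the queried name is carried in the first-level encoding from RFC 1001 (each byte split into two nibbles, each added to 'A'). Here is a rough sketch of that encoding so you can recognize a WPAD query in a packet capture. The function names are my own, and the 0x00 suffix byte is an assumption about how these particular queries are sent.

```python
from typing import Tuple


def encode_netbios_name(name: str, suffix: int = 0x00) -> str:
    """First-level encode a NetBIOS name (RFC 1001) into 32 characters."""
    # Names are upper-cased, space-padded to 15 bytes, then a suffix byte.
    raw = name.upper().encode("ascii")[:15].ljust(15, b" ") + bytes([suffix])
    # Each byte becomes two letters: 'A' + high nibble, 'A' + low nibble.
    return "".join(chr(0x41 + (b >> 4)) + chr(0x41 + (b & 0x0F)) for b in raw)


def decode_netbios_name(encoded: str) -> Tuple[str, int]:
    """Reverse the encoding; returns (name, suffix byte)."""
    if len(encoded) != 32 or any(not "A" <= c <= "P" for c in encoded):
        raise ValueError("not a first-level encoded NetBIOS name")
    raw = bytes(
        ((ord(encoded[i]) - 0x41) << 4) | (ord(encoded[i + 1]) - 0x41)
        for i in range(0, 32, 2)
    )
    return raw[:15].decode("ascii").rstrip(" "), raw[15]


if __name__ == "__main__":
    wpad = encode_netbios_name("WPAD")
    print(wpad)
    print(decode_netbios_name(wpad))
```

The encoded string is what shows up inside the query packet, so searching a capture of UDP 137 traffic for it is one way to see which machines are sending these out and how often.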