I have a Cisco 3550-12G switch experiencing high CPU loads. For a long time (years?), CPU load has been minimal (a couple of percent; this is all graphed by our monitoring systems). About 4 weeks ago, the switch rebooted, possibly due to some power work being done in the same building. Ever since then, CPU is way above baseline and is causing alarms with our monitoring.
IOS is IP Services 12.2(25)SEE2. See attachment for show proc cpu output.
A few minutes ago, the 5-second CPU was about 76%/36%, with the HMATM Learn process taking about 36% of the CPU. Now it is 66%/39%, with 25% going to this process.
Again, it varies, but it is well above what baseline was before the reboot.
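For anyone reading the figures above: in the header line of show proc cpu, the first number is total CPU over the last five seconds and the number after the slash is the share spent at interrupt level, so the difference is what the scheduled processes (HMATM Learn among them) consumed. A minimal Python sketch of that split follows. The one-minute and five-minute values in the sample line are placeholders, not readings from this switch, and `split_cpu` is a hypothetical helper name.

```python
import re

# Header line printed at the top of "show processes cpu" on IOS.
CPU_LINE = re.compile(
    r"CPU utilization for five seconds:\s*(\d+)%/(\d+)%;"
    r"\s*one minute:\s*(\d+)%;\s*five minutes:\s*(\d+)%"
)


def split_cpu(line):
    """Split the five-second figure into interrupt and process shares."""
    m = CPU_LINE.search(line)
    if m is None:
        raise ValueError("not a CPU utilization header line")
    total, interrupt, one_min, five_min = map(int, m.groups())
    return {
        "total": total,
        "interrupt": interrupt,
        "process": total - interrupt,  # time spent in scheduled processes
        "one_minute": one_min,
        "five_minutes": five_min,
    }


# 76%/36% is from the post above; 70% and 68% are made-up placeholders.
sample = "CPU utilization for five seconds: 76%/36%; one minute: 70%; five minutes: 68%"
print(split_cpu(sample))
```

Run against the two readings above, this gives a process-level share of 40% and 27% respectively, with the rest of the load at interrupt level.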
I saw that, but nothing in there referenced the HMATM Learn proc. (It was included in sample outputs, but CPU was always 0% and not mentioned as a cause.)
The switch has received over 14,000,000 broadcasts on one of the interfaces (a link to a downstream L2 4003). However, I don't know whether that was normal prior to the reboot. I learned that HMATM is the Hardware MAC Address Table Manager. But is there really anything to "manage" when a broadcast comes through? I watched the MAC table for a period of time (every 10 or 15 seconds for about 10 minutes) and never saw a significant change in the total number of MAC addresses on the switch. The total for the system ranged from about 96 to 108.
I work better by knowing what I'm working with. Exactly what does this HMATM Learn proc do? What are the conditions that trigger it to do something? From the bit that I've read, it seems that it adds and removes MAC addresses from the hardware table when it sees a new address or when an address expires. If I'm not seeing huge changes in the table, then why else would it be using so much processor?
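One thing a total count cannot show is an address that stays in the table but moves between ports, or one address aging out while another is learned. A rough Python sketch for comparing two saved snapshots of the MAC table follows. It assumes the usual four-column Vlan / Mac Address / Type / Ports layout, the sample entries are invented, and `parse_table` and `diff_tables` are hypothetical helper names. Whether this kind of churn is what keeps HMATM Learn busy on this switch is not established here; the sketch only makes the churn visible.

```python
import re

# Assumed row layout: "  10    0000.0000.0001    DYNAMIC     Gi0/1"
ENTRY = re.compile(
    r"^\s*(\d+)\s+([0-9a-fA-F]{4}\.[0-9a-fA-F]{4}\.[0-9a-fA-F]{4})\s+(\w+)\s+(\S+)",
    re.M,
)


def parse_table(text):
    """Map (vlan, mac) -> port for every table row found in the text."""
    return {(int(vlan), mac.lower()): port
            for vlan, mac, _type, port in ENTRY.findall(text)}


def diff_tables(before, after):
    """Return entries that were added, removed, or moved to another port."""
    added = sorted(k for k in after if k not in before)
    removed = sorted(k for k in before if k not in after)
    moved = sorted(k for k in after if k in before and before[k] != after[k])
    return added, removed, moved


# Invented snapshots: same number of entries, but not the same table.
snap1 = parse_table("""
  10    0000.0000.0001    DYNAMIC     Gi0/1
  10    0000.0000.0002    DYNAMIC     Gi0/2
  10    0000.0000.0003    DYNAMIC     Gi0/3
""")
snap2 = parse_table("""
  10    0000.0000.0001    DYNAMIC     Gi0/4
  10    0000.0000.0002    DYNAMIC     Gi0/2
  10    0000.0000.0004    DYNAMIC     Gi0/3
""")

added, removed, moved = diff_tables(snap1, snap2)
print(len(snap1), len(snap2), added, removed, moved)
```

Both invented snapshots hold three entries, yet one address was added, one removed, and one moved ports, which is the kind of change a steady total would hide.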
Finally, I don't know of any devices of ours that would be using SNAP encapsulation. I could take a network capture and see, but I doubt that will get me anywhere.
All of the other reasons mentioned in there should be showing different symptoms in the proc cpu output if they were applicable.
We are using HSRP. However, I fall back to the fact that we were using HSRP prior to the reboot and it wasn't doing this.
The reload was due to a power cycle. At least, the switch reports that it returned to ROM by power-on.
We activated a new core switch with new Sup720s within the last few weeks. This core switch is the one with which this 3550 partners in HSRP. However, everything was fine right up until the 3550 rebooted. Our graphs show that CPU use left baseline at exactly that point and has not returned. And the new core switch was online and partnered in HSRP before this reboot.
I may do a capture on this port; perhaps that will show me a misconfigured host. But this is a radiology network (I work for a hospital) and new servers don't go on it without us helping to assign ports, IP addresses, etc. So I'm reasonably certain it isn't a new server.
I asked about HSRP because on our C6509s we have seen CPU climb to 100% when two routers claim to be Active for the same group at the same time. This happened twice: first because of a misconfiguration that joined two VTP domains, which let two VLANs communicate that had been designed to be separate; second because of a problem involving monitor sessions and the FWSM.
Verify that both routers agree on which one is the active router.
If the broadcasts are ARP messages, they need to be processed by the CPU, which would explain the increase in CPU usage.