SRW2024 stops passing traffic intermittently


Hi all. I have two SRW2024 switches connected over multimode fiber with GBIC adapters. One of the switches stops passing traffic for 3-minute intervals several times per day. Yesterday it went down 10-15 times. I enabled all possible logging last week and found the following in the event log every 5 minutes:

%UPNP-W-DBOVERFLOW: UPnP retransmission DB overflow.

I power cycled both switches last night and had only one 3-minute outage today.

Has anyone seen this issue? I am using the default STP and VLAN configs (just a generic switch config) and running firmware version 1.2.1.

MikeLight Tue, 05/12/2009 - 08:01

Hi,

I doubt very much that the log message has anything to do with the traffic stoppage. This is a "Warning" level message, indicating that the switch CPU is seeing more UPnP messages than it can handle, so the CPU is warning about it.

Traffic forwarding is done by the switching chip, and should not be affected by whether UPnP is working or not.

In fact, since your log does not show a system reboot, traffic should flow even if all the software on the switch were to freeze, all at once ....

I'd look for some network issue or misconfiguration problem.

The best approach is to catch the switch "in the act", while it is not forwarding, and figure out whether traffic really is arriving at the switch and not being forwarded (e.g. by looking at the port statistics).

I'd advise you to look at using RMON to track the port-traffic counters (in and out) on some ports
that are in constant (or at least frequent) use, and create an event/alarm if traffic flow goes really low.

(RMON even has statistical sample gathering that can be useful)
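To make the idea concrete, here is a minimal Python sketch of the check an RMON alarm would perform: take periodic samples of a port's octet counter and flag the intervals where throughput drops below a threshold. The sample format, the function name and the threshold are assumptions for illustration; collecting the counters (via RMON, SNMP or the web UI) is left out, and counter wrap-around is not handled.

```python
from typing import List, Tuple

# Hypothetical sample format: (seconds since start, cumulative octet counter)
Sample = Tuple[float, int]

def find_stalls(samples: List[Sample], min_rate_bps: float) -> List[Tuple[float, float]]:
    """Return (start, end) intervals where throughput fell below min_rate_bps."""
    stalls = []
    stall_start = None
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rate = (c1 - c0) * 8 / (t1 - t0)  # bits per second over this interval
        if rate < min_rate_bps:
            if stall_start is None:
                stall_start = t0          # first slow interval opens a stall
        elif stall_start is not None:
            stalls.append((stall_start, t0))  # traffic resumed, close the stall
            stall_start = None
    if stall_start is not None:
        # Still stalled at the last sample
        stalls.append((stall_start, samples[-1][0]))
    return stalls

# Made-up counters sampled every 10 seconds on a normally busy port
samples = [(0, 0), (10, 100_000), (20, 100_000), (30, 100_000), (40, 250_000)]
print(find_stalls(samples, min_rate_bps=1000))
```

Running the same check on both the "in" and "out" counters of a port should show whether traffic stopped arriving or arrived and was not forwarded.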

Good luck ....

Mike

SRW208, SRW2016 %UPNP-W-DBOVERFLOW:

I'm getting way too many messages out of an SRW208 and an SRW2016 on the same network.  It only happens on this one network.  I've been dealing with this issue for a long time and can't find any resolution. 
I can imagine that turning off switch SNMP warnings might fix it at the system level.  But it seems odd there's no resolution at the switch level.
I have read suggestions including setting an ACL to block UPNP traffic at the switch but one would have to ask: "Why should I block ANY traffic just to get the switch to behave?".

Has anyone found a solution to this or dealt with it?

The only solution I've found is to do periodic switch reboots.  That does work briefly but it's a PITA to repeat rather often.

I did use Wireshark to monitor traffic between the two switches and found SSDP traffic to 239.255.255.250 (as expected) from virtually *all* the computers in this small network of Windows 7 machines.

Any help on this?

Moderator Tue, 05/12/2009 - 12:20

Hello,

Thanks for your post. The UPnP log message is informational and should not cause the traffic to stop. We could not identify the root cause. Could you please provide us with some more information?

1) How do you notice and measure the traffic stop and the interval?

2) What kind of traffic do you run, if any? Packet size and traffic rate?

3) Can you send the switch configuration?

Thank you,

Cisco Moderation Team

zentechconsultants Mon, 05/18/2009 - 13:37

G'day,

We have just noticed this same problem on an SRW2048: fw version 1.2.2, hw 00.03.00, boot version 1.0.0.04.

While it may have happened before, the outages are so small it could easily have gone unnoticed.

The logs are full of UPnP buffer errors that go away as soon as you reboot.

The outages yesterday ranged from approx 9 seconds to 60 seconds.

Switch is virtually standard out of the box, no fancy VLAN configs etc, only SNMP and has been in production for months.

The config will not be available due to privacy concerns and the fact you cannot "sanitise" the config in plain text.

If you are able to provide a config Text analyser we could do something.

After the reboot last night, no entries in the logs apart from the usual Up/Down and logins.

BTW, this is not a "NEW" error, it has been reported quite a few times before:

http://forums.linksys.com/linksys/board/message?board.id=Switches&thread.id=142&view=by_date_ascending&page=2

So it really should have been fixed with a FW update by now.

Rgds Ben

Likewise for us here - we had a SRW2048 that ended up with some ports going bad, so we got a new one from CDW. This unit's been in place for maybe a month or two and developed this issue over the weekend.

Serial Number:
Model Name: SRW2048
Hardware Version: 00.03.00
Boot Version: 1.0.0.04
Firmware Version: 1.2.2


I had this issue once in the past on the previous device, but it finally cleared up after several reboots and much hair pulling.

A fix would be great - this is a contributing factor in the problem we're having with our workstations taking 30-60 minutes to log on to Windows, and, frankly, the partners aren't very happy about it.

I, too, would be happy to do anything I can to help get this issue resolved.

On a side note - is there a newer hardware version? I seem to recall that the device that developed the bad ports was actually 'newer' than this one - how do I get the newest hardware version?

edit:

Our configuration is also pretty standard - the SRW2048 is our primary switch, connected to all of our servers and workstations. It's got a connection to our WG x550e firewall, another line to our RV082 router for our second internet connection, and a line to our WRVS4400N for our wireless clients. Config-wise, however, it's pretty standard. I've attached a backup copy of our config file for review.

See the following log excerpt:

1   2147465254  22-Jun-2009 13:28:23  Warning        %UPNP-W-DBOVERFLOW: UPnP retransmission DB overflow.
2   2147465255  22-Jun-2009 13:28:23  Debug          %OS-D-MEMORY: No free memory:  size = 364, pool = UPnP, task = UPNP
3   2147465256  22-Jun-2009 13:27:41  Warning        %COPY-W-TRAP: The copy operation was completed successfully
4   2147465257  22-Jun-2009 13:27:40  Informational  %COPY-I-FILECPY: Files Copy - source URL flash://startup-config.ber destination URL HTTP://0.0.0.0/
5   2147465258  22-Jun-2009 13:27:40  Warning        %COPY-W-TRAP: The copy operation was completed successfully
6   2147465259  22-Jun-2009 13:27:40  Informational  %COPY-I-FILECPY: Files Copy - source URL flash://startup-config.ber destination URL HTTP://0.0.0.0/
7   2147465260  22-Jun-2009 13:23:22  Warning        %UPNP-W-DBOVERFLOW: UPnP retransmission DB overflow.
8   2147465261  22-Jun-2009 13:23:22  Debug          %OS-D-MEMORY: No free memory:  size = 364, pool = UPnP, task = UPNP
9   2147465262  22-Jun-2009 13:19:32  Warning        %UPNP-W-DBOVERFLOW: UPnP retransmission DB overflow.
10  2147465263  22-Jun-2009 13:19:32  Debug          %OS-D-MEMORY: No free memory:  size = 284, pool = UPnP, task = UPNP
11  2147465264  22-Jun-2009 13:19:25  Warning        %UPNP-W-DBOVERFLOW: UPnP retransmission DB overflow.
12  2147465265  22-Jun-2009 13:19:25  Debug          %OS-D-MEMORY: No free memory:  size = 284, pool = UPnP, task = UPNP
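For anyone comparing their own logs, here is a small Python sketch that works out how far apart the overflow warnings are. The four timestamps are the %UPNP-W-DBOVERFLOW entries from the excerpt above; the function name is made up for illustration, and the `%b` month parsing assumes an English locale.

```python
from datetime import datetime

# Timestamps of the %UPNP-W-DBOVERFLOW warnings in the excerpt above
overflow_times = [
    "22-Jun-2009 13:28:23",
    "22-Jun-2009 13:23:22",
    "22-Jun-2009 13:19:32",
    "22-Jun-2009 13:19:25",
]

def gaps_in_seconds(stamps):
    """Return the seconds between consecutive timestamps, oldest first."""
    parsed = sorted(datetime.strptime(s, "%d-%b-%Y %H:%M:%S") for s in stamps)
    return [int((b - a).total_seconds()) for a, b in zip(parsed, parsed[1:])]

gaps = gaps_in_seconds(overflow_times)
print(gaps)  # [7, 230, 301]
```

The last two gaps are roughly four and five minutes, in line with the "every 5 minutes" pattern described in the first post.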


Regards,

-d

ssmalley77 Thu, 04/15/2010 - 17:52

Did you ever find a solution to your problem? We have just recently installed 2 x Linksys SRW2024 switches at a client's site, and PCs intermittently drop off the network; we are seeing similar issues. Does the update to firmware 1.2.2b fix this? Should we send them back to Linksys and get another switch?

Actually, the answer is sort of yes and no.

I finally broke down and called Cisco support about the issue, and the first tech I spoke with couldn't find the device in the system, so I had to forward them a copy of my original invoice so they could verify that it was still in warranty. After doing so, I called back to confirm receipt, and somehow the fax got lost, so I had to re-fax the invoice to a second technician. When I called back the third time to confirm receipt, that technician couldn't find the invoice either, but he was able to pull my device right up in their system, saw it was under warranty, and apologized for the first two techs making me forward the invoice - he doesn't know what they were looking at.

At any rate, after a decent discussion with the third tech about the issue, he put me on hold and confirmed the symptoms with one of the senior technicians - they'd actually just had another call about a similar issue, but he wanted to make sure he relayed the information correctly.

Per the senior tech, these errors:

1   2147465254  22-Jun-2009 13:28:23  Warning  %UPNP-W-DBOVERFLOW: UPnP retransmission DB overflow.
2   2147465255  22-Jun-2009 13:28:23  Debug    %OS-D-MEMORY: No free memory:  size = 364, pool = UPnP, task = UPNP

are essentially the result of a broadcast storm caused by repeated UPnP multicasts that the switch doesn't know what to do with. My apologies, but I don't recall for certain if the tech said that the switch didn't support UPnP at all, if it did support UPnP forwarding and it just couldn't figure out where to send the packets, or if I had something configured wrong in my setup (it's basically still set to factory defaults). Either way, the UPnP broadcasts are getting to the switch and filling up the buffer faster than it can be cleared out. Once the buffer is full, the switch runs out of memory, causing all sorts of network issues - packet loss, connectivity drops, and so-on.

That's the "yes" part of the answer. The "no" part for me was implementing the actual fix. The tech offered two potential solutions:

The first was to identify which device(s) on the network were responsible for the broadcasts and disable UPnP in their configuration. This could be as "simple" (depending on the size of your network) as logging in to your various devices, checking their setup, and turning UPnP off. I tried going this route and disabled UPnP anywhere there was an option to do so, but was unable to find the offending device. If you have the same issue, the recommendation was to configure Port Mirroring on the switch, one port at a time, and use something like Wireshark to capture the packets from the mirrored port. Once you're on the right port, you should see the broadcasts go out, which will tell you which device they're coming from. You then disable UPnP on that device and should be all set. If that device doesn't have the ability to disable UPnP, you'll have to use the second option offered by the tech.
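If mirroring one port at a time is too slow, a rough alternative (not what the tech described) is to listen for the multicasts directly from any PC on the same network and count who is sending them. The Python sketch below assumes the traffic is SSDP to 239.255.255.250 on UDP port 1900, as seen in the Wireshark capture mentioned earlier in this thread, and that the switch floods it to the port the PC is on. The function names are made up, and on Windows the built-in SSDP Discovery service may already hold the port.

```python
import socket
import struct
import time
from collections import Counter

SSDP_GROUP = "239.255.255.250"  # SSDP multicast address reported earlier in the thread
SSDP_PORT = 1900                # standard SSDP UDP port

def count_senders(source_ips):
    """Return (ip, packet_count) pairs, busiest sender first."""
    return Counter(source_ips).most_common()

def listen_for_ssdp(seconds=60.0):
    """Join the SSDP group, record the source IP of each packet, then tally them."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", SSDP_PORT))
    # Group address plus INADDR_ANY, so the OS picks the default interface
    mreq = struct.pack("4s4s", socket.inet_aton(SSDP_GROUP), socket.inet_aton("0.0.0.0"))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
    sock.settimeout(1.0)
    seen = []
    deadline = time.monotonic() + seconds
    try:
        while time.monotonic() < deadline:
            try:
                _data, (ip, _port) = sock.recvfrom(65507)
            except socket.timeout:
                continue  # nothing arrived in the last second, keep waiting
            seen.append(ip)
    finally:
        sock.close()
    return count_senders(seen)

# Usage (run on a PC attached to the affected switch):
#   for ip, packets in listen_for_ssdp(60):
#       print(ip, packets)
```

The addresses at the top of the list are the first candidates for having UPnP disabled.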

Option #2 was to configure the IP based ACL to deny the multicast packets. Per the tech, this can be done by adding an entry to the switch ACL that denies any traffic to 224.0.0.0/24. You may need to tweak the range being blocked, however, depending on what protocols you're using - blocking the whole /24 will also block OSPF, RIPv2, IGMPv3, mDNS, and LLMNR, as well as some others. This is where I got stuck, not because we're using any of those protocols, but because I couldn't figure out how to configure the ACL correctly - thus the "no" part of my answer.
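Before building the ACL it may be worth checking which addresses the suggested range actually covers. A quick Python sketch using the standard `ipaddress` module follows; the SSDP address is the one reported from the Wireshark capture earlier in this thread, the others are the well-known multicast addresses for the protocols listed above, and the `covered` helper is made up for illustration.

```python
import ipaddress

# Range suggested for the deny rule
acl_range = ipaddress.ip_network("224.0.0.0/24")

# Well-known multicast destinations
addresses = {
    "SSDP/UPnP": "239.255.255.250",
    "OSPF (all routers)": "224.0.0.5",
    "RIPv2": "224.0.0.9",
    "IGMPv3": "224.0.0.22",
    "mDNS": "224.0.0.251",
    "LLMNR": "224.0.0.252",
}

def covered(net, table):
    """Map each protocol name to True if its address falls inside net."""
    return {name: ipaddress.ip_address(addr) in net for name, addr in table.items()}

result = covered(acl_range, addresses)
for name, hit in result.items():
    print(f"{name:20} {'blocked' if hit else 'NOT blocked'}")
```

By this check the SSDP address falls outside 224.0.0.0/24, so a rule limited to that range would probably not stop the UPnP multicasts by itself; the range may need to include 239.255.255.250 as well.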

If you end up going the ACL route, would you do me a huge favor and post back with how you configured it? It's either that or I'll have to call Cisco support back and see if I can get a copy of the case notes, but I no longer have my case number, so that may not be possible.

Regarding the firmware update, I generally try to stay as current as I possibly can, just in case. It looks like there are multiple hardware versions for your SRW2024, so you'll have to determine which firmware to load - based on the release notes, it looks like v1.2.2b was released on 3/3/2008, and v1.2.2 was released on 1/28/2007, even though the dates shown on the website are 6/10/10 and 11/19/09 respectively. You can download the latest firmware here: http://www.cisco.com/en/US/products/ps9989/index.html.

HTH and Best of Luck!
