FWSM 4.0.5 fails to pass traffic

Feb 14th, 2010

Dear friends,

There is a pair of FWSM modules running in multi-context routed mode, in Active/Active inter-chassis failover.

The FWSM version is 4.0(5) and the 6500 is running 12.2(33)SXI1 with a Sup720-3B.

Over the past two months, the primary FWSM module has shown some weird issues, as follows:

1. The first time the issue occurred, all the VLANs on the FWSM were up, but I was not able to ping any host in any VLAN. I ran clear xlate but it did not help. I then issued no failover active group 1 and everything was normal. When I issued failover active group 1 again, I lost connectivity. I had to reload the module, and after that everything was fine.

2. The second time was just a couple of days ago, but this time the VLANs were admin-down. I asked the client whether there had been any changes and he said none. I then asked whether he had tried a no shut, and he said that under the pressure of the moment he did not think of it; he just shut down the module so that the other module would take over. When I went to the site, we just reloaded the module, and this time it came up fine when it became active.

I had no logs to analyse the cause of the failure, apart from an error message in the admin context saying "Lost communications with management vlan", which was not really helpful.

Has anyone come across FWSM VLANs suddenly going admin-down without any changes in this particular version, 4.0(5)?

When I looked in the Bug Navigator, I found only one bug somewhat related to our config, since dns-guard and sysopt np completion-unit are both enabled in the admin context in our case.

CSCsw79921
FWSM stops passing traffic when completion-unit enabled

Symptom:

The FWSM may stop processing traffic and be unmanageable except through a session from the switch.


Conditions:

The error may occur when dns-guard and the completion-unit are active simultaneously. The following are symptoms of this defect.

- show conn output may show the following error:
ERROR: np_logger_query request for Show Connections failed Network Processor 2 connections

- The show np blocks command indicates many Threshold 0 drops, and show np pc shows that all the threads are in use for one of the NPs.

Ex:
------------------ show np blocks ------------------

                MAX     FREE    THRESH_0  THRESH_1  THRESH_2
NP1 (ingress)   32768   112     17338     48        152
    (egress)    521206  509654  0         0         0
NP2 (ingress)   32768   32768   0         0         0
    (egress)    521206  521205  0         0         0
NP3 (ingress)   32768   32768   0         0         0
    (egress)    521206  521206  0         0         0

------------------ show np pc ------------------

THREAD:PC(NP1/NP2/NP3)
0:404d/0000/0000 1:404d/0000/0000 2:404d/5bd6/0000 3:404d/5bc2/0000
4:404d/0000/0000 5:404d/0000/0000 6:404d/0000/0000 7:404d/0000/0000
8:404d/0000/0000 9:404d/0000/0000 10:404d/0000/0000 11:404d/0000/0000
12:404d/0000/0000 13:404d/0000/0000 14:404d/0000/0000 15:404d/0000/0000
16:404d/0000/0000 17:404d/0000/0000 18:404d/0000/0000 19:404d/0000/0000
20:404d/0000/0000 21:404d/0000/0000 22:404d/0000/0000 23:404d/0000/0000
24:404d/0000/0000 25:404d/0000/0000 26:404d/0000/0000 27:404d/0000/0000
28:404d/0000/0000 29:404d/0000/0000 30:404d/0000/0000 31:404d/404d/0000
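
For anyone who wants to check captured output against these two symptoms, here is a rough Python sketch. It is not a Cisco tool; the function names are made up for illustration, and it assumes the output has exactly the layout shown in the example above. It also assumes that a program counter of 0000 means the thread is idle, which is my reading of the sample and not something the bug text states.

```python
import re

# Sample text copied from the bug example above.
SAMPLE_BLOCKS = """\
MAX FREE THRESH_0 THRESH_1 THRESH_2
NP1 (ingress) 32768 112 17338 48 152
(egress) 521206 509654 0 0 0
NP2 (ingress) 32768 32768 0 0 0
(egress) 521206 521205 0 0 0
NP3 (ingress) 32768 32768 0 0 0
(egress) 521206 521206 0 0 0
"""

SAMPLE_PC = """\
THREAD:PC(NP1/NP2/NP3)
0:404d/0000/0000 1:404d/0000/0000 2:404d/5bd6/0000 3:404d/5bc2/0000
4:404d/0000/0000 5:404d/0000/0000 6:404d/0000/0000 7:404d/0000/0000
8:404d/0000/0000 9:404d/0000/0000 10:404d/0000/0000 11:404d/0000/0000
12:404d/0000/0000 13:404d/0000/0000 14:404d/0000/0000 15:404d/0000/0000
16:404d/0000/0000 17:404d/0000/0000 18:404d/0000/0000 19:404d/0000/0000
20:404d/0000/0000 21:404d/0000/0000 22:404d/0000/0000 23:404d/0000/0000
24:404d/0000/0000 25:404d/0000/0000 26:404d/0000/0000 27:404d/0000/0000
28:404d/0000/0000 29:404d/0000/0000 30:404d/0000/0000 31:404d/404d/0000
"""

# One data row: optional "NPx", then "(ingress)" or "(egress)", then 5 counters.
ROW = re.compile(
    r"^\s*(?:(NP\d+)\s+)?\((ingress|egress)\)"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$"
)
# One thread entry: "<thread>:<NP1 pc>/<NP2 pc>/<NP3 pc>" in hex.
ENTRY = re.compile(
    r"\b(\d+):([0-9a-f]{4})/([0-9a-f]{4})/([0-9a-f]{4})\b", re.IGNORECASE
)
NP_NAMES = ("NP1", "NP2", "NP3")


def parse_np_blocks(text):
    """Turn 'show np blocks' text into a list of per-direction dicts."""
    rows, current_np = [], None
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header, separator, or blank line
        if m.group(1):
            current_np = m.group(1)  # egress rows inherit the NP above them
        mx, free, t0, t1, t2 = (int(v) for v in m.groups()[2:])
        rows.append({"np": current_np, "direction": m.group(2), "max": mx,
                     "free": free, "thresh_0": t0, "thresh_1": t1,
                     "thresh_2": t2})
    return rows


def thresh0_drops(rows):
    """Return (np, direction, count) for every row with THRESH_0 > 0."""
    return [(r["np"], r["direction"], r["thresh_0"])
            for r in rows if r["thresh_0"] > 0]


def busy_threads(text):
    """Return (thread_count, {np: busy_count}); assumes pc 0000 means idle."""
    busy = {name: 0 for name in NP_NAMES}
    total = 0
    for m in ENTRY.finditer(text):
        total += 1
        for name, pc in zip(NP_NAMES, m.groups()[1:]):
            if pc != "0000":
                busy[name] += 1
    return total, busy


def saturated_nps(text):
    """Return the NPs on which every listed thread is busy."""
    total, busy = busy_threads(text)
    return [name for name in NP_NAMES if total and busy[name] == total]
```

Run against the sample, thresh0_drops(parse_np_blocks(SAMPLE_BLOCKS)) flags only the NP1 ingress row, and saturated_nps(SAMPLE_PC) reports NP1, which matches the pattern the bug describes.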

Workaround:

There are two workarounds.

1) disable the completion unit.
Ex: no sysopt np completion-unit

2) disable dns-guard (4.x and later only)
Ex: no dns-guard

Has anyone come across a similar issue to this?

Thanks a lot

Gautam
