I have two Catalyst 6513 switches (each with redundant Supervisor Engine 720s) forming the core of the network. They are connected to each other over an 802.1Q trunk, passing all VLANs by default. They run CatOS 8.3(4) and MSFC IOS 12.2(17d)SXB6.
Each switch has a NAM-2 (WS-SVC-NAM-2) installed in slot 5. NAM software version is 3.4(1a). The NAM management port is automatically synchronized to the VLAN assigned to interface sc0 on the supervisor engine, which on both switches is VLAN 2.
Both switches' MSFCs have an SVI on VLAN 2 and share an HSRP virtual address. I configured each NAM with an IP address on VLAN 2 and pointed its default gateway to the HSRP address. Everything was fine. The NAMs were reachable from anywhere in the network.
A week ago the switches were converted to native IOS version 12.2(18)SXF12. The migration was successful. It's been running fine thus far.
I added the command "analysis module 5 management-port access-vlan 2" on both switches to connect back to the NAMs' management ports. No configuration changes were made on the NAMs.
Issues then started to arise, as follows:
1. The first switch can ping both NAMs directly (i.e. sourcing from int vlan2).
2. The second switch can ping its own NAM but can't ping the NAM on the first switch, though it can resolve that NAM's ARP entry.
3. The NAM on the first switch can ping any IP address of the first switch but can't ping any other remote IP address. Thus it is no longer reachable from everywhere.
I swapped both NAMs. The second NAM (now in the first switch) experienced the same issue. I have no idea what else to troubleshoot. It is supposed to be a very simple thing.
I plan to upgrade both NAMs to version 3.5(1b). I can't find a very good upgrade guide on cisco.com. Can anyone advise me:
1. How to upgrade a NAM-2 from version 3.4(1a) to 3.5(1b). Kindly point me to a technote.
2. What will happen to the NAMs' config after the upgrade? Will it be retained?
3. What other patches do I need to apply?
Upgrading NAMs with native IOS is fairly straightforward. Essentially, you will reboot the NAM onto its maintenance partition, perform the upgrade, then reboot onto the application partition. The exact commands needed can be found at http://www.cisco.com/en/US/docs/interfaces_modules/services_modules/nam/nam_3_5/ICN/advcfg.html#wp1035524 .
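As a rough sketch of those three steps on native IOS, assuming the NAM-2 sits in slot 5 as in the original post. The FTP server, path and image filename are placeholders, the prompts are illustrative, and lines beginning with ! are annotations rather than commands. Treat the linked guide as the authority for the exact syntax on your release.

```
! Assumption: NAM-2 in slot 5; cf:1 is the NAM-2 maintenance partition
Router# hw-module module 5 reset cf:1
! Once the module is back online, open a session to it and log in
Router# session slot 5 processor 1
! From the maintenance image, pull the new application image
! (server, path and filename are placeholders)
root@localhost# upgrade ftp://<ftp-server>/<path>/<nam-application-image>
root@localhost# exit
! With no device argument the module should boot its default partition,
! which is normally the application image
Router# hw-module module 5 reset
```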
While the configs on the NAMs should remain intact, it would be a good idea to back them up first, as we have seen some bugs that erase the flash.
There are no other patches for 3.5(1b) to download other than the optional crypto patch to enable SSH support (http://www.cisco.com/cgi-bin/tablebuild.pl/nam-crypto). Though, you might consider going to 3.6(1a) instead of 3.5.
As for your connectivity problem, since it sounds like the problem follows the NAM, you may have a configuration glitch or a hardware problem with this module. Troubleshooting would begin with the NAM's config and show tech output.
Thanks for your informative reply.
Can you kindly advise me how to back up the NAM's config? I notice in the GUI there's an FTP backup, i.e. Admin -> FTP Configuration. Is this the way?
For NAM version 3.6(1a), should I apply patch 1 followed by patch 2?
My connectivity problem does not follow the NAM. It may be the first Catalyst switch that has the issue, because either NAM inserted into that box experiences the same problem.
I misread. I thought the first switch was fine. If the problem follows the switch, then getting the show tech on the switch will be important to rule out any problems in the config.
You can either back up the config to an FTP server via the GUI, or use the "config upload" command from the CLI. You can also do:
term length 0
And copy the config directly from the screen.
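For illustration, the two CLI options might look roughly like this on the NAM. The FTP URL is a placeholder, the prompt is illustrative, lines beginning with ! are annotations, and the exact "config upload" arguments should be checked against the NAM command reference for your version.

```
! Option 1: upload the NAM configuration to an FTP server (placeholder URL)
root@nam# config upload ftp://<user>:<password>@<ftp-server>/<directory>
! Option 2: print the configuration and capture it from the terminal session
root@nam# show configuration
```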
Patch 2 contains the fixes from patch 1, so you do not need to apply both.
Thanks again for the pointers.
I will first try to upgrade the NAM and see if the issue persists after that. Then I'll proceed to open a Cisco TAC case.
I have installed several NAM-2s [software version 3.5(1a)] on native IOS Cat6500 boxes before and never encountered such an issue.
Issue persists even after the NAM on the first switch was upgraded to version 3.6(1a).
I strongly suspect I'm running into the following problem:
The two Catalyst 6513 (with Sup720 in slots 7 & 8) are connected to each other via a L2 etherchannel trunk (the member ports are 7/1-2, 8/1-2).
I have another similar site which I implemented a few years back. It has two Cat6513s with Sup720-3B, FWSM 2.3(3), and NAM-2 3.5(1a). Native IOS is 12.2(18)SXD6. I experienced a similar issue with the FWSM at first. The issue was resolved after I configured the command "fabric switching-mode force bus-mode". As a result the switching mode for the FWSM and NAM-2 became bus mode. Both the FWSM and NAM-2 are working fine.
For the present site, I did not configure the command "fabric switching-mode force bus-mode" because the FWSM 3.1(8) did not have a problem. I was under the impression that upgrading to release 12.2(18)SXF7 or later would remove the requirement to run the traffic in bus mode. My version is 12.2(18)SXF12. The NAM-2 is not working now. The current switching mode for both the FWSM and NAM-2 is crossbar.
I'd like to hear your comments and advice on what to do next.
Yeah, I ruled out this bug early on since you are running fixed code. You can try disabling crossbar switching, but I think there is still a config problem on this one switch since the other one is working.
Did you get this issue resolved? It looks like I am running into the exact same problem as you are with my NAM... I can ping it from the same VLAN but cannot connect to it from other VLANs. The issue has only appeared since I upgraded from CatOS to native IOS.
The "fix" for the bug was simply to add the command to force bus mode switching for the service modules. So, once you have upgraded to a fixed version of IOS, you must still configure the command "fabric switching-mode force bus-mode" to correct the problem.
The workaround of using a default gateway external to the switch is still viable, and will not require you to force bus mode switching.
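A minimal sketch of the bus-mode option on the switch, with a check before and after. Lines beginning with ! are annotations. Changing the switching mode may be disruptive to fabric-enabled modules, so doing this in a maintenance window is a sensible assumption.

```
! Check how each module is currently switching (bus vs. crossbar)
Router# show fabric switching-mode
! Force bus-mode switching
Router# configure terminal
Router(config)# fabric switching-mode force bus-mode
Router(config)# end
! Confirm that the service modules now report bus mode
Router# show fabric switching-mode
```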
The thing is that I enabled this on our core switch running 12.2(33)SXH1 and still no dice... I can ping outwards from the NAM to the MSFC3 in the switch and to its interfaces but I can't reach other VLANs. I can browse to the NAM on the same subnet it's configured for (VLAN1), however I can't from other VLANs. It worked at one point when we were running CatOS and we have only had this problem since the IOS upgrade. I know it's not a routing issue because I can get to all other clients on that VLAN just fine. Additionally, the MAC is in the ARP table so it should be advertised.
Can you temporarily configure a default gateway on the NAM to point to an external router to rule out the fabric switching bug?
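On the NAM CLI that test might look something like the following. 192.0.2.1 is only a placeholder for the external router's address on the NAM's VLAN, the prompt is illustrative, and lines beginning with ! are annotations.

```
! Point the NAM at a gateway outside the switch (placeholder address)
root@nam# ip gateway 192.0.2.1
! Confirm the new setting
root@nam# show ip
```

Note the original gateway address first so it can be restored once the test is done.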
Setting up an external router as the gateway resolved this problem for me. I can go ahead and force bus mode and test that fix again, but I wasn't sure if it would have any adverse effects on my other service modules (CMM).
The same problem affects the CMM module (see CSCsa95653). However, it appears to be truly fixed for the CMM in CMM code. Switching to bus mode should not affect the CMM.