
CiscoWorks SNMP authentication failure

wilson_1234_2
Level 3

I am getting the following error in several devices in our network.

10.1.10.25 is the CiscoWorks server. I have reset the SNMP strings several times and cannot trace where the error is coming from.

Has anyone experienced this?

May 18 14:04:15.333: %SNMP-3-AUTHFAIL: Authentication failure for SNMP req from host 10.1.10.25
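
When the same message shows up on several devices, it can help to tally which source hosts are named in the logs. The following is a minimal sketch, assuming the switch logs have been saved as plain text; the function name `authfail_sources` is made up for illustration, and the pattern is based only on the single message quoted above.

```python
import re

# Matches the AUTHFAIL message quoted above. Whitespace between words is
# allowed to include line breaks, since log viewers sometimes wrap the line.
AUTHFAIL_RE = re.compile(
    r"%SNMP-3-AUTHFAIL: Authentication failure for SNMP\s+req from\s+host\s+"
    r"(\d{1,3}(?:\.\d{1,3}){3})"
)

def authfail_sources(log_text):
    """Count AUTHFAIL messages per requesting host found in log_text."""
    counts = {}
    for match in AUTHFAIL_RE.finditer(log_text):
        host = match.group(1)
        counts[host] = counts.get(host, 0) + 1
    return counts

sample = (
    "May 18 14:04:15.333: %SNMP-3-AUTHFAIL: Authentication failure for SNMP\n"
    "req from\n"
    "host 10.1.10.25\n"
)
print(authfail_sources(sample))
```

If more than one source address turns up, each one is worth checking separately.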

31 Replies

J,

I have tried capturing on several IP addresses for the two 6509 switches and cannot capture the authentication failure trap.

I tried the addresses in ACS and several others that are configured.

I see in the logs that the failures occur while I am capturing, but the capture does not contain them. I have a ton of other packets, but not what I am looking for.

What I did notice is that in Campus Manager the name of some of the devices is totally different than in the other modules.

For example, the two 6509 switches are discovered as 6509-RTR-01 and 02,

In Campus Manager, they show up as 9509-RTR1, RTR2.

Is there any way to find out for sure which interface CiscoWorks is using for the request?

If it's CiscoWorks that is responsible for the polling, then the IP address will be the one listed in DCR for the device. If the IP address field is empty in DCR, then the IP address will be whatever the hostname (in DCR) resolves to. If the hostname field is empty in DCR, then the IP address will be whatever the display name (in DCR) resolves to.
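
The fallback order described above can be sketched as a small function. This is only an illustration of the order of precedence, not CiscoWorks code; the function name, the parameters, and the sample names and addresses are hypothetical, and the resolver is passed in so the logic can be tried without real DNS.

```python
import socket

def polling_address(ip_address, hostname, display_name,
                    resolve=socket.gethostbyname):
    """Return the address that would be polled for a DCR entry.

    Order of precedence, as described in the reply above:
    1. the IP address field, if set
    2. whatever the hostname resolves to
    3. whatever the display name resolves to
    """
    if ip_address:
        return ip_address
    if hostname:
        return resolve(hostname)
    return resolve(display_name)

# Hypothetical name table standing in for DNS.
fake_dns = {"6509-rtr-01.example.com": "10.1.20.1", "6509-RTR-01": "10.1.30.1"}

print(polling_address("10.1.10.1", "6509-rtr-01.example.com", "6509-RTR-01", fake_dns.get))
print(polling_address("", "6509-rtr-01.example.com", "6509-RTR-01", fake_dns.get))
print(polling_address("", "", "6509-RTR-01", fake_dns.get))
```

Note that the three cases can yield three different addresses for the same switch, which is one way a capture ends up watching the wrong interface.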

Of course, this will only capture the CiscoWorks traffic. The actual trap/syslog sent from the device may come from a different address. However, you should be able to determine that address from the log in which you see the AUTHFAIL message. You will need to filter on both addresses to see the whole picture.
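
As a rough sketch of "filter on both addresses", the helper below builds a capture filter in pcap-filter (tcpdump/Wireshark capture) syntax covering the CiscoWorks server and every candidate device address. The helper name and the two device addresses are hypothetical; the ports are the standard ones for SNMP (161), SNMP traps (162), and syslog (514).

```python
def capture_filter(server_ip, device_ips):
    """Build a pcap-filter expression for SNMP, trap, and syslog traffic
    between the CiscoWorks server and any of the given device addresses."""
    hosts = " or ".join(f"host {ip}" for ip in device_ips)
    return (
        f"host {server_ip} and ({hosts}) "
        "and udp and (port 161 or port 162 or port 514)"
    )

# 10.1.10.25 is the server from the original post; the device addresses
# below are placeholders for the DCR address and the log source address.
print(capture_filter("10.1.10.25", ["10.1.20.1", "10.1.30.1"]))
```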

J,

Thanks for the reply,

First off, sorry, but what is DCR?

I can see in the switch logs that the authentication failure is definitely coming from the CiscoWorks server address.

It is like one of the modules is trying to poll the device and is misconfigured.

DCR is the Device Credential Repository. Go to Common Services > Device and Credentials > Device Management. This is the master list of devices and credentials for all CiscoWorks applications. No CiscoWorks application will manage a device that is not listed here.

J,

here is what I see and maybe you could give me your thoughts on this:

The devices (the switches) are in DCR with the same ip address and hostname as in the Switches group on the ACS. I have reset the credentials here.

It looks like all devices are showing up here with the proper Display Name and IP Address that is in ACS.

When I run a "Device List" report, they show up with the correct IP Address and Display Name, but the hostname is incorrect.

When I run a "Devices not configured in ACS" report, the switches show up there, but with an IP Address that belongs to a VLAN interface on the switch and is not configured anywhere that I can see in CiscoWorks. Maybe it was discovered? I noticed the discovery settings are to use the IP address, not the loopback interface, but this is not the lowest address on the switch.

The address in the "Not Configured in ACS" report is the one CiscoWorks is using to SNMP poll the switch. The capture on that address showed nothing but successful connections, even though an authentication failure appeared in the switch log while I was capturing. The failure did not line up with any captured request, so it is coming from the CiscoWorks server, but not to that address.

One of the switches is being successfully polled for User Tracking information, the other is not.

I have gotten the second switch to have User Tracking info pulled before by changing Authenticated user in the Device Credentials, but it stopped for some reason.

First things first: your DCR is a mess. You have duplicate devices, but you can't see those duplicate devices because you're integrated with ACS, and those IP addresses aren't known to ACS. I think the best solution for you is to not use Discovery. All your devices are already known to ACS, and discovery is only confusing things. You don't need discovery to use RME, Campus Manager, etc.

I think what would be best at this point is to break the ACS integration (temporarily), then clean up all of the duplicate devices in DCR (basically remove all of those entries that are now showing up in the Devices not in ACS report). Once you have a good DCR, delete all of the scheduled Discoveries, and re-establish ACS integration (when doing this DO NOT check the box to register applications with ACS).

At this point, your DCR will be clean, and you will only be managing the devices by the correct IP/hostname with the correct credentials. If the AUTHFAIL problem persists at this point, capturing a sniffer trace should be easier given that there shouldn't be any "mystery" devices in the Devices not in ACS report.
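
For the cleanup step, a quick way to list the entries to remove is to compare the DCR device list against the addresses known to ACS. This is a minimal sketch under the assumption that both lists have been exported or typed out by hand; the data layout, field names, and sample entries are hypothetical and do not reflect any CiscoWorks export format.

```python
def devices_not_in_acs(dcr_entries, acs_addresses):
    """Return DCR entries whose IP address is unknown to ACS.

    These correspond to the candidates for removal described above.
    """
    known = set(acs_addresses)
    return [entry for entry in dcr_entries if entry["ip"] not in known]

# Hypothetical sample data: the same switch appears twice in DCR, once by
# its intended address and once by a discovered VLAN interface address.
dcr = [
    {"name": "6509-RTR-01", "ip": "10.1.10.1"},
    {"name": "6509-RTR-01", "ip": "10.1.40.1"},
    {"name": "6509-RTR-02", "ip": "10.1.10.2"},
]
acs = ["10.1.10.1", "10.1.10.2"]

for entry in devices_not_in_acs(dcr, acs):
    print(entry["name"], entry["ip"])
```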

Thanks j,

I agree it is a mess, I don't think it has ever worked properly. Just from the information I have seen on your posts, it was not set up correctly to begin with.

But, I was able to find the interface that is producing the authentication error.

I found it by going to:

RME->Normal Devices->

selecting the device, and then running a report. The report has an errors section, and clicking the errors shows the interface that is generating the error. The packet is actually a syslog packet and doesn't really tell me anything.

I wanted to start over with cleaning up the ACS devices first, then cleaning up the Ciscoworks stuff.

My questions:

How do I break the integration?

How do I remove the "Not Showing" devices?

All I see is a report; I didn't see where to remove them.

To break the integration, go to Common Services > Security > AAA Mode Setup, and set the Type back to Non-ACS. You will then need to restart dmgtd. When you do, you will see all of those devices that were showing up in the report in DCR. You will then be able to remove them.

J,

I am pretty sure I found the source of the SNMP authentication errors:

I stopped the integration and was able to delete the items as you suggested.

I also deleted the scheduled discovery jobs, but I did not apply the settings, so the devices were discovered again.

I saw that one of the devices is discovered in a subnet different from anything I have seen configured in CiscoWorks.

I see an SNMP packet going from the server to the device on that subnet, trying to get systemuptime with community name public (incorrect).

The source port is 39542 from the server. Do you know whether any of the CiscoWorks modules use this port?

The SNMP client will use a random UDP port as a source. This is most likely Discovery, though. Like I said, in your case, you should consider stopping Discovery, or at least tightening down the discovery filters.
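
To illustrate why a source port like 39542 does not identify a particular module: a UDP client normally lets the operating system pick its source port. The sketch below is generic Python socket code, not anything specific to CiscoWorks, and the function name is made up for illustration.

```python
import socket

def ephemeral_udp_port():
    """Open a UDP socket, let the OS choose the source port, and return it."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # Port 0 asks the OS to assign any free port from its ephemeral range.
        sock.bind(("127.0.0.1", 0))
        return sock.getsockname()[1]

# Each call may return a different port number.
print(ephemeral_udp_port())
print(ephemeral_udp_port())
```

The destination port (161 for SNMP) and the community string in the packet are more useful clues than the source port.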

Note: Discovery does have a default SNMP community string it will use if it's made to discover a device that is not already in DCR, or specified in the discovery SNMP Settings. This string is public by default, and can be changed in NMSROOT/campus/etc/cwsi/DeviceDiscovery.properties. But if Discovery is using this string, you should probably consider changing your discovery filters, or adding more entries to your discovery SNMP Settings.

J,

It looks like things are getting a little straightened out.

I have a clean list (0 items) when I do a "Items not in ACS".

I reorganized the ACS devices and removed everything from the DCR list, then re-entered the devices by the names that were in ACS.

I tried to do discovery but it seems that there are problems when I do that, even when making the filter more restrictive.

After doing the discovery, I used your procedure again to break the integration and removed everything.

My questions are:

1. It looks like everything that was entered into DCR manually is working fine, but in Campus Manager the two 6509 switches show the lowest IP address of the device as the Device Name instead of the name entered in DCR.

RME is using the same name as DCR.

2. Where do I set up a new job to archive configs? I can't find where to create a new job.

Thanks for all of the help.

1. Where in Campus?

2. The system config collection job can be setup under RME > Admin > Config Mgmt > Collection Settings. Ad hoc jobs can be created under RME > Config Mgmt > Sync Archive.

1. When I go to Campus Manager Administration, on the Home tab, under System Status it shows the results of the Device Discovery, Data Collection and User Tracking Acquisition.

The device discovery is showing 41 devices from the old previous discovery (it is dated).

The data collection is showing 24 devices (that I manually entered), if I click on the "24 devices" it takes me to the device list, where the 6509 Device Name is not what is in DCR.

Also on the User Tracking, the end hosts number is correct now, but it shows the 6509 switches Device Name as the lowest interface IP Address, not the name entered in DCR.

Is it possibly remnants of the old discovery? If so, how do I get rid of those 41 devices listed under "Device Discovery"?

Also I am showing 0 IP Phones on the "User Tracking Acquisition". The phones are showing up under "End Hosts", is this normal?

For the Archive management, I see a job created by someone else that looks like it has never been successful to copy configs. It is scheduled to run every night. Do I just delete this job and set the schedule (without entering a device list), to archive configs?

If you have DCR the way you like it, go ahead and reinitialize your ANI database:

NMSROOT/bin/perl NMSROOT/bin/dbRestoreOrig.pl dsn=ani dmprefix=ANI
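
A small sketch of how the command line above expands, assuming NMSROOT is a placeholder for the CiscoWorks installation directory rather than literal text. The directory used in the example is an assumption about a typical Windows install and should be replaced with the actual path on the server. The code only builds and prints the command; it does not run it.

```python
from pathlib import PureWindowsPath

def ani_reinit_command(nmsroot):
    """Expand NMSROOT in the dbRestoreOrig.pl command quoted above."""
    root = PureWindowsPath(nmsroot)
    return [
        str(root / "bin" / "perl"),
        str(root / "bin" / "dbRestoreOrig.pl"),
        "dsn=ani",
        "dmprefix=ANI",
    ]

# Assumed install directory, for illustration only.
cmd = ani_reinit_command(r"C:\Program Files\CSCOpx")
print(" ".join(f'"{part}"' if " " in part else part for part in cmd))
```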

The "41" number next to Discovery won't change, but that's purely cosmetic. At this point, the Data Collection and User Tracking data is gone. Then, perform a new Data Collection, and check the results. They should agree with what you have in DCR. Once Data Collection is complete, a new User Tracking acquisition will start automatically.

IP phones will always show up under End Hosts. A phone has to be an end host before it can be considered an IP phone. Then, if it's a Cisco IP phone, and the Call Manager to which it is registered has been properly Data Collected, then it will be added to the phone table in User Tracking.

Since the phones are only showing up as users, your CCM is most likely not properly Data Collected. A properly collected CCM shows up as green with the proper icon on the Topology Map.
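
The rule described above for where a phone appears can be sketched as follows. This only restates the logic from the reply; the function and the table labels are illustrative and are not taken from the User Tracking schema.

```python
def user_tracking_tables(is_end_host, is_cisco_ip_phone, ccm_data_collected):
    """Return the User Tracking tables a device would appear in."""
    if not is_end_host:
        return []
    tables = ["End Hosts"]
    # A phone is only promoted to the phone table when it is a Cisco IP
    # phone and its Call Manager has been properly Data Collected.
    if is_cisco_ip_phone and ccm_data_collected:
        tables.append("IP Phones")
    return tables

print(user_tracking_tables(True, True, False))  # the situation in this thread
print(user_tracking_tables(True, True, True))
```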

For your Config Archive question, just reschedule the periodic collection job using RME > Admin > Config Mgmt > Collection Settings. But in order to avoid a failure, verify all the credentials are correct in DCR for your devices. One failing device will mark the job as failed.
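
The "one failing device fails the job" behaviour can be pictured with a short sketch. The status strings and the per-device results below are hypothetical; the point is only that the job status is an all-or-nothing summary, so the list of failing devices is what to look at.

```python
def job_status(device_results):
    """Summarise a collection job from per-device success flags.

    device_results maps a device name to True (collected) or False (failed).
    Returns the overall status and the sorted list of failing devices.
    """
    failed = sorted(name for name, ok in device_results.items() if not ok)
    status = "Failed" if failed else "Successful"
    return status, failed

# Hypothetical results: a single bad credential marks the whole job failed.
print(job_status({"6509-RTR-01": True, "6509-RTR-02": False, "SW-03": True}))
```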

J,

It looks like the Data Collection is correct, although I am not 100% sure of every port.

What exactly will the re-initializing of the ANI database do?

What are the chances of corruption when doing this?

And is NMSROOT literally part of the command, or does it stand for a path on the C:\ drive of the CiscoWorks server?

Also, what about defragmenting the hard drives? Is this OK to do with the Cisco databases?