DFM device discovery stuck at 90%

c.hulcher
Level 1

LMS 2.6, DFM 2.0.10

Any ideas as to why the DFM discovery process would stick at 90% for all devices not in the unknown or questioned categories?

Services are all running normally and I've reinitialized the databases but still no luck.

Joe Clarke
Cisco Employee

At 90%, DFM has handed off the device to the DFMOGSServer for classification. If the devices hang there, there is a problem communicating with the DFM server. What databases did you reinitialize?

dsn=dfmInv dmprefix=INV

dsn=dfmEpm dmprefix=EPM

dsn=dfmFh dmprefix=FH

Deleted NMSROOT/objects/smarts/local/repos/icf/DFM.rps

Excerpt from TISServer.log:

14-Mar-2008|13:31:21.000|ERROR|TISServer|WorkerThread0|{TISOGSProxy}||.|getDeviceListForGroup:/CS@ciscoworks/System Defined Groups/Routers fails with reason:Group: /CS@ciscoworks/System Defined Groups/Routers does not exist

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|group:/CS@ciscoworks/System Defined Groups/Routers,members:[]

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDevice}||.|loadPartitionInfo

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|check isDevicePopulated from caller:updateDeviceStatus,updateHangingDevices

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDevice}||.|loadPartitionInfo

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|probe ogs for group membership of:/CS@ciscoworks/System Defined Groups/Switches and Hubs

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSProxy}||.|getRealDeviceListFromGroup:/CS@ciscoworks/System Defined Groups/Switches and Hubs

14-Mar-2008|13:31:21.000|ERROR|TISServer|WorkerThread0|{TISOGSProxy}||.|getDeviceListForGroup:/CS@ciscoworks/System Defined Groups/Switches and Hubs fails with reason:Group: /CS@ciscoworks/System Defined Groups/Switches and Hubs does not exist

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|group:/CS@ciscoworks/System Defined Groups/Switches and Hubs,members:[]

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|done updating hanging devices,# of completed-devices/total:0/319
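For anyone scanning a longer TISServer.log for the same symptom, here is a minimal Python sketch that pulls out the group paths reported as missing. The pipe-delimited field layout is only inferred from the excerpt above, and the function name and sample list are made up for illustration, so treat it as a starting point rather than a supported tool.

```python
import re

# Field layout inferred from the excerpt (an assumption, not a documented format):
# date|time|level|process|thread|{class}||.|message
MISSING_GROUP = re.compile(r"Group: (?P<group>/.+?) does not exist$")

# Two lines copied from the excerpt above, used only as sample input.
SAMPLE = [
    "14-Mar-2008|13:31:21.000|ERROR|TISServer|WorkerThread0|{TISOGSProxy}||.|"
    "getDeviceListForGroup:/CS@ciscoworks/System Defined Groups/Routers fails "
    "with reason:Group: /CS@ciscoworks/System Defined Groups/Routers does not exist",
    "14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDevice}||.|"
    "loadPartitionInfo",
]


def missing_groups(lines):
    """Return the distinct group paths that ERROR lines report as missing."""
    found = []
    for line in lines:
        # Split into at most 9 fields so any '|' inside the message is preserved.
        fields = line.rstrip("\n").split("|", 8)
        if len(fields) < 9 or fields[2] != "ERROR":
            continue
        match = MISSING_GROUP.search(fields[8])
        if match and match.group("group") not in found:
            found.append(match.group("group"))
    return found


if __name__ == "__main__":
    for group in missing_groups(SAMPLE):
        print(group)
```

Comparing the paths it prints against what Group Administration shows is one way to spot a root group name that differs only in case.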

There is a procedure to fix this documented in CSCsb46759. It involves direct database access, and it would be best if you opened a TAC services request, and had them walk you through it.

Is there any chance you can post or email the process? Unfortunately I am working on a customer's LMS installation and don't even know if they have a current contract for the software.

I can post the Release Notes from the bug, but direct database access is something that should only be done with TAC's assistance:

Symptom:

Group administration UI shows a different set of root group names for the various application hierarchies. For example, when you launch Common Services -> Groups -> Group admin, you will see CS@ for the CS group hierarchy, and RME@ for the RME group hierarchy. Because of this mismatch, DFM discovery of devices is stuck at 90%.

Conditions:

This happens when the hostname of the CiscoWorks server is changed from lowercase to uppercase or vice versa, followed by a daemon manager restart.

Note: This issue will not appear if the hostname is changed to a different string.

Workaround:

1. Stop Daemon manager.

2. Delete the OGS tables from all the databases. The CS, DFM, and Campus OGS tables are present in CmfDb, and the RME OGS tables are present in RmeDb. Note: The user-defined groups will be lost by deleting the tables.

The following are the commands:

In cmf.db:

drop table csogsgrouppropertiestable;

drop table csogsgroupcachetable;

drop table csogstagtable;

drop table csusergroupassociationtable;

delete from dbversion where component='cs';

drop table dfmogsgrouppropertiestable;

drop table dfmogsgroupcachetable;

drop table dfmogstagtable;

drop table dfmusergroupassociationtable;

delete from dbversion where component='dfm';

drop table campusogsgrouppropertiestable;

drop table campusogsgroupcachetable;

drop table campusogstagtable;

drop table campususergroupassociationtable;

delete from dbversion where component='campus';

In rme.db:

drop table rmeogsgrouppropertiestable;

drop table rmeogsgroupcachetable;

drop table rmeogstagtable;

drop table rmeusergroupassociationtable;

delete from dbversion where component='RME';

3. After this, open NMSROOT/etc/ogs/applications.reg in notepad and empty the contents (remove all the lines) and save the file.

4. Start Daemon manager.
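The statements in step 2 follow one pattern per application: four tables sharing a prefix, plus one dbversion row. As a rough sketch, the Python below only builds and prints the same statements so they can be checked against the list above before anything is run. It does not connect to any database, and the dictionary and function names are hypothetical, not part of the release note.

```python
# Table-name prefix -> dbversion component value, as written in the steps above
# (the RME component is uppercase there; the others are lowercase).
CMF_APPS = {"cs": "cs", "dfm": "dfm", "campus": "campus"}  # tables in cmf.db
RME_APPS = {"rme": "RME"}                                  # tables in rme.db

# The four OGS table suffixes shared by every application.
TABLE_SUFFIXES = (
    "ogsgrouppropertiestable",
    "ogsgroupcachetable",
    "ogstagtable",
    "usergroupassociationtable",
)


def ogs_cleanup_sql(apps):
    """Build the drop/delete statements for each application prefix."""
    statements = []
    for prefix, component in apps.items():
        for suffix in TABLE_SUFFIXES:
            statements.append(f"drop table {prefix}{suffix};")
        statements.append(
            f"delete from dbversion where component='{component}';"
        )
    return statements


if __name__ == "__main__":
    print("In cmf.db:")
    print("\n".join(ogs_cleanup_sql(CMF_APPS)))
    print("In rme.db:")
    print("\n".join(ogs_cleanup_sql(RME_APPS)))
```

Printing rather than executing keeps the destructive part manual, in line with the advice to involve TAC before touching the databases directly.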

Would renaming the server to something else fix the problem? If I did that and then put the name back to the original, would the problem come back?

It might, but this may introduce other problems as a hostname change is a non-trivial thing. If you want to try changing the hostname, you need to be aware of the hostname change procedures:

http://www.cisco.com/en/US/docs/net_mgmt/ciscoworks_common_services_software/3.0/user/guide/diagnos.html#wp1078582

I just looked in Group Administration, all of the hostnames within the GUI are in lowercase. Any chance the GUI is forcing it that way and the database has the different case?

That's sort of what's happening. What we typically see is that the DFMOGSServer is expecting to find the groups in lowercase, but Common Services has them in upper case (or vice versa), and this leads to DFM not being able to find the groups in question.
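To illustrate why a case-only difference makes the lookup fail, here is a small Python sketch. The group paths are compared as exact strings, so two names that look alike can still miss each other. The function name and the "stored" value are invented for the example and are not taken from any real database.

```python
def case_only_mismatches(expected, stored):
    """Pair names that are equal when case is ignored but differ as written."""
    by_folded = {name.casefold(): name for name in stored}
    pairs = []
    for name in expected:
        other = by_folded.get(name.casefold())
        if other is not None and other != name:
            pairs.append((name, other))
    return pairs


if __name__ == "__main__":
    # What one component asks for (path taken from the log excerpt earlier in the thread).
    expected = ["/CS@ciscoworks/System Defined Groups/Routers"]
    # Hypothetical stored value that differs by one capital letter.
    stored = ["/CS@Ciscoworks/System Defined Groups/Routers"]

    print(expected[0] in stored)  # exact match fails
    print(case_only_mismatches(expected, stored))
```

Any pair it reports is a group that exists under a differently cased root name, which is the situation described above.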

There is one other thing you can try before the SQL queries. Try running NMSROOT\bin\perl NMSROOT\MDC\tomcat\webapps\triveni\WEB-INF\classes\rmkilnerogstable.pl . It will clear out just the DFM OGS info, which may be enough to fix this.

I was able to find the incorrect case within the dfmogsgrouppropertiestable: the CS@ entries within that table were all off by one capital letter.

I ran the rmkilnerogstable.pl but that doesn't appear to have cleaned out the table since the incorrect server name was still in the dfmogsgrouppropertiestable after the script completed successfully.

I used the information that you provided earlier to drop the tables and clean out the dbversion information and that seems to have solved the problem.

Thank you very much for the assistance on this.
