DFM device discovery stuck at 90%

Mar 14th, 2008

LMS 2.6, DFM 2.0.10


Any ideas as to why the DFM discovery process would stick at 90% for all devices not in the unknown or questioned categories?


Services are all running normally and I've reinitialized the databases but still no luck.


Joe Clarke Fri, 03/14/2008 - 09:58
User Badges: Cisco Employee, Hall of Fame, Founding Member

At 90%, DFM has handed off the device to the DFMOGSServer for classification. If the devices hang there, there is a problem communicating with the DFM server. What databases did you reinitialize?

c.hulcher Fri, 03/14/2008 - 11:37

dsn=dfmInv dmprefix=INV

dsn=dfmEpm dmprefix=EPM

dsn=dfmFh dmprefix=FH


Deleted NMSROOT/objects/smarts/local/repos/icf/DFM.rps


Excerpt from TISServer.log:


14-Mar-2008|13:31:21.000|ERROR|TISServer|WorkerThread0|{TISOGSProxy}||.|getDeviceListForGroup:/[email protected]/System Defined Groups/Routers fails with reason:Group: /[email protected]/System Defined Groups/Routers does not exist

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|group:/[email protected]/System Defined Groups/Routers,members:[]

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDevice}||.|loadPartitionInfo

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|check isDevicePopulated from caller:updateDeviceStatus,updateHangingDevices

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDevice}||.|loadPartitionInfo

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|probe ogs for group membership of:/[email protected]/System Defined Groups/Switches and Hubs

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSProxy}||.|getRealDeviceListFromGroup:/[email protected]/System Defined Groups/Switches and Hubs

14-Mar-2008|13:31:21.000|ERROR|TISServer|WorkerThread0|{TISOGSProxy}||.|getDeviceListForGroup:/[email protected]/System Defined Groups/Switches and Hubs fails with reason:Group: /[email protected]/System Defined Groups/Switches and Hubs does not exist

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|group:/[email protected]/System Defined Groups/Switches and Hubs,members:[]

14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDeviceUtil}||.|done updating hanging devices,# of completed-devices/total:0/319
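The ERROR lines in an excerpt like this can be summarized with a short script. Below is a minimal Python sketch, assuming the pipe-delimited layout seen above (date, time, level, server, thread, class, two short fields, then the message). The function name `missing_groups` and the hostname `myserver` in the sample are hypothetical placeholders, not values from the actual log.

```python
from collections import Counter


def missing_groups(log_lines):
    """Count ERROR entries that report an OGS group as nonexistent.

    Assumes the field layout inferred from the TISServer.log excerpt:
    date|time|level|server|thread|{class}||.|message
    """
    marker = "fails with reason:Group: "
    suffix = " does not exist"
    counts = Counter()
    for line in log_lines:
        fields = line.strip().split("|")
        if len(fields) < 9 or fields[2] != "ERROR":
            continue
        # Rejoin in case the message itself contains a pipe character.
        message = "|".join(fields[8:])
        if marker in message and message.endswith(suffix):
            group = message.split(marker, 1)[1][: -len(suffix)]
            counts[group] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; "myserver" stands in for the real hostname.
    sample = [
        "14-Mar-2008|13:31:21.000|ERROR|TISServer|WorkerThread0|{TISOGSProxy}||.|"
        "getDeviceListForGroup:/CS@myserver/System Defined Groups/Routers "
        "fails with reason:Group: /CS@myserver/System Defined Groups/Routers "
        "does not exist",
        "14-Mar-2008|13:31:21.000|DEBUG|TISServer|WorkerThread0|{TISOGSDevice}||.|"
        "loadPartitionInfo",
    ]
    for group, count in missing_groups(sample).items():
        print(f"{count} x {group}")
```

If every system-defined group shows up in the result, that points at the root group name itself rather than at individual devices.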

Joe Clarke Fri, 03/14/2008 - 11:43

There is a procedure to fix this documented in CSCsb46759. It involves direct database access, so it would be best to open a TAC service request and have them walk you through it.

c.hulcher Fri, 03/14/2008 - 11:45

Is there any chance you can post or email the process? Unfortunately, I am working on a customer's LMS installation and don't even know if they have a current contract for the software.

Correct Answer
Joe Clarke Fri, 03/14/2008 - 11:53

I can post the Release Notes from the bug, but direct database access is something that should only be done with TAC's assistance:


Symptom:


Group administration UI shows a different set of root group names for the various application hierarchies. For example, when you launch Common Services -> Groups -> Group admin, you will see CS@ for the CS group hierarchy, and RME@ for the RME group hierarchy. Because of this mismatch, DFM discovery of devices is stuck at 90%.


Conditions:


This happens when the hostname of the CiscoWorks server is changed from lowercase to uppercase or vice versa, followed by a daemon manager restart.


Note: This issue will not appear if the hostname is changed to a different string.


Workaround:


1. Stop Daemon manager.

2. Delete the OGS tables from all the databases. The CS, DFM, and Campus OGS tables are in CmfDb, and the RME OGS tables are in RmeDb. Note: The user-defined groups will be lost when the tables are deleted.


The following are the commands:

In cmf.db:

drop table csogsgrouppropertiestable;

drop table csogsgroupcachetable;

drop table csogstagtable;

drop table csusergroupassociationtable;

delete from dbversion where component='cs';


drop table dfmogsgrouppropertiestable;

drop table dfmogsgroupcachetable;

drop table dfmogstagtable;

drop table dfmusergroupassociationtable;

delete from dbversion where component='dfm';


drop table campusogsgrouppropertiestable;

drop table campusogsgroupcachetable;

drop table campusogstagtable;

drop table campususergroupassociationtable;

delete from dbversion where component='campus';


In rme.db:

drop table rmeogsgrouppropertiestable;

drop table rmeogsgroupcachetable;

drop table rmeogstagtable;

drop table rmeusergroupassociationtable;

delete from dbversion where component='RME';


3. After this, open NMSROOT/etc/ogs/applications.reg in notepad and empty the contents (remove all the lines) and save the file.

4. Start Daemon manager.
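The commands in step 2 follow one naming pattern per component: four OGS tables plus one dbversion row. As a rough sketch, assuming only the table and component names listed in the post, the Python below rebuilds the statement list so it can be reviewed before anyone touches a database. It only prints text and does not connect to anything. The names `OGS_TABLE_SUFFIXES`, `COMPONENTS`, and `workaround_statements` are made up for this illustration.

```python
# The four per-component OGS table name suffixes, as listed in the workaround.
OGS_TABLE_SUFFIXES = (
    "ogsgrouppropertiestable",
    "ogsgroupcachetable",
    "ogstagtable",
    "usergroupassociationtable",
)

# Database file -> (table prefix, dbversion component value).
# Note that the post uses uppercase 'RME' for the dbversion component.
COMPONENTS = {
    "cmf.db": [("cs", "cs"), ("dfm", "dfm"), ("campus", "campus")],
    "rme.db": [("rme", "RME")],
}


def workaround_statements():
    """Return {database: [sql statement, ...]} without executing anything."""
    plan = {}
    for database, components in COMPONENTS.items():
        statements = []
        for prefix, component in components:
            for suffix in OGS_TABLE_SUFFIXES:
                statements.append(f"drop table {prefix}{suffix};")
            statements.append(
                f"delete from dbversion where component='{component}';"
            )
        plan[database] = statements
    return plan


if __name__ == "__main__":
    for database, statements in workaround_statements().items():
        print(f"In {database}:")
        for statement in statements:
            print(statement)
```

Generating the list this way makes it easy to check that no table was skipped for a component. Running the statements is still something to do only with the daemon manager stopped and, as noted above, preferably with TAC's assistance.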

c.hulcher Fri, 03/14/2008 - 13:16

Would renaming the server to something else fix the problem? If I did that and then put the name back to the original, would the problem come back?

c.hulcher Fri, 03/14/2008 - 13:25

I just looked in Group Administration, and all of the hostnames within the GUI are in lowercase. Any chance the GUI is forcing it that way and the database has a different case?

Joe Clarke Fri, 03/14/2008 - 20:12

That's sort of what's happening. What we typically see is that the DFMOGSServer is expecting to find the groups in lowercase, but Common Services has them in uppercase (or vice versa), and this leads to DFM not being able to find the groups in question.
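To make the mismatch concrete, here is a minimal Python sketch of the comparison involved, assuming group lookups are exact, case-sensitive string matches (which is what the behavior described here suggests, though the actual DFMOGSServer code is not shown in this thread). The function name and the hostnames are hypothetical.

```python
def find_case_mismatches(requested, stored):
    """Map each requested root group name to a stored name that matches
    only after case folding, i.e. one that an exact lookup would miss."""
    by_folded = {name.casefold(): name for name in stored}
    mismatches = {}
    for name in requested:
        candidate = by_folded.get(name.casefold())
        if candidate is not None and candidate != name:
            mismatches[name] = candidate
    return mismatches


if __name__ == "__main__":
    # Hypothetical names: one side kept the old hostname case.
    requested = ["CS@lmsserver", "RME@lmsserver"]
    stored = ["CS@LMSServer", "RME@lmsserver"]
    for wanted, found in find_case_mismatches(requested, stored).items():
        print(f"lookup for {wanted!r} fails, but {found!r} exists")
```

A name that appears in the result exists in the store under a different case, which is the situation where discovery would stall.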


There is one other thing you can try before the SQL queries. Try running NMSROOT\bin\perl NMSROOT\MDC\tomcat\webapps\triveni\WEB-INF\classes\rmkilnerogstable.pl. It will clear out just the DFM OGS info, which may be enough to fix this.

c.hulcher Fri, 03/14/2008 - 20:56

I was able to find the incorrect case within the dfmogsgrouppropertiestable: the CS@ entries within that table were all off by one capital letter.


I ran the rmkilnerogstable.pl but that doesn't appear to have cleaned out the table since the incorrect server name was still in the dfmogsgrouppropertiestable after the script completed successfully.


I used the information that you provided earlier to drop the tables and clean out the dbversion information and that seems to have solved the problem.


Thank you very much for the assistance on this.
