DFM Discovery not working and affecting polling

Answered Question
Mar 29th, 2010

I have done everything possible to get DFM Polling to work.


1. Reinitialized the DB and restarted the server.


Alerts come in for a while, but after I clear the reachability alerts they stop and no new alerts come in, even though I know we have active device alerts.



2. Attempted to bring database to last consistent state.



1) Stop the daemon manager
        - net stop crmdmgtd
        - wait 5 minutes for all processes to stop
2) Remove/rename any dfmEpm.log, dfmFh.log, dfmInv.log (if they exist)
under the following directories:
        - C:\Program Files\CSCOpx\databases\dfmEpm
        - C:\Program Files\CSCOpx\databases\dfmFh
        - C:\Program Files\CSCOpx\databases\dfmInv
3) Run the checkpoint recovery of the dfmFh database
        - cd \progra~1\CSCOpx\databases\dfmfh
        - dbsrv10 -f dfmfh
4) Run the checkpoint recovery of the dfmEpm database
        - cd \progra~1\CSCOpx\databases\dfmEpm
        - dbsrv10 -f dfmEpm
5) Run the checkpoint recovery of the dfmInv database
        - cd \progra~1\CSCOpx\databases\dfmInv
        - dbsrv10 -f dfmInv
6) Restart the daemon manager
        - net start crmdmgtd
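
For anyone repeating these steps, the order matters (stop, move logs aside, recover each database, start). Below is a minimal Python sketch that only builds the command list so it can be reviewed before anything is run by hand. The function name, the `.bak` suffix for renamed logs, and the default install path are assumptions for illustration; the `net` and `dbsrv10 -f` commands are the ones listed above.

```python
from pathlib import PureWindowsPath

# Database names taken from the steps above, in the same order.
DFM_DATABASES = ("dfmFh", "dfmEpm", "dfmInv")


def build_recovery_plan(nmsroot=r"C:\Program Files\CSCOpx"):
    """Return the commands from the steps above, in order, without running them."""
    db_root = PureWindowsPath(nmsroot) / "databases"
    plan = ["net stop crmdmgtd"]  # step 1; allow about 5 minutes afterwards
    for db in DFM_DATABASES:
        # Step 2: move the transaction log aside (".bak" suffix is an assumption).
        log_file = db_root / db / f"{db}.log"
        plan.append(f'ren "{log_file}" {db}.log.bak')
    for db in DFM_DATABASES:
        # Steps 3-5: forced recovery, run from the database's own directory.
        plan.append(f'cd /d "{db_root / db}" && dbsrv10 -f {db}')
    plan.append("net start crmdmgtd")  # final step: restart the daemon manager
    return plan


if __name__ == "__main__":
    for command in build_recovery_plan():
        print(command)
```

This is a dry run only: it prints the commands and never executes them, so nothing on the server changes until each line is run manually.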


This works for a while, but now no alerts are coming in again. No Windows update has occurred on the server. I'm out of ideas at this point.



Correct Answer by Joe Clarke about 6 years 11 months ago

Go to DFM > Configuration > Other Configurations > SNMP Trap Receiving.  Set the port to some other free UDP port (greater than 1023).
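
Before picking a new port, it can help to confirm that nothing else on the server is already bound to it. The following is a rough Python sketch, not part of DFM or LMS; the function name `udp_port_is_free` is made up for this example. It simply tries to bind a UDP socket to the candidate port and reports whether that succeeded.

```python
import socket


def udp_port_is_free(port, host="0.0.0.0"):
    """Return True if a UDP socket can be bound to host:port right now."""
    if not 1024 <= port <= 65535:
        # The advice above is to use a port greater than 1023.
        raise ValueError("pick an unprivileged port between 1024 and 65535")
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((host, port))
    except OSError:
        # Bind failed, most likely because another process holds the port.
        return False
    finally:
        sock.close()
    return True
```

The check is only a snapshot: another process could still grab the port between the test and the moment DFM is reconfigured, so it is worth rechecking if the trap receiver fails to start.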


--


Please support CSC Helps Haiti


https://supportforums.cisco.com/docs/DOC-8895

https://supportforums.cisco.com

dionjiles Tue, 03/30/2010 - 10:04

DFM 3.2.0


I restarted the database several times....this has not solved my problem


net stop crmdmgtd


net start crmdmgtd

dionjiles Tue, 03/30/2010 - 11:36

All of my devices are in a suspended state. I cannot resume the devices for some reason.

Michel Hegeraat Tue, 03/30/2010 - 12:08

Just my two cents,


There is an RPS file somewhere under CSCOpx/objects/smarts/


It can become corrupted


Remove all devices from DFM


Try renaming the RPS file while all CW processes are stopped.


I believe it will create a new one, the old might be corrupt.


Cheers,


Michel

dionjiles Tue, 03/30/2010 - 12:46

I did that when I re-initialized the Database....


I think I fixed the problem....I will monitor it to see how long it works.




1) Stop the daemon manager
        - net stop crmdmgtd
        - wait 5 minutes for all processes to stop
2) Remove/rename any dfmEpm.log, dfmFh.log, dfmInv.log (if they exist)
under the following directories:
        - C:\Program Files\CSCOpx\databases\dfmEpm
        - C:\Program Files\CSCOpx\databases\dfmFh
        - C:\Program Files\CSCOpx\databases\dfmInv
3) Run the checkpoint recovery of the dfmFh database
        - cd \progra~1\CSCOpx\databases\dfmfh
        - dbsrv10 -f dfmfh
4) Run the checkpoint recovery of the dfmEpm database
        - cd \progra~1\CSCOpx\databases\dfmEpm
        - dbsrv10 -f dfmEpm
5) Run the checkpoint recovery of the dfmInv database
        - cd \progra~1\CSCOpx\databases\dfmInv
        - dbsrv10 -f dfmInv
6) Restart the daemon manager
        - net start crmdmgtd

dionjiles Tue, 03/30/2010 - 13:18

Well, it worked for 30 minutes or so. All devices are back in Suspended mode.

Joe Clarke Tue, 03/30/2010 - 22:04

The devices moved to a suspended mode, or questioned mode?  Where are you seeing the devices in this mode?


When you say you reinitialized the databases, are you just doing the steps you listed in this thread?  If so, that is not reinitialization.  You might actually try a true reinit (including the RPS files) as documented in https://supportforums.cisco.com/docs/DOC-8796 .



dionjiles Wed, 03/31/2010 - 09:50

Hi,


I have been performing those steps as well,


Once again this morning.....when I import the devices it imports for a little while and stops.


Devices say they are in a known state.....but when I go to DDV they are in a suspended state.

Joe Clarke Wed, 03/31/2010 - 22:28

Start again from scratch.  Reinitialize all three of the DFM databases and the two RPS files.  When LMS comes back up, just import one device from DCR into DFM.  Does the device become Known?  Are you seeing it move to a Suspended state in DDV?  If so, post the NMSROOT/objects/smarts/local/logs/DFM.log and DFM1.log as soon as you notice the problem.



dionjiles Thu, 04/01/2010 - 06:39

Yes still same results....I performed DB re-initialization and imported one device.


The device gets imported with Known status, and 15 minutes later it is in a Suspended state.

Attachment: 
dionjiles Fri, 04/02/2010 - 04:56

Where is this done? I don't recall setting LMS to send SNMP traps to a trap receiver.


-Dion




dionjiles Fri, 04/02/2010 - 05:31

The only place I know of to set up SNMP traps is under DFM > Device mgmt > snmp trap notification, and I have nothing set up there on either server.




Dion Jiles

dionjiles Mon, 04/05/2010 - 09:47

Believe it or not this seems to be working now with your recommendations.


Is this a permanent fix to apply on both servers?

Joe Clarke Mon, 04/05/2010 - 22:17

If I didn't believe it, I wouldn't have suggested it.  We actually see this a lot.  DFM is not a trap-based fault management system.  While it can receive and process certain traps, it is designed to rely on polling to determine when faults occur on the network.


I wouldn't call this a permanent fix.  I would instead recommend you start throttling traps to the DFM server.  Only send those traps which DFM can understand (see http://www.cisco.com/en/US/partner/docs/net_mgmt/ciscoworks_device_fault_manager/3.2/user/guide/TrapFwd.html ).  Make sure you're only sending traps from devices which DFM is managing.
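
The two conditions in that advice (a trap type DFM understands, from a device DFM manages) amount to a simple allow-list check. Here is a small Python sketch of that logic, as it might sit in an upstream trap forwarder. Everything in it is hypothetical: the function name, the device addresses (documentation range 192.0.2.0/24), and the OID set, which uses the standard linkDown and linkUp trap OIDs purely as placeholders. The real list of supported traps should come from the DFM trap forwarding guide linked above.

```python
import ipaddress

# Placeholder values for illustration only; replace with the devices DFM
# actually manages and the trap OIDs listed in the DFM documentation.
MANAGED_DEVICES = {
    ipaddress.ip_address("192.0.2.10"),
    ipaddress.ip_address("192.0.2.11"),
}
SUPPORTED_TRAP_OIDS = {
    "1.3.6.1.6.3.1.1.5.3",  # linkDown (standard SNMP trap OID)
    "1.3.6.1.6.3.1.1.5.4",  # linkUp (standard SNMP trap OID)
}


def should_forward(source_ip, trap_oid,
                   managed=MANAGED_DEVICES, supported=SUPPORTED_TRAP_OIDS):
    """Forward a trap only if both the sender and the trap type are allowed."""
    sender_ok = ipaddress.ip_address(source_ip) in managed
    # Accept OIDs written with or without a leading dot.
    oid_ok = trap_oid.lstrip(".") in supported
    return sender_ok and oid_ok
```

Dropping everything that fails either check keeps traps from unmanaged gear from reaching the DFM server at all, which is the throttling effect described above.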



dionjiles Tue, 04/06/2010 - 00:14

Understood. I only have Cisco devices imported now. Before, I believe I just imported everything, from Foundry load balancers to Juniper firewalls.


-Dion




dionjiles Thu, 04/01/2010 - 08:49

Just to add: I have a secondary CiscoWorks server and I tried to add one device to it as well, with the same results.


Could the DCR be corrupt?
