
CiscoWorks LMS 3.1: HUM and Web access issue...

section09
Level 1

Hi there,

I would like to seek your counsel regarding a recent incident with our CiscoWorks LMS 3.1 running on Solaris 10.

A few days ago, we received a few complaints from our users that they were unable to view the pollers or contents of the TOP-N <CPU | Memory | Interface> Utilization on the HUM’s home page (see attached file, hum_homepage.jpg).

We (tech support guys) checked the pollers’ status and they’re all Active (see attached file, pollers.jpg); we notified our administrators and requested a copy of the HUMPortal.log (see attached file, humportal.zip).

The following day, while waiting for the HUMPortal.log, we received reports that users accessing the HUM home page were stuck with the message “loading…” on all TOP-N Utilization reports.

As we were about to follow up on our request with our administrators, we received reports (a few hours ago) that our CiscoWorks LMS was inaccessible via web browser.

We alerted our administrators and they attempted to restart the Daemon Manager (/etc/init.d/dmgtd); they were able to shut it down successfully, but it took several attempts to start it up.

As they attempted to start up the Daemon Manager, they encountered the following error messages:

# /etc/init.d/dmgtd start

Error: Daemon Management could not start. Trying again

Error: Unable to bind to port, please check port (42340) state and permissions.

Error: If the port is in use, please try starting Daemon Manager once it become free.

After several minutes, they were able to start the Daemon Manager successfully, although CiscoWorks LMS was still inaccessible via web browser.
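For anyone hitting the same bind error, a rough sketch of how one might check what is still using port 42340 before retrying the start. The helper name `port_lines` is made up for illustration, and it assumes a POSIX awk (on Solaris 10 that would typically be nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk). It matches both the Solaris `host.port` and the Linux `host:port` address styles.

```shell
#!/bin/sh
# Hypothetical helper: reads `netstat -an` output on stdin and prints every
# line in which some field ends in ".<port>" or ":<port>".
# Solaris prints addresses as host.port, Linux as host:port.
port_lines() {
    awk -v p="$1" '
    {
        for (i = 1; i <= NF; i++) {
            if ($i ~ ("[.:]" p "$")) { print; next }
        }
    }'
}
```

Usage would be something like `netstat -an | port_lines 42340`. A line still showing the port right after a shutdown would be consistent with the "Unable to bind to port" message, but that is only an inference from the error text.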

We requested a copy of the “pdshow -brief” output and every daemon (as far as I know) seemed to be working fine, but it never hurts to ask for help to verify my findings (see attached file, pdshow0818a.log).

As of this writing, we are still unable to access CiscoWorks LMS via web browser; any insight or suggestion on the next step to take in troubleshooting and eventually solving this problem would be very much appreciated.

3 Replies

ngoldwat
Level 4

The port 42340 issue is most likely not related to the HUM issue you described. The latest log entry you provided indicates "Possible reasons for 'No Data' in HUM Portlets could be either Poller is not configured, or Poller is deactivated, or Poller Failure has occurred, or Summarization job did not start." The error is repeated all the way back to the beginning of the file, dated Aug 14 16:39.

To focus on the HUM issue:

Did any change occur on or around the 14th that you are aware of?

Please provide some additional debugs:

1. Please enable debug for "UPMProcess"

Log files are stored here: /var/adm/CSCOpx/log/

- HUMPortal.log
- upm_summarization.log
- jrm.log
- upm_process.log

2. To set log levels:

*Health and Utilization Monitor > Admin > System Preferences*.

- Select Log Level Settings.
- Select the application module from the drop-down list.
- Select the Debug log level from the Logging Level drop-down list.
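Once the debug logs are collected, a quick way to see how far back a given message goes is to count its occurrences and print the first one. This is only a sketch: `first_and_count` is a made-up helper name, and the search text is whatever fragment of the message you are interested in.

```shell
#!/bin/sh
# Hypothetical helper: print how many lines of a log file contain the given
# text, followed by the first such line (if there is one).
first_and_count() {
    pattern="$1"
    file="$2"
    # grep -c prints 0 (and exits non-zero) when nothing matches
    count=$(grep -c "$pattern" "$file")
    echo "matches: $count"
    grep "$pattern" "$file" | head -n 1
}
```

For example, `first_and_count 'No Data' /var/adm/CSCOpx/log/HUMPortal.log` would show the number of 'No Data' entries and the earliest one in that file.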

Thanks,

Nick

Hi Nick,

Apologies for the late reply and thanks for the advice.

I recently found out that the server room where our LMS server is located experienced a brief power failure, which left all of the servers in that area with application or operating system problems.

One of our administrators has been checking the server and suspects that some of the processes (Daemon Manager included) may have ended up as zombie processes because of the incident, which may explain why our users can't access LMS via web browser even though pdshow reports that everything is in working order.
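As a rough way to confirm or rule out the zombie theory, the process table can be filtered for entries in state Z. The sketch below assumes `ps -eo pid,ppid,s,comm` is available (the `s` column is the one-letter process state); the function name `zombies` is just for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ps -eo pid,ppid,s,comm` output on stdin,
# skips the header row, and prints only rows whose state column is Z.
zombies() {
    awk 'NR > 1 && $3 == "Z"'
}
```

Usage: `ps -eo pid,ppid,s,comm | zombies`. The PPID column of any row it prints points at the parent process that has not yet reaped the defunct child.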

I have forwarded your request to our administrators to enable debug for UPMProcess and set the HUM log levels; hopefully I will hear from them soon.

Hi Nick,

A follow-up regarding the issue with our CiscoWorks LMS 3.1.

It turns out that our server room experienced a power surge, which affected all servers including our LMS server, on the same day that our users started to notice something wrong with HUM.

According to our administrators, they need to power down our server briefly to fix the problem; no specific information was given on whether the problem is hardware or software (operating system) in nature.

Thanks again for your earlier assistance; it may take a while before I'm able to provide an update. Hopefully, the problems our users encountered were caused by what happened in our server room.

Cheers.
