HUM - Polling the devices takes a long time

Unanswered Question
Aug 29th, 2009

Hello,


I have approximately 20 devices in HUM. I have selected all of the devices under Poller Management with 7 instances. Polling takes nearly half an hour. How can this be improved? And how can I verify whether the last poll was successful?


Is there a way to poll the devices in the background? When I click Next after making a selection under Poller Management, it starts polling in the foreground.


Thanks

Joe Clarke Sat, 08/29/2009 - 13:01
User Badges:
  • Cisco Employee,
  • Hall of Fame,

    Founding Member

I'm not sure I fully understand. All of the pollers run in the background automatically. The result of the last polling cycle is visible under HUM > Poller and Template Management > Poller Management (if there were any errors, they will be visible in the Status column).


Can you point out exactly where you're seeing the 30 minute delay? A screenshot would be helpful.

tech_trac Sat, 08/29/2009 - 13:53

I have attached the screenshot.


If I go to Poller -> Poller management -> Select an already defined Poller (with 20+ devices and 7 templates) -> Edit -> Click Next or Finish, it starts polling and shows the attached screen.


Secondly, was the previous run successful if there are no errors in the Status column? Should all errors be looked into, such as "instance not available"?




Joe Clarke Sat, 08/29/2009 - 14:01

What templates did you select? What type of devices are in this poller? How heavily loaded is the server? HUM will attempt to walk the various tables (as defined by the templates) to obtain the selected (or all) instances. If the devices are slow to respond to SNMP queries for the tables in question, or the server is heavily loaded, this can take a long time.


Yes, if no errors are seen, then all instances were successfully polled for the previous polling cycle. "No such instance" errors can point to an SNMP problem on the device, but it could also be that high-capacity counters weren't available for a specific interface, so HUM fell back to low-capacity counters. This fallback only applies to interface utilization.

tech_trac Sat, 08/29/2009 - 17:19

I have selected CPU, Memory, Interface Utilization, Temperature, Device Availability, Interface Availability, Interface Error templates.


There are 27 devices including ASA, ACE Modules, Cat65K, 2800 Routers, GSS, FWSM.


The server performance seems to be the cause. CPU is constantly at 1%, but the 'PF Usage' is 5.44 GB. The physical memory on the server is 4 GB, of which 500 MB is available while the process is running. How can I reduce the PF Usage?



Joe Clarke Sat, 08/29/2009 - 20:01

PF usage is probably not the issue. The issue most likely stems from the time taken to poll the devices. If you start a sniffer trace filtering on SNMP traffic to all of the devices in the poller, then trigger the problem, the resulting capture file should shed some light on the delay.

tech_trac Sat, 08/29/2009 - 22:30

Ok.


In the ASA logs, I see the following in large numbers:


ASA-3-212005: incoming SNMP request (563 bytes) from IP address HUMIP Port 1745 Internet "management" exceeds data buffer size, discarding this SNMP request.


What does this mean?
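
To gauge how often the ASA is discarding requests, the log lines can be tallied with a short script. This is only a minimal Python sketch, assuming the syslog text has been saved locally and that every line follows the format of the message quoted above. The function name `count_discarded_requests` is hypothetical, and the second sample line (port 1746) is made up for illustration.

```python
import re
from collections import Counter

# Pattern modeled on the single message quoted in the thread; other ASA
# releases may format this line differently (assumption).
DISCARD_RE = re.compile(
    r"ASA-3-212005: incoming SNMP request \((?P<size>\d+) bytes\) "
    r"from IP address (?P<src>\S+) Port (?P<port>\d+)"
)

def count_discarded_requests(lines):
    """Count discarded SNMP requests per (source, request size)."""
    counts = Counter()
    for line in lines:
        match = DISCARD_RE.search(line)
        if match:
            counts[(match.group("src"), int(match.group("size")))] += 1
    return counts

# First line is the message from the thread; the second is a hypothetical repeat.
sample = [
    'ASA-3-212005: incoming SNMP request (563 bytes) from IP address HUMIP '
    'Port 1745 Internet "management" exceeds data buffer size, discarding '
    'this SNMP request.',
    'ASA-3-212005: incoming SNMP request (563 bytes) from IP address HUMIP '
    'Port 1746 Internet "management" exceeds data buffer size, discarding '
    'this SNMP request.',
]

for (src, size), n in count_discarded_requests(sample).items():
    print(f"{src}: {n} request(s) of {size} bytes discarded")
```

Grouping by request size makes it easier to see whether the discards shrink or disappear after the poller configuration is changed.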

Joe Clarke Sat, 08/29/2009 - 22:33

The SNMP request is too large for the device to process. This has to do with the number of templates in your poller. Try breaking the templates out into separate pollers (e.g. one poller to handle memory, CPU, and environment, one to handle interface utilization, one to handle errors, and one for availability).
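
The suggested split can be written down and sanity-checked before the pollers are recreated in HUM. The following is a rough Python sketch, not a HUM feature: the template names are the seven listed earlier in the thread, while the poller names and the `check_grouping` helper are hypothetical.

```python
# Template names as listed earlier in the thread.
TEMPLATES = [
    "CPU", "Memory", "Interface Utilization", "Temperature",
    "Device Availability", "Interface Availability", "Interface Error",
]

# Hypothetical poller names; the grouping follows the suggestion above.
POLLERS = {
    "system-health": ["CPU", "Memory", "Temperature"],
    "interface-utilization": ["Interface Utilization"],
    "interface-errors": ["Interface Error"],
    "availability": ["Device Availability", "Interface Availability"],
}

def check_grouping(templates, pollers):
    """Return (unassigned, duplicated) template names for a proposed grouping."""
    assigned = [name for group in pollers.values() for name in group]
    unassigned = [t for t in templates if t not in assigned]
    duplicated = sorted({t for t in assigned if assigned.count(t) > 1})
    return unassigned, duplicated

unassigned, duplicated = check_grouping(TEMPLATES, POLLERS)
print("unassigned:", unassigned)
print("duplicated:", duplicated)
```

Both lists should come back empty when every template lands in exactly one poller.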

tech_trac Sat, 08/29/2009 - 22:45

I just added a single ASA device with CPU Utilization only.


The response came back quickly with:


"No instances are found in the devices. This could be because the device is unreachable, or the device does not have the instances for the selected templates or the time out value is low."


Is the default CPU utilization template supported by ASA?


Thanks.

Joe Clarke Sat, 08/29/2009 - 22:57

No, it might not. The object used is cpmCPUTotal5minRev, which is a newer object in the CISCO-PROCESS-MIB (and not supported in most versions of ASA code). The old, deprecated object, cpmCPUTotal5min, IS supported by all versions of ASA code. You will need to create a new template for this older object, and apply that to your ASA devices.

tech_trac Sat, 08/29/2009 - 23:02

Is the same applicable to FWSM? I am running version 3.2(2).

tech_trac Sat, 08/29/2009 - 23:12

I tried cpmCPUTotal5min with only the ASA device, and it still gives the same response. The ASA code is 8.0(4).

tech_trac Sun, 08/30/2009 - 01:45

When I run snmpwalk against the ASA device on .1.3.6.1.4.1.9.9.109.1.1.1, it returns:


CISCO-PROCESS-MIB::cpmCPUTotalPhysicalIndex.1 = INTEGER: 1

CISCO-PROCESS-MIB::cpmCPUTotal5sec.1 = Gauge32: 5

CISCO-PROCESS-MIB::cpmCPUTotal1min.1 = Gauge32: 5

CISCO-PROCESS-MIB::cpmCPUTotal5min.1 = Gauge32: 4


Yet when I create the template with the same MIB object, it fails, with the poller saying 'No instance available'.


How could this be?
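
When comparing what a device returns against what a template expects, it can help to list each object together with its instance indices. The following is a minimal Python sketch, assuming snmpwalk output in the `MIB::object.index = TYPE: value` form shown above; the helper name `instances_by_object` is hypothetical.

```python
import re

# Matches lines such as "CISCO-PROCESS-MIB::cpmCPUTotal5min.1 = Gauge32: 4".
WALK_LINE = re.compile(
    r"^(?P<mib>[\w-]+)::(?P<obj>\w+)\.(?P<index>[\d.]+) = "
    r"(?P<type>\w+): (?P<value>.*)$"
)

def instances_by_object(lines):
    """Map each object name to a dict of {instance index: value}."""
    result = {}
    for line in lines:
        match = WALK_LINE.match(line.strip())
        if match:
            result.setdefault(match.group("obj"), {})[match.group("index")] = (
                match.group("value")
            )
    return result

walk_output = [
    "CISCO-PROCESS-MIB::cpmCPUTotalPhysicalIndex.1 = INTEGER: 1",
    "CISCO-PROCESS-MIB::cpmCPUTotal5sec.1 = Gauge32: 5",
    "CISCO-PROCESS-MIB::cpmCPUTotal1min.1 = Gauge32: 5",
    "CISCO-PROCESS-MIB::cpmCPUTotal5min.1 = Gauge32: 4",
]

for obj, instances in instances_by_object(walk_output).items():
    print(obj, instances)
```

An object that is missing from the result, or present with an unexpected index, is a reasonable first thing to check when a poller reports no instances.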


Joe Clarke Sun, 08/30/2009 - 09:54

This is a bug in HUM. HUM is expecting the value of cpmCPUTotalPhysicalIndex to be 0. When it's not, HUM ignores the instance. Unfortunately, there is no workaround at this time.

Joe Clarke Mon, 08/31/2009 - 13:48

The problem was actually slightly different, but I was able to track it down, and I filed CSCtb68766 to track the issue. I created a patch which fixes the problem. You can get the patch by contacting the TAC.

Joe Clarke Sun, 08/30/2009 - 09:01

Probably, yes.

tech_trac Sun, 08/30/2009 - 23:10

I am getting two errors in large numbers on the pollers:


1. No such instance available

2. Request timed-out. Device may be down.


With regard to 1, I believe it is because the MIB object may not be supported by the device. In this case, is it better to just ignore these errors? Creating a poller for every device and then checking that only supported MIB objects are selected is quite a tedious process.


What could be the reason for #2? An example from the ACE module is below:


ifOutErrors bvi2 Transient 2 Request Timed-Out. Device may be down. Mon, Aug 31 2009, 11:01:09 GST

ifOutDiscards bvi2 Transient 2 Request Timed-Out. Device may be down. Mon, Aug 31 2009, 11:01:09 GST

ifInDiscards bvi2 Transient 2 Request Timed-Out. Device may be down. Mon, Aug 31 2009, 11:01:09 GST

ifInErrors bvi2 Transient 2 Request Timed-Out. Device may be down. Mon, Aug 31 2009, 11:01:09 GST

ifOutErrors vlan310 Transient 2 Request Timed-Out. Device may be down. Mon, Aug 31 2009, 11:01:09 GST

ifOutDiscards vlan310 Transient 2 Request Timed-Out. Device may be down. Mon, Aug 31 2009, 11:01:09 GST

ifInDiscards vlan310 Transient 2 Request Timed-Out. Device may be down. Mon, Aug 31 2009, 11:01:09 GST



Thanks.

Joe Clarke Mon, 08/31/2009 - 10:22

You may be correct. In that case, those errors can be ignored. They will not cause problems, but will result in needless polling of the affected devices.


The ACE module may be overwhelmed by the SNMP polling, and thus timing out for certain instances. Increasing the HUM SNMP timeout may help.
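
To see whether the timeouts cluster on particular interfaces, the Status rows can be tallied. This is a small Python sketch under a stated assumption: each row is whitespace-separated, with the polled object first and the interface name second, as in the rows pasted above. The function name `timeouts_per_interface` is hypothetical.

```python
from collections import Counter

def timeouts_per_interface(rows):
    """Count 'Request Timed-Out' rows per interface (second field of each row)."""
    counts = Counter()
    for row in rows:
        fields = row.split()
        if len(fields) >= 2 and "Request Timed-Out" in row:
            counts[fields[1]] += 1
    return counts

# A subset of the rows pasted earlier in the thread.
rows = [
    "ifOutErrors bvi2 Transient 2 Request Timed-Out. Device may be down. "
    "Mon, Aug 31 2009, 11:01:09 GST",
    "ifInErrors bvi2 Transient 2 Request Timed-Out. Device may be down. "
    "Mon, Aug 31 2009, 11:01:09 GST",
    "ifOutErrors vlan310 Transient 2 Request Timed-Out. Device may be down. "
    "Mon, Aug 31 2009, 11:01:09 GST",
]

for interface, n in timeouts_per_interface(rows).most_common():
    print(f"{interface}: {n} timed-out object(s)")
```

If the counts stay spread across all interfaces after raising the timeout, the device as a whole is more likely the bottleneck than any single interface.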
