S31 and repeated nrconns. In search of clarity.

hschupp
Level 1

In the readme for S31 it states "A postoffice daemon error causes postoffice to consume too much CPU/RAM upon repeated executions of nrconns (or nrget to obtain the DestinationConnectionStatus)."

We currently employ a healthcheck script that runs nrconns from the Director and validates not only that the sensor is 'Alive' but also that it is communicating properly. Since nrconns is being run from the Director, does the issue still affect the sensor in the manner described? Thanks in advance for your assistance!

Hank Schupp

1 Reply

marcabal
Cisco Employee

I don't know whether or not you could run into this.

The problem was seen when nrconns was run repeatedly in a loop against a sensor.

If you run into this problem, your sensor postoffice will stop responding to nrconns. If you then run top on the sensor, you will see that postoffice is consuming 90% or more of the CPU.
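If you would rather have a script look for this symptom than watch top by hand, here is a minimal Python sketch. It parses the text printed by `ps -eo pcpu,comm` and reports any process whose command name contains "postoffice" at or above a CPU threshold. The function names, the 90.0 default, and the assumption that the daemon shows up in ps under a name containing "postoffice" are mine, not taken from the readme. Also note that ps may average CPU usage differently than top does, so treat the number as a rough indicator.

```python
import subprocess


def find_busy_processes(ps_output, name="postoffice", threshold=90.0):
    """Return (cpu_percent, command) pairs for processes whose command
    contains `name` and whose CPU usage is at or above `threshold`.

    `ps_output` is expected to be the text printed by `ps -eo pcpu,comm`.
    """
    busy = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue
        cpu_text, command = parts
        try:
            cpu = float(cpu_text)
        except ValueError:
            continue  # not a data row
        if name in command and cpu >= threshold:
            busy.append((cpu, command.strip()))
    return busy


def read_ps_output():
    """Hypothetical helper: capture the current process list from ps."""
    result = subprocess.run(
        ["ps", "-eo", "pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On the sensor you would call `find_busy_processes(read_ps_output())`; an empty list means no matching process is at or above the threshold.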

In your scenario it depends on exactly how your script is implemented.

If your script is running nrconns on the director, then it is not actually querying each sensor postoffice; you are just querying the director postoffice. The director postoffice may or may not have this same issue. If it does, then when your script runs nrconns, the director postoffice will not respond, and you will have to do an nrstop followed by an nrstart on the director itself.

If your script is actually going to each individual sensor and running nrconns, then it is theoretically possible that you could see this issue, in which case the sensor postoffice will not respond and you will have to do an nrstop and nrstart on the sensor.
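The recovery step is the same on whichever host (director or sensor) has the unresponsive postoffice: nrstop first, then nrstart. A minimal Python sketch of that sequence follows. It assumes both commands are on the PATH of the account running the script and take no arguments; the function name and the injectable `run` parameter are hypothetical and only there to make the sketch easy to try without touching a real system.

```python
import subprocess


def restart_postoffice(run=subprocess.run):
    """Run nrstop, then nrstart, on the local host.

    check=True makes subprocess.run raise CalledProcessError if a command
    exits non-zero, so nrstart is not attempted after a failed nrstop.
    """
    for command in (["nrstop"], ["nrstart"]):
        run(command, check=True)
```

Whether you want a script to restart things automatically, rather than just alerting someone, is a judgment call for your site.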

If you keep getting responses from nrconns then you are not seeing this issue. If you stop getting a response then your healthcheck script is doing exactly what it should, and lets you know you have a problem and need to run nrstop and nrstart (you just won't know if the problem was caused by the nrconns in your health script or something else).
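To make the "response or no response" check concrete, here is a minimal Python sketch that runs a command with a timeout and reports whether it answered in time. It assumes that an unresponsive postoffice shows up as nrconns hanging rather than returning; the `probe` name and the 30-second default are my own choices, not anything documented for nrconns.

```python
import subprocess


def probe(command, timeout_seconds=30):
    """Run `command` and return (responded, output).

    `responded` is False if the command did not finish within
    `timeout_seconds`; the child process is killed in that case.
    """
    try:
        result = subprocess.run(
            command,
            capture_output=True, text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        return False, ""
    return True, result.stdout
```

A health check could then call `probe(["nrconns"])` (assuming nrconns is on the PATH and needs no arguments) and raise an alert when `responded` is False. Since the problem was reproduced by running nrconns repeatedly in a loop, leaving a generous interval between checks seems prudent.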

So I would recommend that you keep running your health check script. If you do start seeing problems with postoffice not responding to nrconns, then contact the TAC. They can get in contact with the developers assigned to this issue who may be able to help.

Just so you know, we have seen the issue in our test lab where we can be pretty rough on the sensors with our tests (we try to stress the sensor in ways users never would), but we have not to my knowledge heard of this issue being seen at an actual user site.
