High CPU utilization on CSS 11501 version sg0750303

Mar 5th, 2007

Hi everyone,

I have a problem with high CPU utilization on CSS 11501 version sg0750303.

Our customer has been using one pair of CSS 11501s (active-standby).

For convenience, I will call this pair "Old CSS" in this post.

However, traffic through the Old CSS had been increasing, so the customer decided to add one more pair (active-standby) of CSSs to split the traffic.

Yesterday we installed two new CSS 11501s running version sg0750303 (active-standby).

For convenience, I will call this pair "New CSS" in this post.

Today, both the active and the standby CSS 11501 that were installed yesterday (the New CSSs) show high CPU utilization.

Active CSS 11501:

Peak CPU utilization: about 85%

Average CPU Utilization: about 60%

Standby CSS 11501:

Peak CPU utilization: about 40%

Average CPU Utilization: unknown

I do not understand why CPU utilization on both New CSSs is high.

Less traffic passes through the New CSS than through the Old CSS, because the traffic is split between the Old CSS and the New CSS.

The New CSS also has fewer configuration items (services, content rules, access lists) than the Old CSS, because the real servers are also split between the Old CSS and the New CSS.

Before the New CSSs were installed yesterday, the Old CSS showed an average CPU utilization of about 20%, even though all traffic passed through the Old CSS alone.

Although CPU utilization on the New CSS remains high, end users do not notice any performance issue (e.g., delays or communication failures), and traffic passes through the New CSS normally.

So my first question is: does CPU utilization on a CSS 11501 running sg0750303 stay high under normal conditions?

The customer also uses MRTG to poll the Old CSSs and New CSSs via SNMP.

So my second question is: does CPU utilization on a CSS 11501 running sg0750303 become high when it receives SNMP polling?

If this situation is abnormal, we need to start an investigation.

Would you please let me know how we should investigate it?

I found DDTS CSCek57080, "Performance issue using arrowpoint-cookie with ASR".

The release note for this DDTS says:

----------

A customer was using a CSS pair configuration where arrowpoint-cookie is being used along with a redundant-index on many content rules. When the flow rate increased to a few hundred flows/sec, the peer message queue of the CSS receiving ASR related message began to fill up. When the peer message queue became over subscribed, the CPU increased and the CSS became unstable.

----------

The New CSSs have redundant-index configured on two content rules, but end users do not notice any performance issue (e.g., delays or communication failures), and traffic passes through the New CSS normally.

So I think this DDTS is not related to this case.

Your information would be greatly appreciated.

Best regards,

Gilles Dufour Tue, 03/06/2007 - 00:35

We'll need to capture some data.

Do the following:

show system-resources

llama

symbol-table load SPRITZ

shell 1 1 spy

shell 1 1 spyReport

shell 1 1 spyReport

shell 1 1 spyStop

symbol-table unload SPRITZ

cpu hog 1 1

Send us the result here.

Thanks,

Gilles.

snakayama Tue, 03/06/2007 - 18:12

Gilles,

Thank you very much for your answer and instructions.

We set up a simple lab environment with the same version to check whether this symptom (constantly high CPU utilization) can be reproduced.

However, the symptom could not be reproduced in our lab.

The CSS 11501 in our lab peaks at about 5%.

So we need to gather the information in the customer environment, using the commands you gave us.

However, the CSS 11501s that show the symptom at the customer site are running on a live network.

I am concerned that the customer will ask whether the commands affect the operation of the CSSs, for example whether running them adds further CPU load.

Would you please let us know whether the commands you gave us affect the operation of the CSSs?

Of course, if the customer allows us to gather information on site, we could run the commands at midnight, when user traffic is lower and the impact is smaller.

Best regards,

Gilles Dufour Wed, 03/07/2007 - 03:13

As with any command, it requires some CPU to run.

But the impact should be low.

Gilles.

snakayama Wed, 03/07/2007 - 17:48

Gilles,

Thank you very much for your reply.

We are going to capture the data at the customer site after obtaining the customer's permission.

Once we have the captured data, I will upload it here.

Best regards,

snakayama Thu, 03/08/2007 - 01:57

Gilles,

Thank you very much for your cooperation.

I have obtained the capture you asked for.

The following is additional information from our customer.

When user traffic passes through the active CSS, the active CSS shows:

CPU utilization always in the range of 30% - 40%

Peak CPU utilization about 60% - 80%

When no user traffic passes through the active CSS, the active CSS shows:

CPU utilization always in the range of 0% - 5%

The attached files are named "Active CSS.log" and "Standby CSS.log".

"Active CSS.log" was captured on the active CSS and "Standby CSS.log" was captured on the standby CSS.

Looking at the output of the "shell 1 1 spyReport" command, I found that the following processes are using CPU.

On active CSS,

tFlowMgrPktR 8ba24070 50 26% ( 1469) 20% ( 26)

On standby CSS,

fmPeerMsgTas 8a511510 50 16% ( 176) 10% ( 7)

Your comment would be greatly appreciated.

Best regards,
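
For readers working through a similar capture, the spyReport lines quoted in the post above can be ranked mechanically. The following is only a minimal Python sketch. It assumes the columns follow the usual VxWorks spy layout (task name, a hexadecimal identifier, priority, cumulative percent and ticks, then percent and ticks since the previous report); that column meaning is an assumption, not something confirmed in this thread. The names SpyRow and parse_spy_report are hypothetical.

```python
import re
from typing import NamedTuple


class SpyRow(NamedTuple):
    task: str
    ident: str        # hex field; assumed to be a task identifier
    priority: int
    total_pct: int    # assumed: percent since spy was started
    total_ticks: int
    delta_pct: int    # assumed: percent since the previous report
    delta_ticks: int


# Matches lines shaped like: "tFlowMgrPktR 8ba24070 50 26% ( 1469) 20% ( 26)"
ROW = re.compile(
    r"^\s*(\S+)\s+([0-9a-fA-F]+)\s+(\d+)\s+"
    r"(\d+)%\s*\(\s*(\d+)\)\s+(\d+)%\s*\(\s*(\d+)\)\s*$"
)


def parse_spy_report(text):
    """Return the task rows found in text, busiest (by delta_pct) first."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # silently skip headers and anything not shaped like a row
            task, ident, *numbers = match.groups()
            rows.append(SpyRow(task, ident, *map(int, numbers)))
    return sorted(rows, key=lambda row: row.delta_pct, reverse=True)


if __name__ == "__main__":
    # Line quoted in this thread from the active CSS.
    sample = "tFlowMgrPktR 8ba24070 50 26% ( 1469) 20% ( 26)"
    for row in parse_spy_report(sample):
        print(row.task, row.total_pct, row.delta_pct)
```

Running it over the whole attached log would list every task that matches this row shape, which makes it easier to see which one dominates on each CSS.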

yoshitaka.kato Mon, 03/19/2007 - 03:05

Hi Gilles.

I am interested in this problem.

I have encountered a similar symptom, too.

What are the tFlowMgrPktR and fmPeerMsgTas processes?

Could this be caused by replication?

Best regards,

Gilles Dufour Mon, 03/19/2007 - 03:47

tFlowMgrPktR handles traffic.

fmPeerMsgTas handles communication with the other CSS.

Gilles.

Gilles Dufour Mon, 03/19/2007 - 03:45

Do you have connection replication configured?

Redundant-index and ASR?

It seems like the standby is busy communicating with the active.

Gilles.

yoshitaka.kato Mon, 03/19/2007 - 04:17

Hi Gilles.

Thank you for the reply.

My CSS has redundant-index and ASR configured.

Is there any way to solve this problem other than removing the redundant-index and ASR configuration?

Please advise.

Best regards,

Gilles Dufour Mon, 03/19/2007 - 05:34

But you do not have a problem.

High CPU on the standby due to ASR is not an issue. This is normal.

The standby, even if not receiving traffic, has to do the same job as the active to be ready in case of a failure of the active CSS.

So, what you see is normal. No action is required.

Gilles.

snakayama Mon, 03/19/2007 - 18:48

Hi Gilles,

Thank you very much for your reply.

I understand what you said, that is:

On the active CSS, CPU is consumed by tFlowMgrPktR, which handles traffic.

On the standby CSS, this is normal behavior and not a problem.

Can I take it that the active CSS is also working normally?

I think the "traffic" in "traffic handling" means not only the total volume of incoming traffic into the CSS but also the number of connections, packets per second, and so on.

I would like to know whether there is any known DDTS related to this behavior (CPU remaining medium to high) on the active CSS.

Do you know of one?

Should I open an SR with the TAC to confirm it?

Your information would be appreciated.

Best regards,

Correct Answer
Gilles Dufour Tue, 03/20/2007 - 03:41

I believe you already opened a service request with the TAC and the message you received is the same as I provided.

The high CPU is due to traffic handling.

To reduce the CPU, you need to add more modules to your chassis [the processing will be split between the modules].

Gilles.
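
The figures reported earlier in this thread are consistent with that explanation, and the arithmetic can be sketched in a few lines of Python. This is only a rough illustration: it assumes CPU load is approximately additive, so that the traffic cost is the loaded figure minus the idle figure, which may not hold exactly on a real CSS. The function name traffic_attributable is hypothetical.

```python
# Figures reported earlier in this thread for the active CSS (percent CPU).
WITH_TRAFFIC = (30, 40)     # steady range while user traffic is flowing
WITHOUT_TRAFFIC = (0, 5)    # range while no user traffic is flowing


def traffic_attributable(loaded, idle):
    """Return the (lowest, highest) CPU points attributable to traffic.

    Assumption: load is roughly additive, so traffic cost = loaded - idle.
    """
    lowest = loaded[0] - idle[1]    # quietest loaded figure vs. busiest idle figure
    highest = loaded[1] - idle[0]   # busiest loaded figure vs. quietest idle figure
    return lowest, highest


if __name__ == "__main__":
    low, high = traffic_attributable(WITH_TRAFFIC, WITHOUT_TRAFFIC)
    print(f"Traffic accounts for roughly {low} to {high} CPU percentage points")
```

Under that assumption, nearly all of the steady-state load on the active CSS disappears when the traffic does.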

snakayama Wed, 03/21/2007 - 18:36

Hi Gilles,

Thank you very much for your reply and support. I understand it.

Best regards,
