
Software forced crash

ahnyongsik
Level 1

Hi,

I'm not sure how best to ask this, so let me explain our problem first.

We are using a 7200 VXR as a PDSN, but the router reloaded by itself.

The message is "System was restarted by error - a Software forced crash, PC 0x60A00A04", and the IOS version is 12.3(14)YX4.

One more thing: "%SYS-3-CPUHOG: Task is running for (104004)msecs, more than (2000)msecs (0/0),process = AAA ACCT Proc." What does this error message mean? Is the CPU the main problem?

I have also attached our "show tech-support" output.

Please give me a solution.

3 Replies

Giuseppe Larosa
Hall of Fame

Hello Yongsik,

The log message means that the process AAA ACCT Proc held the CPU for 104004 msec (about 104 seconds), far more than the 2000 msec threshold.

This is probably the reason for the software crash: the system decided to reload because this process was using the CPU for too long. 104 seconds is like ten thousand years for a modern CPU.
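If you want to pull these numbers out of a log automatically, here is a minimal Python sketch. The regular expression is only matched against the CPUHOG line quoted in this thread; other IOS releases may format the message differently, and the function name `parse_cpuhog` is just an illustrative choice.

```python
import re

# Pattern written for the CPUHOG line quoted in this thread (an assumption:
# other IOS releases may word or space the message differently).
CPUHOG_RE = re.compile(
    r"%SYS-3-CPUHOG: Task is running for \((\d+)\)msecs, "
    r"more than \((\d+)\)msecs \(\d+/\d+\),\s*process = (.+?)\.?\s*$"
)

def parse_cpuhog(line):
    """Return (ran_ms, limit_ms, process), or None if the line does not match."""
    m = CPUHOG_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), m.group(3)

line = ("%SYS-3-CPUHOG: Task is running for (104004)msecs, "
        "more than (2000)msecs (0/0),process = AAA ACCT Proc.")
ran_ms, limit_ms, process = parse_cpuhog(line)

# Show how long the process held the CPU and how far over the limit it went.
print(f"{process}: {ran_ms / 1000:.1f} s on CPU, "
      f"{ran_ms / limit_ms:.0f}x the {limit_ms} ms limit")
```

Running the same function over every line of the log gives a quick count of how often each process was flagged before the reload.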

We can see:

Last reset from watchdog reset

The watchdog process is the one that monitors CPU usage, which confirms what I've written above.

In the log you can see several CPUHOG messages.

Near the software reload you can see:

AAA/ACCT(0145C159): Accouting method=AAA (RADIUS)

AAA/ATTR(0145C159): cursor init: 65653348 64532294 none unknown

AAA/ATTR(0145C159): find: 64532324 0 00000009 username(345) 3 wap

AAA/SG: Server group ref count for public group AAA raised to 25

AAA/SG: Server group wrapper ref count for public group AAA raised to 25

AAA/ACCT is accounting.

There were 146 concurrent CDMA users when the system reloaded.

AAA/ACCT is the process that uses the most CPU:

61 0 701290344 0 6980 1278432 1278432 AAA ACCT Proc

62 0 1772314480 0 6980 0 0 ACCT Periodic Pr

You may have hit a bug:

CPUHOG at the Time of Normal Router Operation

Most of the time, these error messages are due to an internal software bug in the Cisco IOS Software.

The first step to troubleshoot this sort of error message is to look for a known bug. You can use the Bug Toolkit (registered customers only) to find a bug that matches the error. In the Bug Toolkit page, click Launch Bug Toolkit, and select Search for Cisco IOS-related bugs. In order to narrow your search, you can select your Cisco IOS software version under number 1. Under number 3, you can perform a keyword search for "CPUHOG, <process>", where <process> is the corresponding process, such as Virtual Exec or IP Input.

You can upgrade to the latest Cisco IOS Software image in your release train to eliminate all fixed CPUHOG bugs.

See:

http://www.cisco.com/en/US/products/hw/iad/ps397/products_tech_note09186a00800a6ac4.shtml#topic6

Hope this helps

Giuseppe

Hi Mr. Giuslar,

Thank you for your reply.

Your information is very useful to me.

I also asked about the software forced crash.

In this case, it seems most of the issue comes from excessive CPU usage by AAA ACCT.

Actually, we have three PDSNs (7206 VXR), and they run different IOS versions.

The problem occurs on IOS versions 12.3(14)YX4 and 12.3(11)YF4, but PDSN#3 does not show this error message even though it also runs IOS version 12.3(11)YF4.

Mr. Giuslar, I also tried to find a suitable IOS version on your site, but I could not find one. I know you already explained how to find a fixed IOS release, but when I searched for the bug error message, the website returned nothing.

If you can recommend a way to solve this problem, please let me know which IOS version is suitable for our PDSN systems.

I have also attached PDSN#2's tech-support file, because PDSN#2 has the same problem as PDSN#1.

Thank you for your BEST support!

Hello Yongsik,

The second router was also reloaded by the watchdog process.

I think it is wise to open a service request with TAC to find out directly which IOS release to use for the upgrade.

I wasn't able to find an exact match for your issue, but there are several bugs that contain CPUHOG in their description.

If you have one router that doesn't experience the problem even though it runs the same IOS image, you should investigate the differences in configuration or in usage level between the two devices. A config difference could point to a workaround, or you could find an indication that the problem is triggered by the volume of connections served.
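One way to do that comparison is to save the running configuration of each router to a text file and diff them. Below is a minimal Python sketch using the standard `difflib` module. The function name `config_diff` and the two config fragments are made up for illustration; they are not taken from the attached tech-support files.

```python
import difflib

def config_diff(cfg_a, cfg_b, name_a="PDSN1", name_b="PDSN3"):
    """Return unified-diff lines between two config texts.

    Blank lines and trailing whitespace are ignored so that only
    real configuration differences show up.
    """
    a = [l.rstrip() for l in cfg_a.splitlines() if l.strip()]
    b = [l.rstrip() for l in cfg_b.splitlines() if l.strip()]
    return list(difflib.unified_diff(a, b, fromfile=name_a,
                                     tofile=name_b, lineterm=""))

# Hypothetical config fragments, for illustration only.
pdsn1 = """aaa accounting update periodic 5
aaa group server radius AAA
"""
pdsn3 = """aaa accounting update periodic 30
aaa group server radius AAA
"""

for line in config_diff(pdsn1, pdsn3):
    print(line)
```

Lines starting with `-` exist only on the first router and lines starting with `+` only on the second, so any AAA accounting settings that differ stand out immediately.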

Hope this helps

Giuseppe
