Bus error on Cisco 2821

Dec 24th, 2009

Hi All,

We have a Cisco 2821 router at one of our premier customers. A couple of days ago the router rebooted, and our investigation suggests that the reboot was caused by a bug. The show version output is below:

uptime is 23 hours, 11 minutes

System returned to ROM by bus error at PC 0x4038F110, address 0x2FFFFFF at 12:03:29 GMT Tue Dec 22 2009

System restarted at 12:04:40 GMT Tue Dec 22 2009.

After further diagnosis on the Cisco site, we found that there is a Cisco bug associated with such failures.

The stack trace decoded symbols are:
pak_pool_item_ret
datagram_done
gt96k_mbrd_scc_tx_intr
gt96k_process_sdma_int
fellowship_mbrd_int_wrapper
fellowship_check_others

Possible bug matches are listed below. Bugs with a score of .90 or more are the most likely candidates:


Score  Bug ID      Status  Fixed In  Duplicate   Title
0.97   CSCsi71675  D                 CSCsk04674  ISR Crashed on configuring QoS in System Test scenario

Can anyone help us identify a workaround for this bug? We need to resolve the issue as a priority to avoid impact on the customer's business. The vendor FE is at the site and waiting for our instructions.

Your quick response is appreciated

Regards

Sameer

Paolo Bevilacqua Thu, 12/24/2009 - 04:00

It is not a given that there is a workaround.

If it happened just once, monitor.

If it happens frequently, upgrade or change IOS.

If it keeps happening, replace router.

Please do not solicit urgency as this is a freely contributed forum. For guaranteed support, use the TAC.

sameer.mulgund Thu, 12/24/2009 - 04:09

Thanks a lot for the quick response.

Sorry for pressing on the urgency; I will make sure it won't happen in future.

Can you please suggest a specific IOS release, or an upgrade from the current one, that will work?

Regards

marikakis Thu, 12/24/2009 - 04:25

Agree with Paolo. The bug you posted doesn't show a version in which it is fixed. Also, you didn't tell us if the description of the bug (ISR crashed on configuring QoS) makes sense in your environment. Was anyone configuring QoS when the router crashed? (This sounds strange to me, because if I were configuring something and the router crashed right away, I would know what I was doing before the router rebooted and would try to avoid it for the time being.) If this was so, maybe you could just avoid touching the router for a while until you receive a more definitive answer. I know this doesn't sound like the best workaround anyone ever came up with, but it might do for an urgent case during the holidays. If the description doesn't make any sense, then try to see if there is any other bug that matches your case more closely and if a resolution or workaround exists for that.

Ronit Bhattacharjee Thu, 12/24/2009 - 08:36

Where did you decode the tracebacks? Please upload the original tracebacks

From the decodes you've provided, it seems like CSCsg90243, resolved in 12.4(15)T

sameer.mulgund Thu, 12/24/2009 - 20:37

Hi Ronit

Thanks for the inputs. I will get the original tracebacks and update you. However, I just want to note here that this is the second instance where the router rebooted with a similar error.

Cisco IOS Software, 2800 Software (C2800NM-ADVIPSERVICESK9-M), Version 12.4(11)T, RELEASE SOFTWARE (fc2)

Technical Support: http://www.cisco.com/techsupport

Copyright (c) 1986-2006 by Cisco Systems, Inc.

Compiled Sat 18-Nov-06 17:16 by prod_rel_team

ROM: System Bootstrap, Version 12.4(13r)T, RELEASE SOFTWARE (fc1)

xxxxxxx uptime is 1 hour, 12 minutes>>>>>>>>>>>>>>>>>

System returned to ROM by bus error at PC 0x4038F110, address 0x3FFFFFF at 14:58:10 GMT Mon Jul 13 2009 >>>>>>>>>>>>>>>>>>>

System restarted at 14:59:22 GMT Mon Jul 13 2009

System image file is "flash:c2800nm-advipservicesk9-mz.124-11.T.bin"

At that time we had replaced the router. But now the problem recurred.

I will come up with additional inputs to help in investigations.

thanks again.
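For reference, the running release shown in the output above, 12.4(11)T, can be compared with the 12.4(15)T release mentioned earlier in the thread for CSCsg90243. The following is only a minimal sketch: the function names are made up for illustration, and the pattern assumes the simple major.minor(maintenance)train form seen in this thread. It does not handle rebuild suffixes or other naming variants, and it refuses to compare across different trains.

```python
import re

# Assumed shape only: "12.4(11)T" -> major=12, minor=4, maintenance=11, train="T".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\((\d+)\)([A-Z]*)$")


def parse_ios_version(text):
    """Split a version string such as '12.4(11)T' into its parts."""
    match = VERSION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised version format: {text!r}")
    major, minor, maintenance, train = match.groups()
    return int(major), int(minor), int(maintenance), train


def is_older_in_same_train(running, fixed):
    """True if 'running' has a lower maintenance number than 'fixed'.

    Both versions must share major, minor and train; otherwise the
    comparison is not meaningful and a ValueError is raised.
    """
    r = parse_ios_version(running)
    f = parse_ios_version(fixed)
    if (r[0], r[1], r[3]) != (f[0], f[1], f[3]):
        raise ValueError("versions belong to different release trains")
    return r[2] < f[2]


if __name__ == "__main__":
    # Versions quoted in this thread.
    print(is_older_in_same_train("12.4(11)T", "12.4(15)T"))
```

A result of True here only says the running image predates the release named for that bug; whether that bug is the one actually being hit still depends on the original tracebacks.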

Ronit Bhattacharjee Thu, 12/24/2009 - 23:59

Hi Sameer,

Bus errors are mostly due to bugs. So if you're running the same IOS and a similar configuration, chances are that the router will crash again.
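One way to check this against the two crashes reported in this thread is to pull the PC and address out of each "System returned to ROM" line and see whether the PC repeats. The sketch below is illustrative only: the regular expression is an assumption based on the two lines quoted in this thread, and the helper names are hypothetical. A repeated PC on replaced hardware is consistent with a software cause, but it is not proof of one.

```python
import re

# Reload-reason lines copied from the two "show version" outputs in this thread.
CRASH_LINES = [
    "System returned to ROM by bus error at PC 0x4038F110, address 0x3FFFFFF "
    "at 14:58:10 GMT Mon Jul 13 2009",
    "System returned to ROM by bus error at PC 0x4038F110, address 0x2FFFFFF "
    "at 12:03:29 GMT Tue Dec 22 2009",
]

# Assumed line format, inferred only from the two samples above.
BUS_ERROR_RE = re.compile(
    r"bus error at PC (0x[0-9A-Fa-f]+), address (0x[0-9A-Fa-f]+) at (.+)$"
)


def parse_bus_error(line):
    """Return (pc, address, timestamp) or None if this is not a bus-error line."""
    match = BUS_ERROR_RE.search(line.strip())
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16), match.group(3)


def same_crash_pc(lines):
    """True if every parsed bus-error line reports the same program counter."""
    pcs = {parsed[0] for parsed in map(parse_bus_error, lines) if parsed}
    return len(pcs) == 1


if __name__ == "__main__":
    for line in CRASH_LINES:
        pc, address, when = parse_bus_error(line)
        print(f"PC={pc:#x} address={address:#x} at {when}")
    print("same PC in every crash:", same_crash_pc(CRASH_LINES))
```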

Paolo Bevilacqua Sat, 12/26/2009 - 04:27

12.4(11)T is such an old image, likely riddled with bugs.

Whoever suggested replacing the router instead is not a professional in this industry.

marikakis Sat, 12/26/2009 - 04:56

Hi Paolo,

You typically speak the tough, unbearable truth. Such things might happen because customer pressure can be even more unbearable. Customers sometimes want the engineer to take the most extreme and costly measures to be satisfied that they receive the best possible treatment. Engineers, on the other hand, sometimes feel they have to do something, anything, to show the customer they are actively trying to resolve the issue. Also, in this case you have to admit that the term 'bus error' can sound to non-programmers very much like something hardware related.

Kind Regards,

Maria

Paolo Bevilacqua Sat, 12/26/2009 - 05:32

Right.

Now consider that even under pressure, doctors do not do surgery unless necessary.

Why should we be different?

marikakis Sat, 12/26/2009 - 05:40

I am not saying that we should. I was implicitly describing some 'feelings' that we need to fight against. In any case, the doctor metaphor for this profession causes a lot of stress to people, sometimes for no reason. Unless of course the customer in this case is some doctor performing remote surgery by sending traffic through this router! Good luck to the patient in this case!

sameer.mulgund Sun, 12/27/2009 - 19:23

Hi All,

Thanks a ton for all your inputs and suggestions. We have upgraded the IOS, and the router is currently being monitored.

I will come back in case of further issues.

Regards

sameer
