05-26-2010 03:06 AM - edited 03-07-2019 12:34 AM
Hello community,
Has anyone ever encountered this kind of message on a WS-X6748-GE-TX module?
The chassis is a WS-C6509-E with redundant WS-SUP720-3B supervisors (s72033-ipservicesk9_wan-mz.122-18.SXF7.bin).
The message caused the system to power down (PwrDown) the module.
This is the first time we have seen the message. I have not yet tried inserting the module in another slot, but I would like to know whether it suggests a hardware issue on the module or rather a communication problem with the sup.
------------------------------------------------------------------------------
%PM_SCP-SP-2-LCP_FW_ERR_INFORM: Module 3 is experiencing the following error: Interrupt counters cumulative, (10s critical/noncritical): ROINT[0]: totalcalls=1670, p2aecc1=157, p2necc1=677, ecc2=1168, ffifopb2ar=324, ffifopb2n=826, argospktin=1, pb2arinterm=405, (746/232). ROINT[2]: totalcalls=1, aricjacrc=1. JAINT[0]: total=83, drri0=83, (9/0).
%PM_SCP-SP-1-LCP_FW_ERR_POWERDOWN: Module 3 will be powered down due to firmware error: RO[0] (746 critical int in the last 10s).
%C6KPWR-SP-4-DISABLED: power to module in slot 3 set off (excessive interrupt)
------------------------------------------------------------------------------
Thanks a lot for your recommendations about the message.
Kind regards.
Karim
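For readers who want to compare such messages across modules or over time, here is a minimal Python sketch that splits the LCP_FW_ERR_INFORM text into its per-block counters. It assumes that the trailing pair such as (746/232) is the critical/noncritical interrupt count for the last 10 seconds, which is consistent with the "746 critical int in the last 10s" in the POWERDOWN message. The function name and dictionary keys are made up for this illustration, and the meaning of the individual counters (p2aecc1, ecc2 and so on) is not documented in this thread, so the sketch only extracts them.

```python
import re

# The message from the post above, joined into a single line.
LOG = (
    "%PM_SCP-SP-2-LCP_FW_ERR_INFORM: Module 3 is experiencing the following error: "
    "Interrupt counters cumulative, (10s critical/noncritical): "
    "ROINT[0]: totalcalls=1670, p2aecc1=157, p2necc1=677, ecc2=1168, "
    "ffifopb2ar=324, ffifopb2n=826, argospktin=1, pb2arinterm=405, (746/232). "
    "ROINT[2]: totalcalls=1, aricjacrc=1. JAINT[0]: total=83, drri0=83, (9/0)."
)

# One match per "NAME[index]: ..." block, up to the period that closes it.
SECTION = re.compile(r"([A-Z]+\[\d+\]):\s*(.*?)\.(?=\s|$)")
# Individual "name=value" counters inside a block.
COUNTER = re.compile(r"(\w+)=(\d+)")
# Optional "(critical/noncritical)" pair at the end of a block (assumed meaning).
WINDOW = re.compile(r"\((\d+)/(\d+)\)")


def parse_interrupt_counters(message):
    """Return {block name: {"counters": {...}, "critical_10s": n, "noncritical_10s": n}}."""
    result = {}
    for name, body in SECTION.findall(message):
        window = WINDOW.search(body)
        result[name] = {
            "counters": {key: int(value) for key, value in COUNTER.findall(body)},
            "critical_10s": int(window.group(1)) if window else None,
            "noncritical_10s": int(window.group(2)) if window else None,
        }
    return result


if __name__ == "__main__":
    for block, data in parse_interrupt_counters(LOG).items():
        print(block, data["critical_10s"], data["noncritical_10s"], data["counters"])
```

Blocks that carry no (critical/noncritical) pair, such as ROINT[2] here, get None for both values rather than a guessed zero.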
05-26-2010 04:09 AM
Hi Karim,
As per the output from the Cisco interpreter, please see the following:
%PM_SCP-6-LCP_FW_ERR_INFORM (x1): Module [dec] is experiencing the following error: [chars]
Explanation: This message indicates that the firmware of the module detected an error condition. The module is informing the supervisor engine about the error condition. [dec] is the module number, and [chars] is the error. This could be a transient issue.
Recommended Action:
1. Try resetting the module (soft reset) using the command hw-module reset.
2. If the error message is still displayed, power down the switch. Do a hard reset by pulling the module and reseating it. Power on the switch and monitor for the error message.
3. If the error persists, try reseating the module in another free slot. If the error is still displayed after the move to the other slot, then there is no fault with the chassis but the module might be faulty.
4. Issue the command show test {module_number} before and after physically reseating the module to make sure the module is not faulty. Make sure you have configured the set test diaglevel complete command (a reset is required to enable this diagnostic mode).
5. If the error still persists with the module, you may have to replace the module.
%C6KPWR-SP-4-DISABLED: power to module in slot [dec] set [chars]
Explanation: This message indicates that the module in the indicated slot was powered off for the indicated reason. [dec] is the slot number, and [chars] indicates the power status. In most cases this message appears at switch bootup/reload or at line card insertion and can be ignored.
Recommended Action: Ensure that this message appeared during normal operation of the switch. If so, try:
1. Reseat and firmly fix up the module in the chassis.
2. Raise the diagnostic level to complete using the set diagnostic bootup level command.
3. Reset the line card and use the command show diagnostic module for the test results on the module. This will confirm the hardware sanity of the line card.
4. Monitor the switch operation. The recovery procedure depends on the reason indicated.
%C6KPWR-4-DISABLED (x1): power to module in slot [dec] set [chars]
Explanation: The module in the indicated slot was powered off for the reason stated in the error message.
Recommended Action: Recovery depends on the indicated reason. Using the information provided in the error message, troubleshoot and resolve the power problem. If necessary, replace defective components.
Hope to help !!
Ganesh.H
Remember to rate the helpful post
05-26-2010 04:17 AM
Hello Karim,
The message is described in the 12.2SX error message guide:
http://www.cisco.com/en/US/docs/ios/12_2sx/system/messages/sm2sx06.html#wp1032022
Error Message %PM_SCP-2-LCP_FW_ERR_INFORM: Module [dec] is experiencing the following error: [chars]
Explanation The linecard is reporting an error condition, where [dec] is the module number, and [chars] is the error. This condition is usually caused by an improperly seated linecard or a hardware failure. If the error message is seen on all of the linecards, the cause is an improperly seated module.
Recommended Action Reseat and reset the linecard or the module. If the error message persists after the module is reset, copy the message exactly as it appears on the console or in the system log. Research and attempt to resolve the issue using the tools and utilities provided at http://www.cisco.com/tac. With some messages, these tools and utilities will supply clarifying information. Search for resolved software issues using the Bug Toolkit at http://www.cisco.com/pcgi-bin/Support/Bugtool/launch_bugtool.pl. If you still require assistance, open a case with the Technical Assistance Center via the Internet at http://tools.cisco.com/ServiceRequestTool/create, or contact your Cisco technical support representative and provide the representative with the information you have gathered. Attach the following information to your case in nonzipped, plain-text (.txt) format: the output of the show logging and show tech-support commands and your pertinent troubleshooting logs.
The suggestion is to remove the linecard and reinsert it in the same slot.
If this doesn't solve the problem, I would consider opening an RMA to replace the linecard.
Do you have other 6748 linecards working well in the same chassis?
Hope to help
Giuseppe
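The same message appears in this thread with three different prefixes: %PM_SCP-SP-2-... in the log, %PM_SCP-6-... in the interpreter output and %PM_SCP-2-... in the message guide. A minimal Python sketch of how such a prefix breaks down, assuming the usual IOS layout %FACILITY-[SOURCE-]SEVERITY-MNEMONIC, where the optional source field (SP here, which on a Sup720 is generally understood to mean the switch processor) is not part of the documented message name. The function and field names are invented for this example.

```python
import re

# %FACILITY-[SOURCE-]SEVERITY-MNEMONIC; the SOURCE field is optional.
HEADER = re.compile(
    r"%(?P<facility>[A-Z0-9_]+)"
    r"(?:-(?P<source>[A-Z0-9_]+))?"
    r"-(?P<severity>[0-7])"
    r"-(?P<mnemonic>[A-Z0-9_]+)"
)

# Standard syslog severity levels 0 to 7 as named by IOS.
SEVERITY_NAMES = [
    "emergencies", "alerts", "critical", "errors",
    "warnings", "notifications", "informational", "debugging",
]


def parse_header(line):
    """Split a message prefix into facility, source, severity and mnemonic."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    fields["severity_name"] = SEVERITY_NAMES[fields["severity"]]
    return fields


if __name__ == "__main__":
    for text in (
        "%PM_SCP-SP-2-LCP_FW_ERR_INFORM: Module 3 is experiencing the following error: ...",
        "%PM_SCP-6-LCP_FW_ERR_INFORM (x1): Module [dec] is experiencing the following error: [chars]",
        "%C6KPWR-SP-4-DISABLED: power to module in slot 3 set off (excessive interrupt)",
    ):
        print(parse_header(text))
```

Matching on facility plus mnemonic (PM_SCP and LCP_FW_ERR_INFORM) rather than on the full prefix finds the documentation entry even when the severity digit or the source field differs, as it does between the posts above.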
05-26-2010 04:52 AM
Thanks, friends, for your answers.
I was wondering whether the Cisco documentation has any details about the "sub-"message (Interrupt counters cumulative) and about the condition "powered down due to firmware error", so that we can get further details about these particular messages.
Anyway, I will apply the Cisco recommendations in order: soft reset, hard reset, changing slot, RMA.
Yes, two other WS-X6748-GE-TX modules are running in the chassis (Status OK; Hw 2.5; Fw 12.2(14r)S5 in show module).
Have a great day.
Regards.
Karim
05-26-2010 05:04 AM
Hello Karim,
I would not test the linecard in more than one other slot. We had a case where a linecard damaged the switching fabric connections, and we had to replace both the linecard AND the chassis to solve the issue with TAC support.
>> I was wondering whether the Cisco documentation has any details about the "sub-"message (Interrupt counters cumulative) and about the condition "powered down due to firmware error", so that we can get further details about these particular messages.
More info is not available as far as I know.
Hope to help
Giuseppe
05-26-2010 05:13 AM
Thanks, Giuseppe, for this relevant feedback.
Based also on the very last post of the following discussion (https://supportforums.cisco.com/message/769893#769893) and on the critical environment, I think I will proceed directly to an RMA of the module and check afterwards.
Thanks again
Karim
07-22-2010 06:30 AM
And indeed, we finally had to change the chassis.
The original log message was:
------------------------------------------------------------------------------
%PM_SCP-SP-2-LCP_FW_ERR_INFORM: Module 3 is experiencing the following error: Interrupt counters cumulative, (10s critical/noncritical): ROINT[0]: totalcalls=1670, p2aecc1=157, p2necc1=677, ecc2=1168, ffifopb2ar=324, ffifopb2n=826, argospktin=1, pb2arinterm=405, (746/232). ROINT[2]: totalcalls=1, aricjacrc=1. JAINT[0]: total=83, drri0=83, (9/0).
%PM_SCP-SP-1-LCP_FW_ERR_POWERDOWN: Module 3 will be powered down due to firmware error: RO[0] (746 critical int in the last 10s).
%C6KPWR-SP-4-DISABLED: power to module in slot 3 set off (excessive interrupt)
------------------------------------------------------------------------------
After testing with a new line module in the same slot and in another slot, the modules went to PwrDown each time (when tested afterwards in my lab chassis, all the line cards were OK). After some switchovers/reboots, changing the chassis fixed the issue.
Regards.
Karim
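The reasoning that pointed to the chassis in this thread can be written down as a small elimination rule. The sketch below is only an illustration in Python: the Trial record, the labels and the thresholds are assumptions made up for this example, and slot 4 is a stand-in for "another slot" because the thread does not say which slot was used. It is not a Cisco procedure, and it does not replace the caution above about moving a suspect linecard through several slots.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    card: str           # label for a linecard (hypothetical)
    chassis: str        # label for a chassis (hypothetical)
    slot: int
    powered_down: bool  # True if the module was powered down in this trial


def suspect(trials):
    """Return 'chassis', 'linecard', 'inconclusive' or 'none'."""
    failed = [t for t in trials if t.powered_down]
    passed = [t for t in trials if not t.powered_down]
    if not failed:
        return "none"
    bad_cards = {t.card for t in failed}
    bad_chassis = {t.chassis for t in failed}
    bad_slots = {(t.chassis, t.slot) for t in failed}
    # Several cards fail in several slots of one chassis, and each of those
    # cards runs fine in a different chassis: the chassis is the common factor.
    if (len(bad_chassis) == 1 and len(bad_cards) >= 2 and len(bad_slots) >= 2
            and all(any(p.card == card and p.chassis not in bad_chassis for p in passed)
                    for card in bad_cards)):
        return "chassis"
    # One card fails in several slots while another card works in one of them.
    if (len(bad_cards) == 1 and len(bad_slots) >= 2
            and any((p.chassis, p.slot) in bad_slots for p in passed)):
        return "linecard"
    return "inconclusive"


if __name__ == "__main__":
    observed = [
        Trial("original", "production", 3, True),
        Trial("replacement", "production", 3, True),
        Trial("replacement", "production", 4, True),  # slot number assumed
        Trial("original", "lab", 3, False),
        Trial("replacement", "lab", 3, False),
    ]
    print(suspect(observed))
```

With only the first power-down on record, the function returns "inconclusive", which matches the position at the start of the thread: one event alone does not separate a linecard fault from a chassis fault.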