Fatal IPC error message on 3750 switch stack

Aug 21st, 2009

On one of our switch stacks we received the following log messages:

Aug 20 09:47:05: %XDR-6-XDRIPCNOTIFY: Fatal IPC error occurred for peer in slot 4. Message not sent due to timeout. Disabling linecard
-Traceback= AFFD58 21BCA4 BBCE14 BC1148 BC1874 BD3FC0 909AAC 90000C

Aug 20 10:00:05: %PLATFORM_RPC-3-MSG_THROTTLED: RPC Msg Dropped by throttle mechanism: type 1, class 21, max_msg 8, total throttled 0
-Traceback= AFFD58 577E98 24F3EC 24F530 24FFA8 2501A0 909AAC 90000C

Aug 20 10:03:05: %PLATFORM_RPC-3-MSG_THROTTLED: RPC Msg Dropped by throttle mechanism: type 3, class 21, max_msg 8, total throttled 1
-Traceback= AFFD58 577E98 24F3EC 24F530 24FF8C 2501A0 909AAC 90000C
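
For anyone sifting through a larger log for similar entries, here is a minimal Python sketch that splits messages like the ones above into facility, severity, and mnemonic. It assumes each message has been rejoined onto a single line and follows the usual IOS layout "timestamp: %FACILITY-SEVERITY-MNEMONIC: text". The function name parse_ios_message and the regular expression are illustrative only, not part of any Cisco tool.

```python
import re

# Conventional IOS names for syslog severity levels 0-7.
SEVERITY_NAMES = {
    0: "emergencies",
    1: "alerts",
    2: "critical",
    3: "errors",
    4: "warnings",
    5: "notifications",
    6: "informational",
    7: "debugging",
}

# Assumed layout: "<Mon DD HH:MM:SS>: %FACILITY-SEVERITY-MNEMONIC: <text>"
MSG_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}): "
    r"%(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+): "
    r"(?P<text>.*)$"
)


def parse_ios_message(line):
    """Return a dict of message fields, or None if the line is not a message."""
    match = MSG_RE.match(line.strip())
    if match is None:
        return None  # e.g. a "-Traceback=" continuation line
    fields = match.groupdict()
    fields["severity"] = int(fields["severity"])
    fields["severity_name"] = SEVERITY_NAMES[fields["severity"]]
    return fields


if __name__ == "__main__":
    sample = [
        "Aug 20 09:47:05: %XDR-6-XDRIPCNOTIFY: Fatal IPC error occurred for "
        "peer in slot 4. Message not sent due to timeout. Disabling linecard",
        "-Traceback= AFFD58 21BCA4 BBCE14 BC1148 BC1874 BD3FC0 909AAC 90000C",
        "Aug 20 10:00:05: %PLATFORM_RPC-3-MSG_THROTTLED: RPC Msg Dropped by "
        "throttle mechanism: type 1, class 21, max_msg 8, total throttled 0",
    ]
    for entry in sample:
        parsed = parse_ios_message(entry)
        if parsed:
            print(parsed["facility"], parsed["severity_name"], parsed["mnemonic"])
```

Note that the XDR message is logged at severity 6 (informational) while the RPC messages are severity 3 (errors), so a filter on severity alone could miss the entry that coincided with the member being disabled.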

We disconnected and reconnected the power cord on the member that was disabled to get it to restart. It re-joined the stack correctly and is functioning normally again. I could not find anything in the bug toolkit or release notes regarding fatal IPC problems on 3750 switches.

Has anyone seen these messages before on 3750s? Any thoughts on whether this error condition was a fluke, whether it indicates hardware possibly going bad, or whether it could be the result of some type of "bad" network traffic?

Joe Clarke Sun, 08/23/2009 - 10:05

What version of code is this 3750 running? How many switches are in this stack?

CHRISTINE BERNS Mon, 08/24/2009 - 05:37

The switches in the stack are running 12.2(25)SEE4, and there are 4 switches in the stack.

Joe Clarke Mon, 08/24/2009 - 07:22

I need the exact image name (from the output of show version) to be able to properly decode the stack trace.

CHRISTINE BERNS Mon, 08/24/2009 - 07:30

Here's the image name:

System image file is "flash:c3750-ipbasek9-mz.122-25.SEE4/c3750-ipbasek9-mz.122-25.SEE4.bin"

Joe Clarke Mon, 08/24/2009 - 08:20

There are a few bugs that match the symptoms and stack trace, but without knowing the full config, these seem most likely:

CSCse51203

CSCsi74526

CSCsd26784

The minimum revision of code to run to get all three fixes is 12.2(40)SE.

This may also be CSCeg57839 if you have configured the unsupported "snmp-server ifindex persist" on the cluster. This command is not supported on 3750s, and should be removed.

CHRISTINE BERNS Mon, 08/24/2009 - 13:13

Thank you for the info, but none of the mentioned bug reports seem to fit the situation. We don't have ARP inspection trust configured, we don't have "snmp-server ifindex persist" configured, and no one was logged into the switch entering commands at the time the problem occurred.

I'm not so concerned about the RPC throttled messages; I'm more concerned about the "Fatal IPC error occurred for peer in slot 4. Message not sent due to timeout. Disabling linecard" message.

Joe Clarke Mon, 08/24/2009 - 13:17

It could be a new bug then, or faulty hardware. I couldn't find any other issues that would match the version of code that you're running. If you can reproduce the problem, I suggest you open a TAC service request so live troubleshooting can be done.
