
Nexus 1000V problem

Eugene Khabarov
Level 7

Hello, all! Today I ran into a serious issue with our Nexus 1000V. First of all, here are our logs:

2012 Jan 18 10:45:28 core-nexus1kv-01 %SYSMGR-2-SYNC_FAILURE_MSG_PAYLOAD: vdc 1: Failure from active SUP

2012 Jan 18 10:45:28 core-nexus1kv-01 %SYSMGR-2-SYNC_FAILURE_MSG_PAYLOAD: vdc 1: Failure from active SUP

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 16 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 6 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 3 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 15 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 24 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 23 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 10 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 19 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 5 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 12 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 8 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 13 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 20 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 21 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 9 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 27 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 28 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 26 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 14 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 25 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 7 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 4 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 18 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 17 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 30 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 11 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 29 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VEM_MGR-2-VEM_MGR_REMOVE_UNEXP_NODEID_REQ: Removing VEM 22 (Unexpected Node Id Request)

2012 Jan 18 10:45:28 core-nexus1kv-01 %KERN-1-SYSTEM_MSG: Opcode 148505 throttled - kernel

2012 Jan 18 10:45:28 core-nexus1kv-01 %KERN-2-SYSTEM_MSG: mts_tcp_send_sync_msg(): TCP connection to standby is no longer established - kernel

2012 Jan 18 10:45:28 core-nexus1kv-01 %KERN-2-SYSTEM_MSG: do_xmit_mtsbuf_sync_msg_tcp(): Sync TCP send failed with error -70 - kernel

2012 Jan 18 10:45:28 core-nexus1kv-01 %KERN-2-SYSTEM_MSG: do_mts_standby_sync: failed to sync message, ha_stage 1, opc 148505, error -70 - kernel

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Ethernet16/1 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Ethernet16/2 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet18 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %ETHPORT-5-IF_DOWN_MODULE_REMOVED: Interface Vethernet18 is down (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet39 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet267 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet128 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet118 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet260 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet115 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet218 is detached (module removed)

2012 Jan 18 10:45:28 core-nexus1kv-01 %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet138 is detached (module removed)

Can anybody please explain what happened and why? Does this mean that a "split brain" condition occurred? What does this opcode value mean?

As far as I know, this was caused by some kind of storage system failure, but I have few details about it from our system administration department.

2 Replies

Robert Burns
Cisco Employee

This doesn't necessarily mean split-brain has occurred, but it does show that the heartbeats between the VEMs and active VSM are not being received.

Which version of N1K are you running?

You mentioned you had a storage issue. Has that been resolved? Can you elaborate on it?

Also paste:

-VSM Running Config

-output of "show system redundancy status"

Regards,

Robert

Thank you for your reply.

After that we had another similar problem, but it was not related to a storage issue. The cause was a flapping link on an upstream switch.

This time the problem was actually with one of the two blade switches behind the Nexus. The trunk ports to the Nexus were configured with "spanning-tree portfast", but the "trunk" option was not enabled on either switch, and one of them was running generic PVST instead of rapid-pvst. These configuration issues have been resolved, and during the next link flap test the heartbeats between the VEMs and the VSM were not lost.
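For reference, here is a minimal sketch of what the corrected blade switch configuration might look like, assuming the blade switches run Cisco IOS. The interface name is hypothetical, and the exact syntax may differ by platform and software version:

```
! Global: use Rapid PVST+ instead of legacy PVST+
spanning-tree mode rapid-pvst
!
! Hypothetical trunk port facing the ESX host / Nexus 1000V uplink
interface GigabitEthernet0/1
 switchport mode trunk
 ! "portfast" alone does not take effect on a trunk port;
 ! the "trunk" keyword is required
 spanning-tree portfast trunk
```

Without the "trunk" keyword, PortFast does not apply to a trunking port, so after a link flap the port goes through the normal spanning-tree listening and learning states before forwarding, which is long enough for heartbeats between the VEMs and the VSM to be missed.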

But the cause of the first downtime is still an open question...
