Cisco Support Community

Nexus 1K VEM module shutdown (with DELL BLADE server)

Hello, this is Vince.

I am running a PoC with an important customer.

Can anyone help me understand what the problem is?

I have found a couple of strange behaviors with a Nexus 1000V on Dell blade servers.

The network diagram is shown below.

[Attached image: network diagram, 스크린샷 2013-12-24 오전 3.47.37.png]

I installed two vSphere ESXi hosts on the Dell blade servers.

As the diagram shows, each server is connected to the Cisco N5K through the Dell M8024 blade switches.

- Two N1KV VSM VMs are installed on the ESXi hosts (as primary and secondary, of course).

- The N5K is connected to the M8024 in a vPC.

- The VSM and VEM communicate with each other over the Layer 3 control interface.

- The uplink port-profile's port-channel load balancing uses mac-pinning.

interface control0
  ip address 10.10.100.10/24

svs-domain
  domain id 1
  control vlan 1
  packet vlan 1
  svs mode L3 interface control0

port-profile type ethernet Up-Link
  vmware port-group
  switchport mode trunk
  switchport trunk allowed vlan 1-2,10,16,30,77-78,88,100,110,120-121,130
  switchport trunk allowed vlan add 140-141,150,160-161,166,266,366
  service-policy type queuing output N1KV_SVC_Uplink
  channel-group auto mode on mac-pinning
  no shutdown
  system vlan 1,10,30,100
  state enabled

n1000v# show module
Mod  Ports  Module-Type                       Model               Status
---  -----  --------------------------------  ------------------  ------------
1    0      Virtual Supervisor Module         Nexus1000V          ha-standby
2    0      Virtual Supervisor Module         Nexus1000V          active *
3    332    Virtual Ethernet Module           NA                  ok
4    332    Virtual Ethernet Module           NA                  ok

Mod  Sw                  Hw
---  ------------------  ------------------------------------------------
1    4.2(1)SV2(2.1a)     0.0
2    4.2(1)SV2(2.1a)     0.0
3    4.2(1)SV2(2.1a)     VMware ESXi 5.5.0 Releasebuild-1331820 (3.2)
4    4.2(1)SV2(2.1a)     VMware ESXi 5.5.0 Releasebuild-1331820 (3.2)

Mod  Server-IP        Server-UUID                           Server-Name
---  ---------------  ------------------------------------  --------------------
1    10.10.10.10      NA                                    NA
2    10.10.10.10      NA                                    NA
3    10.10.10.101     4c4c4544-0038-4210-8053-b5c04f485931  10.10.10.101
4    10.10.10.102     4c4c4544-0043-5710-8053-b4c04f335731  10.10.10.102

Here is the strange behavior I am seeing.

If I move the primary N1KV VSM from the ESXi host of module 3 to the other ESXi host (module 4), the VEM suddenly shuts down.

Here are the syslog messages.

2013 Dec 20 15:45:22 n1000v %VEM_MGR-2-VEM_MGR_REMOVE_NO_HB: Removing VEM 4 (heartbeats lost)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Ethernet4/7 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Ethernet4/8 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet1 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet17 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet9 is detached (module removed)
2013 Dec 20 15:45:22 n1000v %VIM-5-IF_DETACHED_MODULE_REMOVED: Interface Vethernet37 is detached (module removed)
....
2013 Dec 20 15:46:53 n1000v %VEM_MGR-2-MOD_OFFLINE: Module 4 is offline
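
To narrow down where the heartbeats are being lost, checks along the following lines may be useful. This is only a sketch: the commands are standard Nexus 1000V VSM and VEM commands, and 10.10.100.10 is the control0 address from the configuration above. No output is shown because I have not captured any yet.

```
! On the VSM: domain settings, vCenter connection, and VEM visibility
show svs domain
show svs connections
show svs neighbors
show module vem missing
show port-channel summary

# On each ESXi host (shell): VEM state, uplink ports, and L3 reachability of the VSM
vem status -v
vemcmd show card
vemcmd show port
vmkping 10.10.100.10
```

If vmkping to the control0 address fails only after the VSM has moved to the other host, that would point at the path between the ESXi uplinks and the M8024 switches rather than at the VSM itself.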

To make it work again, I have to do two things.

First, the vSwitch load-balancing policy must be set to source MAC hash.

(Port ID is the default.)

Second, the failover order of the vSwitch uplinks is very important.

If I change this order, the VEM goes offline very soon.

Here is a screen capture of these options (you may not understand the Korean text).

[Attached image: vSwitch load-balancing and failover-order settings, 스크린샷 2013-12-24 오후 2.30.18.png]

In my opinion, the main problem is in the links between the ESXi hosts and the M8024 switches.

As the diagram shows, each ESXi host is connected to the two M8024 Dell blade switches separately.

I read the manual on the N1K uplink load-balancing methods.

Although there are 16 different port-channel load-balancing methods, as I understand it only src-mac should be used if the upstream switches do not support port channels.
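
For reference, the hashing method is a global setting on the VSM. The following is a minimal sketch, assuming source-mac is the desired method; I have not verified that changing it has any effect on the problem described here.

```
n1000v# configure terminal
n1000v(config)# port-channel load-balance ethernet source-mac
n1000v(config)# end
n1000v# show port-channel load-balance
```

The last command only displays the method currently in use, so it is safe to run on its own before changing anything.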

But I don't know exactly why this happens.

Can anyone help me make this work better?

Thanks in advance.

Best Regards,

Vince
