
Need help on Nexus Bizarre Issues

Hi,

I hope someone can help. I am experiencing bizarre issues on a Cisco Nexus 5548UP running version 5.1(3)N1(1) with the Layer 3 daughter card. I have run into several issues so far and would appreciate any guidance. I have never seen issues like these on Catalyst switches.

Issue 1:

The L2 port-channels between the two Nexus switches are failing: the interfaces that are members of the channel-group are being suspended. Please see the configuration of the two Nexus switches below. Note that I have another pair of Nexus switches with the same configuration that has worked without this issue so far (I have not rebooted them yet).

Nexus A

interface port-channel100
  description LAYER2-INTERCONNECT NEXUS A TO NEXUS B
  switchport mode trunk
  no shutdown

interface Ethernet1/31
  switchport mode trunk
  channel-group 100 mode active

interface Ethernet1/32
  switchport mode trunk
  channel-group 100 mode active

Nexus B

interface port-channel100
  description LAYER2-INTERCONNECT NEXUS B TO NEXUS A
  switchport mode trunk
  no shutdown

interface Ethernet1/31
  switchport mode trunk
  channel-group 100 mode active

interface Ethernet1/32
  switchport mode trunk
  channel-group 100 mode active

In addition, I have another port-channel configured as Layer 3 on both Nexus switches, which sometimes comes up after rebooting the Nexus three or four times (it seems to depend on its mood), and yes, feature lacp is enabled. No matter what I try, the etherchannel never comes up. I have tried different ports and cables to rule out interface issues.

Issue 2

For some weird reason, the Nexus puts a point-to-point trunk link between the two Nexus switches (one configured as root primary and the other as root secondary) into the blocking state, and the only way to resolve it is a reboot. Sometimes, however, the Nexus needs two or three reboots to clear the issue.

Issue 3

The HP SAN nodes stop responding because the Nexus is unable to learn the ARP entries of the nodes, and the nodes are unable to reach (ping) the gateway. Rebooting fixes it, and it works after that.

Rebooting is not a solution: I plan to use the Nexus for SAN traffic and for routing between the DCs, and I seriously need a fix.

Any help/advice will be much appreciated.

5 Replies

Oleksandr Nesterov
Cisco Employee

Can you please attach the logs from your N5K devices related to these ports:

sh log log

Please also add the network topology and both switch configs here.

Also, please run some tests to narrow down the problem:

1. Disable LACP and check the links.

2. Try enabling the links between the switches without bundling them, to check whether the physical links are fine.

3. How is the SAN connected? Do you use any secondary IP ranges on the interfaces?
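
The first two tests above can be sketched roughly as follows. This is only a sketch, assuming the port-channel 100 and Ethernet1/31-32 numbering from the original post; adjust it to your setup. The static "mode on" bundle has to be configured the same way on both switches.

```
! Current state of the bundle, the LACP partner, and the log
show port-channel summary
show lacp neighbor
show logging logfile

! Test 1: rebuild the bundle without LACP (static mode "on"), on both switches
configure terminal
interface Ethernet1/31-32
  no channel-group
  channel-group 100 mode on

! Test 2: take the members out of the bundle and test them as plain trunks
interface Ethernet1/31-32
  no channel-group
  switchport mode trunk
end
show interface ethernet 1/31-32 brief
```

With both links up as separate trunks, spanning tree is expected to block one of them, so it is easier to test one link at a time.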

Regards,

Alex

Please find attached the Nexus A and B configurations. I managed to bring up the channel-group using the same commands as in the attached configuration, with the addition of "spanning-tree port type normal". If the issue reappears, which I am sure it will because we will be rebooting the Nexus switches for failover testing, I will collect the logs for you. The trunk works if I use a single interface without a channel-group; however, spanning tree sometimes puts the interface into P2P blocking even though it is point-to-point.
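
For reference, a minimal sketch of what that addition might look like. It assumes the command was applied on port-channel 100 from the original post (the attached configuration is not reproduced in this thread), so treat the interface as an assumption.

```
configure terminal
interface port-channel100
  switchport mode trunk
  ! Assumption: the workaround was applied on the port-channel interface
  spanning-tree port type normal
end

! Check the resulting spanning-tree state of the bundle
show spanning-tree interface port-channel 100 detail
show spanning-tree inconsistentports
```

The last command lists ports that spanning tree is holding in an inconsistent state, which may help to capture what is happening the next time the P2P blocking symptom appears.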

At present, my bigger concern is adaptive load balancing (ALB) on the SAN nodes. We are using Rapid Per-VLAN Spanning Tree and are seeing intermittent drops when the HP LeftHand (10 Gbps) is running adaptive load balancing. Most of the time there are packet drops until we shut down the interfaces connected to the nodes on either Nexus A or Nexus B. It does sometimes work without packet drops, but as soon as we restart Nexus A for failover testing, the intermittent drops reappear. The HP4300 has issues either way. We are still in the testing phase and the project deadline is getting closer.

The SAN uses two IP addresses; after interface bonding (adaptive load balancing) it uses a single IP address.

I have run out of options. I have tried almost everything, but the issue persists. Is there a trick on the Nexus to make adaptive load balancing work for HP SAN nodes? Or could it be a driver issue on the SAN nodes? Our setup is quite straightforward, but we are still having issues.

Lastly, it is all Ethernet, no FC or FCoE (it was not my decision)

Please find the attached diagram as requested. Many thanks for your help; it is much appreciated.

Hey Muhammad,

We use N5548UP and LeftHand with 10 Gbit too, and we have a drop issue with the LeftHand as well. What does the command "show queuing interface e1/x" show when the issue is present?

Or do you have a solution yet?

I have a call open.

Regards

Rene

Sent from Cisco Technical Support iPhone App

Hi Rene,

Thanks for opening the case. Unfortunately, my company does not have an appropriate contract, or I would have opened a case myself. In the end, I had to ask our server team to move to an active/passive SAN solution. HP recommends using active/passive with 10 Gbit.

I have to say that the Nexus 5548UP has disappointed me on a number of occasions. It does not even support fragmentation, and its jumbo frame support is not efficient. vPC works like a charm, though. We tried hard to find out why the Nexus used to go crazy with ALB after a reboot, but could not figure it out.

I would recommend using active/passive, the default SAN MTU (1500), and flow-control send/receive. It will work without any issues.

Please do let me know the outcome of the TAC case.
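
As a rough sketch of the flow-control part on the Nexus side: Ethernet1/1 below is only a placeholder for a port facing a SAN node, not an interface taken from our configuration. No MTU change is shown because the recommendation is to stay on the default of 1500.

```
configure terminal
interface Ethernet1/1
  ! Placeholder interface: use the ports facing the SAN nodes
  flowcontrol receive on
  flowcontrol send on
end

! Verify the flow-control state per interface
show interface flowcontrol
```

Flow control needs to be enabled on the SAN node NICs as well for pause frames to work in both directions.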

Thanks

M D Khan

Rene Karcher
Level 1

Hey,

The workaround is active/passive. Cisco has fixed the bug, and the fix should be in the next release.

Regards

Rene
