Does anyone know the best way to configure NIC teaming switchports for HP servers?
Currently we have a problem where servers are connected to the switches with dual NICs and the NICs are teamed. The server transmits out of both interfaces but receives on only one interface. This is set up on the server side.
We have had complaints that if a server does a file transfer to another server in the SAME vlan it runs very slowly, yet if the destination is in a different vlan it is fine. What makes it even weirder is that if you break the teaming, i.e. leave only one NIC active, it works fine and file transfer rates are normal.
Does anyone know what could be causing this behaviour, or whether it is a networking issue at all?
How do you have the team configured? Load balancing? Active/standby?
Do you have an etherchannel configured? If so, what is your load-balancing config - src-dst MAC address, or src-dst IP address?
The team on the servers is configured in load-balancing mode. I've been told it transmits out of both NICs but receives on only one interface.
We have not configured etherchannel on the switch, as each NIC is connected to a different access switch. 1 port per switch.
There is your issue!
I suggest you configure an ether-channel to the server team NICs - then you will get the benefit of a 200 Mb/s or 2 Gb/s connection to that server.
Right now it will not work correctly for you in load-balancing mode.
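For reference, a minimal sketch of what that switch-side EtherChannel config might look like (Cisco IOS assumed; the interface range, channel-group number, and vlan here are placeholders, not taken from this thread):

```
! Hypothetical example - adjust interfaces, group number and vlan to suit
interface range GigabitEthernet0/1 - 2
 switchport mode access
 switchport access vlan 30
 channel-group 1 mode on
!
interface Port-channel1
 switchport mode access
 switchport access vlan 30
```

`mode on` forces a static channel; if the server's teaming mode supports 802.3ad/LACP, `channel-group 1 mode active` is generally the safer choice.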
Thanks for your answer.
But the reason why we have each NIC on a different switch is for redundancy, if one switch fails the other NIC is still live.
If we connect both NICs to one switch with etherchannel then we no longer have this resiliency.
Ideally the server team want to connect each NIC to a different switch...
OK, that's fine - but with that you CANNOT have the NICs in load-balancing mode, can you; you have to have active/failover.
The server will have a virtual IP address with a virtual MAC address. So how can the traffic pass over both switches using the same virtual IP address and the same virtual MAC address? It just does not work - the switches will keep getting conflicting information about which port the traffic should be sent to.
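If both switches do see the same source MAC on their own local ports, you would typically get MAC-flap notifications in the switch logs; on Cisco IOS the message looks something like this (the MAC, vlan, and port names here are purely illustrative):

```
%SW_MATM-4-MACFLAP_NOTIF: Host 0011.2233.4455 in vlan 30 is flapping between port Gi0/1 and port Po1
```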
Your comments make perfect sense Andrew, so thank you. Before I go back to the server team with these comments I just want to clarify a few things.
1, These servers have worked fine the way they are for years - how is that possible?
2, They only seem to see the slow data transfers when the source and destination IP/MAC are on the same VLAN, e.g. between the NAS and the antivirus server, both on vlan 30. When the source and destination are on DIFFERENT vlans, the transfer rate is fine, e.g. the NAS on vlan 20 and the AV server on vlan 30. How is this possible?
Thanks for your help, I really appreciate it.
OK - let me ask you this:-
1) If 2 interfaces on the same switch have the same MAC address and the same IP address - what is the switch going to do with a packet for that IP/MAC address?
If you have 2 switches - 1 is primary for layer 3 and the other is secondary for layer 3. If a packet comes into the layer 3 interface on the primary switch, routed for the server vlan, and the primary NIC in the teamed pair is connected to the primary switch, it has an ARP entry and a MAC entry - it will work a treat.
But if your primary layer 3 interface is on switch 2 while the primary server NIC is on switch 1 - switch 2 also has an ARP/MAC entry pointing at a local port!
At the end of the day, inter-vlan routing to that server will work OK. Traffic within the same vlan/broadcast domain will have issues.
Thanks Andrew, you really know your stuff!
If I summarise the best way forward from our discussion:
- A NIC team pair configured in load-balance mode should be on the same switch, with an etherchannel configured on the switch.
- If the NICs in the pair go off to different access layer switches (1 NIC to switchA and 1 NIC to switchB) then they should be configured in an active/passive manner on the server, and no etherchannel is required on the switch - a normal access port is fine.
Is this right??
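Just to confirm the switch side for the second case, I'm picturing a plain access port like this on each switch (port and vlan numbers are only examples):

```
interface GigabitEthernet0/1
 description server NIC (other NIC on switchB, same config)
 switchport mode access
 switchport access vlan 30
 spanning-tree portfast
```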
Correct and correct!
But what you could do is have server failover instead of NIC failover:-
server 1 - nic1 & nic2 into switchA in an etherchannel.
server 2 - nic1 & nic2 into switchB in an etherchannel.
Server replication between the two servers - then if one whole server fails, the secondary server takes over.
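In other words (hypothetical interface and group numbers; switchB would mirror this for server 2's NICs):

```
! switchA - server 1, nic1 & nic2 channelled together
interface range GigabitEthernet0/10 - 11
 switchport mode access
 switchport access vlan 30
 channel-group 10 mode on
```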
I spoke to the server team about your suggestions but they directed me to an HP installation document they got from HP while troubleshooting. The document says that if the server is configured in TLB (Transmit Load Balancing) mode, which theirs are, then it can most certainly be used across different access layer switches (1 NIC to each switch). HP said that if it does not work then it is a problem with the switch configuration. I can't understand what that could be..
I would be interested to see that document - as I found this one on the Cisco website:-
Funny - the config example actually uses HP Teaming!
That's very interesting. I'll see if I can get hold of the doc; I should be able to. In the meantime I found this, which tells us about TLB teaming:
and also this paragraph:
Transmit Load Balancing (TLB) - balances the transmit traffic among the team members, but does not require any special switch intelligence or switch configuration. In addition, TLB teams can be split across switches as long as all members are in the same layer 2 network. In TLB teams, receive traffic is not load balanced, but is received on a single team member. TLB is a standard feature of ProLiant Teaming.
Great - and exactly as I was saying, from the quote above: "In TLB teams, receive traffic is not load balanced, but is received on a single team member. TLB is a standard feature of ProLiant Teaming"
There must be an issue with the NICs. I have seen before on a Dell server.....that the 2 NICs were fighting to become the master of the team, and traffic was slow as the MAC address and IP address kept switching between the core switches.
I'm not sure it can be a problem with the NICs, as they seem to have the same issue on several servers across different sites?
OK - so now I think you need to do 2 things:-
1) Test and debug with the server guys.
2) Test and debug with the server guys.
The above is so important - I thought I would mention it twice.
The first test I would do is choose a server and disable one of the NICs - run a test. Then re-enable that NIC, disable the other one, and run the test again.
See if the problem exists with 1 NIC disabled. If not, the issue has to be with the team.
Sorry, I forgot to mention I have already done the test where we disabled the team, i.e. used only 1 NIC, and the servers worked fine that way. The problem only exists when the team is enabled and the traffic stays within the same vlan.
Just out of curiosity, what debug commands are good to use when troubleshooting this kind of problem?
Firstly I would track down the mac address:-
sh arp | include x.x.x.x
show mac address-table | include xxxx.xxxx.xxxx
Then check the speed/duplex on both switches, and check the ports for CRC/input/output errors.
Then I would read the HP teaming document.
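Putting those checks together as a worked sequence (the IP and MAC values are placeholders; note that older IOS images use `show mac-address-table` rather than `show mac address-table`):

```
show arp | include 10.1.30.5
show mac address-table address 0011.2233.4455
show interfaces GigabitEthernet0/1 status
show interfaces GigabitEthernet0/1 | include duplex|errors|CRC
```

Run the MAC lookup on both access switches - if each one shows the server's MAC on its own local port, or the entry keeps moving, that points straight back at the team.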
Hi, has there been any progress on this? I am seeing a similar problem and was wondering whether you could assist:
I have 2 x HP ProLiant servers connected across 2 x 3560 switches. These 2 x 3560 switches sit behind 2 F5s which load balance clients to the 2 servers based on number of connections. Spanning tree is blocking the path from one switch to the F5s, so all traffic goes through only 1 switch up to the F5. There is a port channel between the two switches.
Each server has 2 NICs and each NIC is connected to one of the switches. The server guy has set up HP NIC teaming in TLB (Transmit Load Balancing) mode to team the 2 NICs on each server as 1 virtual adapter. Now, it does clearly state that each NIC has a different transmit MAC address, but the virtual adapter obviously uses the primary NIC's IP and MAC. The 2 servers are within the same layer 2 vlan.
We have tried changing the team type but it has made no difference. There is no special configuration on the switch ports. The server guy also has Dell NIC teaming on 2 other servers sitting behind the F5s and that is working fine.
We tried disabling 1 NIC within the team but that did not make a difference. We have had to resort to not teaming at all and now everything is working fine. Your help would be appreciated as to what this could be.
Andrew, we have a similar setup but with two 3750 switches stacked. Our application is video surveillance, storage is iSCSI, no vlans - all traffic, including the IP cameras, is on this network. We are seeing intermittent server connection losses and intermittent slow storage response, and have not been able to see anything in the switch logs. We would appreciate your insights into this.