Core Switch Migration

prekojo
Level 1

We are starting a core switch migration from a 4507 to a 4510.  The two switches are linked together by a VLAN trunk that is also a port channel.  We also have a port channel running from the 4510 up to our UCS server environment.  All fiber uplinks for our user access switches are connected to the 4507.  Currently, to access the servers, our users hit the 4507, cross the port channel, hit the 4510, and reach the servers.  This works without any issues.  Our goal is to migrate all the user switch fiber uplinks from the 4507 over to the 4510 and remove the 4507.

We started this process by unplugging the first fiber uplink from the 4507, which connects to a user switch (3750), and plugging it into the 4510.  Less than a minute after plugging the fiber into the 4510, we start getting sporadic activity across all our VLANs, and on some VLANs we lose all connectivity.  If we try to reverse the change by plugging the fiber uplink back into the 4507, the network activity remains sporadic and connectivity to some VLANs is still completely lost.  We have to bounce the port channel between the two switches to resume normal network activity.

Any ideas on why moving just one fiber uplink for a user access switch from the 4507 to the 4510 would cause this outage?  Please see the attached diagram for environment details.

Thanks


Jon Marshall
Hall of Fame

When you disconnect the switch, the 4507 should generate an STP TCN to tell the other switches to age out their MAC address tables, so this should not have created a problem. For example, before the move the 4510 thinks all users connected to the 3750 are reached via the port channel; after the move, it is the 4507 that needs to use the port channel to get to those MAC addresses.
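
To make the MAC table point concrete, here is a minimal Python sketch of a toy model, not real switch behavior. The port names (`Gi1/1`, `Gi2/1`, `Po1`) and the MAC address are hypothetical, and the topology change is modelled as an immediate flush; classic 802.1D STP instead shortens the aging timer, while Rapid STP flushes straight away.

```python
class Switch:
    """Toy L2 switch: a MAC table mapping address -> egress port."""

    def __init__(self, name):
        self.name = name
        self.mac_table = {}

    def learn(self, mac, port):
        # Record (or overwrite) the port a source MAC was last seen on.
        self.mac_table[mac] = port

    def flush(self):
        # Simplified effect of a topology change notification.
        self.mac_table.clear()

    def egress(self, mac):
        # Unknown destinations are flooded out all ports in the VLAN.
        return self.mac_table.get(mac, "flood")


USER_MAC = "aa:bb:cc:00:00:01"  # hypothetical host behind the 3750

sw4507 = Switch("4507")
sw4510 = Switch("4510")

# Before the move: the 3750 uplink is on the 4507.
sw4507.learn(USER_MAC, "Gi1/1")
sw4510.learn(USER_MAC, "Po1")
before = (sw4507.egress(USER_MAC), sw4510.egress(USER_MAC))

# The uplink moves to the 4510, but no entry has aged out yet:
# the 4510 still points at the port channel, which is now the wrong way.
stale = sw4510.egress(USER_MAC)

# After the topology change both tables are cleared, then relearned
# from the first frame the host sends via its new uplink.
sw4507.flush()
sw4510.flush()
flooded = sw4510.egress(USER_MAC)
sw4510.learn(USER_MAC, "Gi2/1")
sw4507.learn(USER_MAC, "Po1")
after = (sw4507.egress(USER_MAC), sw4510.egress(USER_MAC))

print(before, stale, flooded, after)
```

If the stale entry on the 4510 were never cleared, frames for that host would keep going out the port channel, which is the kind of symptom that bouncing the link would mask.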

So that bit should have worked.

Which vlans stopped working and where are the SVIs for those vlans ?

Jon

VLANs 10, 406, 50, and 58 each had sporadic activity.  Each of these VLANs has a VM assigned to it in our UCS virtual environment.  Running a continuous ping to the servers while we move the fiber uplink from the 4507 to the 4510, the ping starts dropping some packets once the cable is moved.  After moving the cable back, some packets continue to drop; traffic does not return to normal until we bounce the port channel.

The SVIs for the vlans are configured on the 4507.  The 4510 has the vlans created, but no SVIs.

Joe

Joe

I'm not familiar with UCS etc., so I may be of limited use unfortunately.

The only thing I can think of is that the MAC address tables are not timing out properly on the 4500s, and bouncing the port channel link sorted that out, but you obviously shouldn't need to do this.

That would only affect devices connected to the 3750 switch though. Is this where you were running the continuous ping from ?

Jon

Jon,

I am running the ping from a server not attached to the 3750.  The 4507 has the load balance of the port channel configured to src-mac address and the 4510 has the load balance configured as src-dst-ip.  Do you think that could cause this issue?

Joe

Joe

What is the destination IP of the ping ?

Different load balancing methods on each end of the port channel should not, as far as I know, stop traffic; they will just affect how the links are utilised.
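
As a rough illustration of why mismatched methods only change link utilisation, here is a Python sketch. The hash functions are simplified stand-ins and not the actual Catalyst 4500 algorithm; the two-link port channel and the source MAC are assumptions, while the IP addresses are the ones from the ping test.

```python
from collections import Counter
from ipaddress import ip_address

LINKS = 2  # assumed number of member links in the port channel


def link_by_src_mac(src_mac):
    # Simplified stand-in for a src-mac hash: MAC value modulo link count.
    return int(src_mac.replace(":", ""), 16) % LINKS


def link_by_src_dst_ip(src_ip, dst_ip):
    # Simplified stand-in for a src-dst-ip hash: XOR of both addresses.
    return (int(ip_address(src_ip)) ^ int(ip_address(dst_ip))) % LINKS


SRC_MAC = "00:11:22:33:44:01"  # hypothetical MAC of the pinging server
SRC_IP = "172.30.9.68"
DESTINATIONS = ["172.30.12.6", "172.30.9.56", "10.30.8.2", "192.168.33.51"]

# One end hashes on source MAC, the other on the source/destination IP pair.
by_mac = Counter(link_by_src_mac(SRC_MAC) for _ in DESTINATIONS)
by_ip = Counter(link_by_src_dst_ip(SRC_IP, dst) for dst in DESTINATIONS)

for label, usage in (("src-mac", by_mac), ("src-dst-ip", by_ip)):
    print(label, dict(sorted(usage.items())))
```

Either way every flow is assigned to some member link. With a single source MAC the src-mac end puts everything on one link, while the src-dst-ip end spreads the flows, so the methods differ in balance rather than in reachability.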

Jon

Jon,

Pinging from 172.30.9.68 to 172.30.12.6, 172.30.9.56, 10.30.8.2 and 192.168.33.51.  We lose the 192.168.33.x network completely, but other networks we drop some traffic.  There is no SVI for the 192.168.33.x network on either switch.  We just tag a few ports on the 4507 for this vlan and create a vlan interface on our firewall for this network.

Joe

Joe

I appreciate the diagram you posted was simplified, but are all the user switches singly connected to the 4507, i.e. there are no redundant paths?

Were users unable to get to the firewalls ? 

It sounds like something wrong with STP but it's not clear how that could happen looking at the topology you have posted.

Jon,

Yes.  All the users are connected to the 4507 using a single path.  We were able to get to the firewalls and internet.

Joe

So was it just server connectivity that was affected ?

Jon

Jon,

Server connectivity to both physical and virtual servers.  I don't know that we tried pinging users' desktops, so I am not sure if that was working or not.  I know that VoIP phones at the desktops could not register to the servers.

Joe

Joe

No problem. I appreciate that when everything stops working it's difficult to run every test; you just need to get it back up and running as quickly as possible.

Are all the servers connected to the same pair of switches ?

Jon

Jon,

I see you completely understand my situation. The virtual servers are connected to the 4510.  That is the only thing connected to the 4510 other than the port channel to the 4507.  The physical servers are connected to the 4507.

Joe

Jon Marshall
Hall of Fame

Joe

I'm not sure what is happening to be honest.

Moving one switch, singly connected should not do this to the entire network. I still suspect some issue with STP so it may be worth checking all switches and making sure they are all agreeing on which switch is STP root for the vlans.

I'm assuming the current 4500 is STP root for all vlans ?

Having to bounce the port channel suggests that the path between the two 4500 switches was being blocked or was not working as it should. If you didn't move any SVIs etc., then I can only think it is down to an L2 issue, not L3.

Trouble is, I can't see any obvious loops in your network, although as I said I'm not familiar with the VMware type of setup, so I can't say for sure.

As I say, I would look at all the spanning tree outputs per VLAN per switch to make sure it is all looking as it should. Other than that, when you try it again, schedule an outage, and if you see the same symptoms -

1) look again at spanning tree to see if it is blocking anywhere it shouldn't be, especially on the interconnect between the 4500s.

2) if you have to bounce the port channel try it while the 3750 is connected to the new 4500 and see if that resolves the issue.
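
The per-VLAN root check can be done by eye, but a small script helps when there are many switches. The following is only a sketch in Python: the bridge IDs are made up (they are not taken from the attached output), the switch names are just labels, and the data is assumed to be typed in by hand from each switch's spanning tree output.

```python
def root_disagreements(reports):
    """Return {vlan: {switch: root_id}} for VLANs where switches disagree.

    `reports` maps switch name -> {vlan: root bridge ID seen by that switch}.
    A VLAN missing on one switch is also reported (its root shows as None).
    """
    vlans = set()
    for per_vlan in reports.values():
        vlans.update(per_vlan)

    problems = {}
    for vlan in sorted(vlans):
        seen = {sw: per_vlan.get(vlan) for sw, per_vlan in reports.items()}
        if len(set(seen.values())) > 1:
            problems[vlan] = seen
    return problems


# Hypothetical bridge IDs, typed in by hand from each switch's output.
ROOT_A = "24576.0011.2233.4400"
ROOT_B = "32768.00aa.bbcc.dd00"

reports = {
    "4507": {10: ROOT_A, 50: ROOT_A, 58: ROOT_A, 406: ROOT_A},
    "4510": {10: ROOT_A, 50: ROOT_A, 58: ROOT_B, 406: ROOT_A},
    "3750": {10: ROOT_A, 50: ROOT_A, 58: ROOT_A},
}

for vlan, seen in root_disagreements(reports).items():
    print(f"VLAN {vlan}: switches disagree on root -> {seen}")
```

In this made-up data, VLAN 58 is flagged because one switch reports a different root, and VLAN 406 is flagged because it is missing on the access switch. An empty result would mean every switch agrees on the root for every VLAN.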

Sorry I can't be more help, but if you want to run anything else by me then just let me know and I'll try to help.

Jon

Jon,

Thanks for all the information.  I have another scheduled outage tomorrow, where I will be doing additional troubleshooting with TAC.  Attached are the spanning tree root details from both core switches.  To me it looks like the 4507 (coacoresw1) is the root bridge for all VLANs.

Joe
