Feedback on Network Redesign

caplinktech
Level 1

We have been contracted to manage the network of a small colo provider that essentially resells space in a cage at a larger data center facility. I am having trouble understanding why certain things were done when the network was initially set up, and I have my own thoughts on how it could be set up better. However, I am looking for feedback from some experts.

First off, the network is essentially configured:

 ISP 1            ISP 2
   |                |
  3750 ---------- 3750
   |  \          /  |
   |    \      /    |
   |      \  /      |
   |      /  \      |
   |    /      \    |
   |  /          \  |
  2950            2950
   |                |
 VLANs            VLANs

The first thing I don't understand is that the 3750s each define 127 VLANs (at the moment) and run HSRP to establish redundancy. Given the stackability of the 3750s, wouldn't the same redundancy be provided by simply stacking the switches and keeping the uplinks from the 2950s where they are (one to each physical switch)?

Assuming this were changed, I am unsure how the routing would change. Currently, the company is assigned a /22 of public space, with two /24s assigned to each switch for load balancing and static routes to handle inter-network routing. The upstream routes are handled with a /30 subnet assigned to each switch by the upstream provider, connected to one port on each switch. As an example of the way it is configured:

SW1:

int vlan 101
 ip address x.x.232.2
 standby 101 ip x.x.232.1
 standby 101 preempt

int vlan 150
 ip address x.x.234.67
 standby 150 ip x.x.234.65

ip route x.x.234.0 x.x.232.3
ip route 0.0.0.0 y.y.y.9

SW2:

int vlan 101
 ip address x.x.232.3
 standby 101 ip x.x.232.1

int vlan 150
 ip address x.x.234.66
 standby 150 ip x.x.234.65
 standby 150 preempt

ip route x.x.233.0 x.x.232.2
ip route 0.0.0.0 y.y.y.13

I am currently unsure how the DC is routing the public space down to their core. I don't think they are simply routing each /24 to the /30 IP of the corresponding switch, as I do not think that would provide any redundancy. My question here is: what would be the best way to handle routing if the 3750s are stacked, while keeping redundancy in place? I could simply use the same /30s with equal-metric routes on the "virtual" switch, but is that the best way?

Thanks for any assistance.

1 Accepted Solution

Hi,

The operational errors are very basic ones, but they happen because this technology or setup is new to most engineers:

- Replacing faulty switch

- Naming convention of the interfaces

- IOS upgrades/updates

Proper documentation, training, supervision, and change process will be able to address this.

I haven't experienced these operational errors myself yet, but I see a lot of posts in which other people have encountered them even though they followed the procedure step by step.

Regards,

Dandy

11 Replies

shivlu jain
Level 5

Actually, the 3750 supports 127 PVST instances, so for this you should run MSTP. If you want to utilize all the VLANs, meaning from 1 to 1000, then you can go with GLBP.

regards

shivlu

Joseph W. Doherty
Hall of Fame

Yes, the 3750s could be stacked, and there are advantages to doing so, but perhaps one reason they are separate is that you can do maintenance on each without disturbing the other.

The static routes between the 3750s are likely there if there's no L2 trunk between them.

If you stack the 3750s, you would no longer need the static routes beyond the two default routes. Also, you might be able to run the links between the 3750s and the 2950s as cross-stack EtherChannels, with one member link on each stack member. (If you don't, you'll need spanning tree.)
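
A rough sketch of what the stacked layout might look like, reusing the placeholder addresses from the original post. The interface numbers, the port-channel number, and the .10/.14 addresses on the /30 uplinks are assumptions for illustration and are not taken from the real configuration.

```
! Sketch only: interface names, channel number and uplink addresses are assumed
ip routing
!
! One routed uplink per stack member, each on its own /30 to the DC core
interface GigabitEthernet1/0/1
 no switchport
 ip address y.y.y.10 255.255.255.252
!
interface GigabitEthernet2/0/1
 no switchport
 ip address y.y.y.14 255.255.255.252
!
! Two equal-cost default routes replace the inter-switch statics
ip route 0.0.0.0 0.0.0.0 y.y.y.9
ip route 0.0.0.0 0.0.0.0 y.y.y.13
!
! Cross-stack EtherChannel to one 2950: one member link on each stack switch
interface range GigabitEthernet1/0/2 , GigabitEthernet2/0/2
 switchport trunk encapsulation dot1q
 switchport mode trunk
 channel-group 1 mode on
```

The 2950 end would need a matching `channel-group 1 mode on` on its two uplinks. A static default route is only withdrawn when its uplink actually goes down, so a failure further upstream would not be detected by this sketch, and how the DC routes return traffic toward the two /30s still has to be confirmed with them.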

Danilo Dy
VIP Alumni

Hi,

The 3750s can be stacked, but there are advantages and disadvantages.

Let's look at the disadvantages. Note that in your diagram, each switch is connected to a different ISP (ISP1 and ISP2):

- You can't perform maintenance per switch. Any maintenance means total downtime, and the internet connection will be lost.

- All stack members use the MAC address assigned by the master switch. If the master switch fails, the remaining switch elects itself master and assigns new MAC addresses. Both upstream and downstream systems connected to the stack will be affected by this change.

- I've seen many operational errors made by people managing stacks.

I don't think the 3750s are receiving full BGP routes from the two ISPs. They are advertising the /22 to both upstream ISPs and may be receiving only partial routes from them, or just the default route.

I'm more worried about the throughput performance of the 3750. Consider that it may be running the following:

- BGP

- VTP/STP

I wouldn't use a 3750 as both core and distribution for a DC.

Regards,

Dandy

Good point about possible MAC change, but negated if you continue to use HSRP on the stack. Yes?
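
As a sketch of how that could look on a stack: the SVI keeps a single-member HSRP group, so hosts continue to resolve the gateway to the HSRP virtual MAC rather than to the stack MAC. The addresses reuse the placeholders from the original post and the mask is an assumption. Later 12.2 SE releases for the 3750 also offer a persistent stack MAC timer, shown here as an optional extra; whether the release in use supports it would need checking.

```
! Sketch only: mask assumed, addresses are the post's placeholders
interface Vlan101
 ip address x.x.232.2 255.255.255.0
 ! HSRP v1 group 101 answers ARP for the gateway with virtual MAC 0000.0c07.ac65
 standby 101 ip x.x.232.1
!
! Optional: keep the current stack MAC even if a new master is elected
stack-mac persistent timer 0
```

This only protects devices that talk to an HSRP virtual address. Anything that has resolved a real interface address on the stack would still see a new MAC after a master change unless the persistent timer is in effect.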

Also a good point about operational errors. However, I don't believe the stack itself is really more prone to errors; rather, when operational errors are made, you risk dropping the "one" critical device instead of one of two. (Of course, some operational errors can drop more than one device.)

Hi,

Can you elaborate on what you are referring to when you say operational errors?

Also, the ISPs are technically not 2 different ISPs, as mentioned the company basically rents a cage in a major DC and resells space in their cage to other companies. As such, ISP 1 and 2 are really simply redundant uplinks to the DC's core routers.

Regarding spanning tree, the switches are currently set up for PVST, which I was definitely planning on changing to MST. There is basically not much of a choice, since the 3750s max out at 128 spanning-tree instances.
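
A minimal sketch of such an MST migration with two instances. The region name, revision number, and VLAN-to-instance ranges are assumptions chosen for illustration.

```
! Sketch only: region name, revision and VLAN ranges are assumed
spanning-tree mode mst
!
spanning-tree mst configuration
 name COLO
 revision 1
 instance 1 vlan 1-500
 instance 2 vlan 501-1000
!
! On the 3750 intended as root, claim root for the IST and both instances
spanning-tree mst 0-2 root primary
```

The region name, revision, and VLAN mapping have to be identical on every switch in the region, including the 2950s; otherwise each switch ends up in its own region.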

"Can you elaborate on what you are referring to when you say operational errors?"

The "fat thumb" variety. E.g. adding an ACL that's incorrect.

Medan,

What type of maintenance other than potential IOS upgrades would actually require downtime of the switch stack?

As for the MAC address change, I'm assuming its effect would be the time needed for any ARP adjustments and spanning-tree convergence on the network. I am not 100% certain, but I do not think this would be significantly slower (if at all) than an HSRP failover.

As for throughput performance, I definitely do not see a problem. The switches are currently not running BGP. The CPUs on each switch are averaging only around 11%, with spikes to 20% for periods of less than 30 seconds, and this is with 127 spanning-tree instances and 127 HSRP heartbeats being monitored. Cutting the spanning-tree instances to 2 with MSTP and eliminating HSRP should greatly cut utilization.

The actual switching performed by the 3750s is light, at 40 Mbps of external traffic passing between the DC core routers and the switches. Since each switch only handles traffic sent upstream from the 2950 switches, there isn't much traffic running through the switches beyond what is leaving the network.

Personally, I think the 3750s are way overkill for what is in use here, but they were already in place and I am just trying to efficiently manage their use.

Hi,

Besides IOS updates/upgrades, another maintenance task is replacing a faulty switch. Also, during IOS updates/upgrades, the reboot of all member switches in the stack takes longer than an HSRP failover between non-stacked switches. Note too that an IOS update/upgrade can leave the device unable to reboot successfully; if this happens to you, the downtime will be longer. If you really want to use a stack, I suggest that for any IOS update/upgrade you borrow an identical model from your vendor and perform the update/upgrade on it before doing so in production, to minimize errors (document the steps, as these will be the steps for your setup).
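
As a small illustration of the kind of steps worth documenting, the sketch below pins the master election and lists two checks to run before and after a swap or upgrade. The member numbers and priority values are assumptions.

```
! Sketch only: member numbers and priority values are assumed
! Global config: make member 1 the preferred master so elections are predictable
switch 1 priority 15
switch 2 priority 14
!
! Exec mode: confirm members, roles and software versions before and after the change
show switch
show version
```

A replacement switch generally has to run a compatible software version before it can join the stack as a working member, which is one reason to rehearse the procedure on spare hardware first.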

For the MAC address change, note the bugs on many older systems that fail to recognize a MAC address change (it takes 4 hours if left unattended), e.g. Windows NT 4, Windows 2K (earlier SPs), Nokia/Check Point FW (2K and older), Cisco PIX (I think 6.x and older), and Unix variants (I can't remember the versions, but both Sun Solaris and HP-UX are affected).

I believe throughput will degrade as more services are turned on in the device. From what you have mentioned about performance (BGP is turned off), I think you are safe. Just watch CPU and memory utilization, packet drops, and buffer misses closely. These are the indicators of a performance problem.
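
A few standard exec-mode commands that could serve for this kind of watch, offered as a starting point rather than a complete monitoring plan:

```
! Exec-mode checks; run periodically and compare against a baseline
show processes cpu sorted
show processes cpu history
show memory statistics
show buffers
show interfaces counters errors
```

The CPU history view helps separate short spikes from sustained load, and the misses and failures counters in `show buffers` are the ones to compare over time.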

Regards,

Dandy

Hi Dandy,

I appreciate your input and sticking with me on this issue.

I agree the stack reboot would take longer than an HSRP failover; however, I am hoping those are rare, if they are ever needed at all.

Hi,

Yes, they are rare, but it is better to be prepared for them :). I am also preparing for this: documenting every possible problem scenario and how to address it, and giving my team dry runs in a lab environment to test the scenarios.

Thanks for the rating.

Regards,

Dandy
