Cat 6500 w sup720 - SPANing all switch traffic - residual effects

Answered Question
Apr 8th, 2010

I have a request to send every data packet traversing heavily used distribution routers (DRs) out a SPAN session over a GbE port as a permanent configuration.

A typical DR is the default gateway for about 2000 hosts connecting across 40 vlans and 30 GbE dot1q trunks as downlinks to the L2 access switches.  Hardware is sup720 pfc3bxl, with multiple 16 port GbE / GBIC classic linecards, running 12.2(18)SXF (upgrading soon to SXI).

Other than the obvious oversubscription and resulting dropped traffic on the SPAN traffic, has anyone experienced any side effects by sending so much traffic / vlans out a SPAN session?   I'm thinking CPU / memory / other switch resources / etc.   Also worried about traffic being punted up to the RP, such as broadcasts, non-ip, etc.  From what I can gather, it doesn't seem that SPAN sessions on the c6500 architecture duplicate traffic.

I have not found any concrete restrictions or warnings of using the SPAN feature in this manner on the c6500 platform.  Any thoughts or experiences are appreciated.

Regards,

A Paradela

lamav Thu, 04/08/2010 - 12:42

Hi, I think you are correct to consider the switch's resources when contemplating this setup.

I do wonder, however, what the efficacy is in taking such an approach. Why would you want to pound that port with so much information? I'm not sure I see the value in doing that. For one, you will have to consider the ability of the monitoring device to even be able to handle such a massive traffic dump from both a hardware and software perspective.

And then there is the human effort required to parse all that data and be able to draw intelligent conclusions about specific traffic flows, many of which may present themselves with missing data as a result of dropped packets.

I think you should take a more feasible approach than this.

Victor

paradela Thu, 04/08/2010 - 12:56

Victor,

I agree with you on many points, but I am being asked to set this up. My demarc is at the switch, and I have no involvement with or input to the design of the device at the other end of the SPAN session. I'm trying to find any undesirable side effects on the performance or health of the switch before I configure the request. I would rather avoid any surprises. My gut reaction is that I don't like it and that the SPAN feature is not meant for this purpose, but that's not enough to turn down the request at layer 8 and 9.

Regards.

lamav Thu, 04/08/2010 - 13:01

I understand the political component, especially when you have non-technical meatballs trying to put on a dog-and-pony for their bosses, who want to play "CYA."

That having been said, I don't specifically know what the ramifications would be of doing what you want, and I am not sure anyone can answer that with absolute certainty. It is something you would have to test in a lab environment.

Instead of using the SPAN feature on the switch itself, they could consider investing in a probe appliance and installing it in series with one of the distribution layer's uplinks to the core, and perhaps on the L2 trunks between the access and distribution layers.

That's more along the lines of an acceptable solution when the requirement is permanent.

Victor

paradela Fri, 04/09/2010 - 05:57

Victor,

The request calls for viewing all data traffic on the edge VLANs, so it has to be on the DR itself, since it (the DR) will drop some traffic (broadcasts, invalid packets, etc.). Also, the uplinks to the core are running MPLS (LDP), so at that point the traffic is no longer tagged by dot1q, which makes it difficult to trace back to where it comes from.

I did think of putting fiber taps on each downlink to the AS and using a different switch to aggregate the captured traffic, but with over 300 AS with dual uplinks each, the cost and complexity would not be feasible.

I appreciate your input

Giuseppe Larosa Fri, 04/09/2010 - 00:07

Hello A. Paradela,

>> Other than the obvious oversubscription and resulting dropped traffic on  the SPAN traffic, has anyone experienced any side effects by sending so  much traffic / vlans out a SPAN session?   I'm thinking CPU / memory /  other switch resources / etc.

We have seen that when we tried to push 3 Gbps of traffic over a single GE SPAN destination port, the sup720 CPU hit 100%.

The device became almost unresponsive and was close to unmanageable.

We had to remove the SPAN session.

So your concerns are legitimate: if the aggregate traffic volume is high, this configuration is not sustainable over long periods.

Our system is a sup720 3BXL with this IOS image:

sh ver | inc image
System image file is "disk0:s72033-adventerprisek9_wan-mz.122-18.SXF14.bin"

We have a CSM and an FWSM in it, and we had an IDS on the destination port.

Hope to help

Giuseppe
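
To put the 3 Gbps over one GE port figure in perspective, here is a rough back-of-the-envelope sketch in Python. It only computes the minimum share of mirrored traffic that cannot fit on the destination port. It ignores framing overhead, burstiness, and buffering, and it says nothing about the CPU effect described above. The function name `span_drop_fraction` is made up for illustration.

```python
def span_drop_fraction(offered_gbps: float, dest_port_gbps: float = 1.0) -> float:
    """Minimum fraction of mirrored traffic that cannot fit on the destination port.

    Assumption: sustained average rates only; real drops may be higher
    because of bursts and limited egress buffering.
    """
    if offered_gbps <= 0 or dest_port_gbps <= 0:
        raise ValueError("rates must be positive")
    # Anything above the destination port's line rate has to be dropped.
    excess = max(0.0, offered_gbps - dest_port_gbps)
    return excess / offered_gbps


# 3 Gbps of mirrored traffic offered to a single 1 Gbps destination port
print(f"{span_drop_fraction(3.0):.0%}")  # prints 67%
```

So in that scenario at least two thirds of the mirrored packets never reach the IDS, before any other side effect is considered.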

paradela Fri, 04/09/2010 - 06:04

Hi Giuseppe,

Wow, I did not think it would get that bad.  A few questions.

Was it the RP's CPU that spiked? (sh proc cpu)

Did you ever escalate to TAC or look into it further to find the root cause?

Was the SPAN source a physical port or a VLAN, and how many of them?

Was the CPU load relative to the traffic load?

CCO docs actually say there's no performance hit with SPAN on the 6500, but also say that SPAN traffic competes with user traffic for all switch resources.

Thank you in advance.

Correct Answer
Giuseppe Larosa Mon, 04/12/2010 - 07:36

Hello A. Paradela,

I can confirm the issue was serious. We had this high traffic (3 Gbps) for hours, and the problem was in pushing the mirrored traffic over a single GE port where a network IDS is connected.

>>Was it the RP's CPU that spiked? (sh proc cpu)

Yes, the system became almost unreachable and user traffic was slowed down.

>>Did you ever  escalate to TAC or look into it further to find the root cause?

We had opened several TAC cases for this web farm regarding different aspects, and one of them was related to this.

>> CCO docs actually say there's no performance hit with SPAN on the 6500

I agree with this, but in our scenario we saw an issue, and I've reported it to you because you may run into the same problem.

Hope to help

Giuseppe
