I had a customer issue and I'm hoping for some insight.
They have a NIDS solution hanging off of a Cisco switch. They are utilizing the switch to the extent that it is not possible to set up any more VLAN sessions, if I understand their description. The NIDS is unable to handle the traffic it's getting, even when throughput is far below its rating, and is failing to alert on hostile traffic.
We captured packets and there are a lot of duplicates flying around (bit for bit copies, not just TCP retransmits).
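One way to put a number on the duplicates is to hash every captured frame and count repeats. The following is only a minimal sketch in Python, not something anyone in this thread ran: `duplicate_stats` and `copies_histogram` are hypothetical helper names, and the frames are assumed to already be available as raw byte strings exported from the capture. The histogram of copies per frame helps tell whether each frame shows up exactly twice or many times.

```python
import hashlib
from collections import Counter


def _frame_counts(frames):
    """Count occurrences of each distinct frame, keyed by SHA-256 digest."""
    return Counter(hashlib.sha256(frame).digest() for frame in frames)


def duplicate_stats(frames):
    """Return (total, duplicates, ratio) for a list of raw frame bytes.

    A frame counts as a duplicate only when an identical byte string was
    already seen, so retransmits that differ in any header field are not
    counted.
    """
    counts = _frame_counts(frames)
    total = sum(counts.values())
    duplicates = total - len(counts)
    ratio = duplicates / total if total else 0.0
    return total, duplicates, ratio


def copies_histogram(frames):
    """Map 'number of copies' to 'how many distinct frames had that many'."""
    return dict(Counter(_frame_counts(frames).values()))


if __name__ == "__main__":
    # Made-up sample data standing in for frames read from a capture.
    sample = [b"frame-A", b"frame-A", b"frame-B", b"frame-A", b"frame-C"]
    print(duplicate_stats(sample))
    print(copies_histogram(sample))
```

Hashing whole frames keeps the comparison bit for bit, which matches what was observed here, but it will not match copies whose headers were rewritten along the way.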
The Cisco onsites swear alternately that
1) they know of nothing that could cause such a condition, and
2) the observed ~60% dupes are "normal" for any enterprise.
We spoke with the vendor and they said their NIDS solution would in fact choke on this kind of traffic, since it causes overutilization, effectively driving the throughput down from 2 Gb/s to some absurd level like 100 Mb/s.
I'm not a Cisco expert but I was hoping for some possible ideas, or ways to troubleshoot this further.
It's kind of a nonissue since the customer is going to go with an inline IPS at some point in the future, but I kept wishing we could find the root cause and eliminate the problem once and for all. Any ideas?
What kind of traffic are we talking about (multicast, broadcast, unicast)?
Since you describe it as an issue, I assume you are talking about unicast traffic. I also assume from your description that you see the same packet more than twice. Am I right? (If you just see the same packet twice and your SPAN session is configured for both TX and RX, then that is normal: all L2-switched packets will be copied on reception and on transmission on each VLAN.)
There can be a lot of causes for such an issue. I remember two:
- Overloaded MAC address table (not very likely, I suppose, but I don't know your network).
- Unstable spanning tree: this happens with RSTP (I don't know what type of STP is running on your network). When a BPDU with the TC bit set is sent on a particular VLAN, all the switches clear the dynamic entries in their MAC address tables in order to react to the spanning-tree rearrangement. When a new packet then arrives on a VLAN at a switch whose MAC address table is empty, it is copied to all interfaces assigned to that VLAN, and you will see these copies on your SPAN session. I saw this behavior once, due to a design error; it really increased the amount of traffic being switched, and the traces were exhausting to read.
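The flooding effect described in the second bullet can be illustrated with a toy learning switch. This is a deliberately simplified sketch in Python, not a model of any Cisco platform: the `LearningSwitch` class and its `topology_change` method are hypothetical names, a single VLAN is assumed, and the table is flushed on all ports at once, which is coarser than what RSTP actually does.

```python
class LearningSwitch:
    """Toy single-VLAN learning switch (illustrative assumption only)."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port it was learned on

    def forward(self, src_mac, dst_mac, in_port):
        """Learn the source, then return the list of egress ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown unicast: flood to every port except the ingress port.
            return [p for p in self.ports if p != in_port]
        # Known destination: one port, or nothing if it sits on the ingress.
        return [] if out_port == in_port else [out_port]

    def topology_change(self):
        """Simplified reaction to a TC: drop all dynamic entries."""
        self.mac_table.clear()


if __name__ == "__main__":
    sw = LearningSwitch([1, 2, 3, 4])
    print(sw.forward("A", "B", 1))  # B unknown: flooded to 2, 3, 4
    print(sw.forward("B", "A", 2))  # A already learned: port 1 only
    print(sw.forward("A", "B", 1))  # B now learned: port 2 only
    sw.topology_change()
    print(sw.forward("A", "B", 1))  # table flushed: flooded again
```

If topology changes keep arriving, the table never stays populated for long, so a large share of unicast traffic is flooded and ends up on every port in the VLAN, including the ones being monitored.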
I am not quite sure I understand your description (of their description...)
'They are utilizing the switch to the extent that it is not possible to set up any more VLAN sessions, if I understand their description.'
Duplicate packets would indicate an undetected loop in the network. Is this what you are saying, or do you mean you are seeing the same traffic on multiple ports? If you are seeing duplicates in the same packet trace, then I would think STP loop; if you are seeing the same traffic on multiple ports, I would think flooding is more likely.
Well, I'm not a Cisco expert, so I don't know quite how to describe what they're telling me. We tried isolating the sensor to just one VLAN (rather than have it monitor several VLANs) and the problem went away. So I assumed it had something to do with the VLAN spanning setup, but the onsites claim that it is now impossible to reconfigure or add more spans.
How can I go about troubleshooting an STP loop? What data do I need to pull, and what should I look for? ...If this requires some remediation, I don't mind doing my homework. Learn something new every day, right?
I know exactly what you're seeing. I do a lot of traces and the first few times I saw this I thought I had a bad loop somewhere. I think this is what is going on (although I never took the time to verify or research on cco):
When you span, you are usually spanning in and out traffic so you can get both directions. When the switch receives incoming traffic on vlan 1 for another device on vlan 1, it must replicate that packet and send it back out to the destination device. So you are seeing incoming and outgoing traffic for every layer 2 packet.
It's easy enough to explain the layer 2 traffic but when it's a layer 3 switch then you'll see something else happen, depending on your configuration. The switch may replace the source mac address with its own and then send it back out on the same vlan to the destination.
If you spend time trying to figure out how to work around this and get a "clean" output without these dups please let me know - it's not something I'm going to spend time researching but I could certainly use the info to cut down on the size of my captures!
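For cutting captures down after the fact, one possible approach is to drop any frame that is byte-identical to one seen shortly before. The sketch below is an assumption-laden illustration in Python rather than a tested tool: `drop_duplicates` is a hypothetical name, frames are assumed to be raw byte strings in capture order, and the default window of 5 is an arbitrary choice.

```python
import hashlib
from collections import deque


def drop_duplicates(frames, window=5):
    """Yield frames, skipping any identical to one of the last `window` kept.

    Only byte-identical copies are removed. A copy whose source MAC was
    rewritten by a layer 3 hop is a different byte string and is kept.
    """
    recent = deque(maxlen=window)  # digests of the most recently kept frames
    for frame in frames:
        digest = hashlib.sha256(frame).digest()
        if digest in recent:
            continue  # same bytes seen within the window: treat as a dup
        recent.append(digest)
        yield frame


if __name__ == "__main__":
    # Made-up sample data standing in for frames from a SPAN capture.
    capture = [b"frame-A", b"frame-A", b"frame-B", b"frame-A"]
    print(list(drop_duplicates(capture)))
```

A small window keeps the risk of discarding legitimately repeated frames low, at the cost of missing copies that arrive far apart.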
If you have a monitor session set up and the source is a VLAN, you will get one copy sent to the SPAN port when the packet enters the VLAN (rx) and another copy of the same packet when it exits an access port (tx). To get around this, you can set the SPAN session to monitor rx traffic only, so that you only get a copy of the packet as it enters the VLAN or enters an access port, but not both. This was an issue for me when setting up a monitor session for an IVR system, and monitoring rx traffic only eliminated the duplicates.
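The rx/tx behavior above boils down to simple counting. Here is a toy model in Python, with the usual caveat that it is a simplification and not a description of how any particular switch implements SPAN: `span_copies` and its parameters are hypothetical names, and a frame is assumed to produce at most one rx copy and one tx copy.

```python
def span_copies(ingress_monitored, egress_monitored, direction):
    """Count SPAN copies of one frame under a simplified model.

    ingress_monitored: the frame enters through a monitored VLAN or port.
    egress_monitored:  the frame leaves through a monitored VLAN or port.
    direction:         "rx", "tx", or "both" (the session's setting).
    """
    if direction not in ("rx", "tx", "both"):
        raise ValueError("direction must be 'rx', 'tx', or 'both'")
    copies = 0
    if ingress_monitored and direction in ("rx", "both"):
        copies += 1  # copy taken as the frame is received
    if egress_monitored and direction in ("tx", "both"):
        copies += 1  # copy taken as the frame is transmitted
    return copies


if __name__ == "__main__":
    # Frame switched within one monitored VLAN.
    print(span_copies(True, True, "both"))  # 2 copies reach the sensor
    print(span_copies(True, True, "rx"))    # 1 copy reaches the sensor
```

Under this model, a session monitoring both directions doubles every frame that stays inside the monitored VLAN, and switching to rx only halves what the sensor has to process.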