08-25-2011 06:10 AM - edited 03-11-2019 02:17 PM
Hello-
We have an ASA 5510 that has been in production for some time now and all has been well. Traffic on it has been increasing over time, but nothing outrageous. Two days ago we began taking MAJOR input errors (every single one is an overrun) on our inside interface. The errors come in LARGE lumps - 100k, 200k, 300k at a time. I have attached a summary of timestamps and input error counts to demonstrate what I am talking about.
"sh blocks" looks very good:
 SIZE    MAX    LOW    CNT
    0    400    399    400
    4    200    199    199
   80    725    702    725
  256   2412   2374   2411
 1550   2932   2635   2673
 2048    600    567    600
 2560    900    899    900
 4096    100    100    100
 8192    100    100    100
16384    102    102    102
65536     16     16     16
"sh traffic" looks fine as well:
inside:
received (in 683.730 secs):
87210 packets 33517539 bytes
127 pkts/sec 49021 bytes/sec
transmitted (in 683.730 secs):
1979502 packets 243386175 bytes
2895 pkts/sec 355968 bytes/sec
1 minute input rate 138 pkts/sec, 101261 bytes/sec
1 minute output rate 2449 pkts/sec, 556063 bytes/sec
1 minute drop rate, 0 pkts/sec
5 minute input rate 127 pkts/sec, 64917 bytes/sec
5 minute output rate 1874 pkts/sec, 335035 bytes/sec
5 minute drop rate, 0 pkts/sec
"sh cpu" eliminates CPU hog as a potential issue:
CPU utilization for 5 seconds = 4%; 1 minute: 6%; 5 minutes: 6%
How can an interface that is moving only about 3,000 pkts/s suddenly take 100,000+ input errors in a one-second period?
So far we have:
- replaced the cable (three times)
- moved switch ports
- moved connection to another physical switch
- upgraded to ASA 8.2.5
Any thoughts?
Rick
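One way to see how a modest average rate and huge error lumps can coexist is to turn the cumulative input error counter into per-interval deltas. The sketch below is only an illustration: the function name `error_bursts` and the sample readings are made up and are not taken from the attachment. It shows that a lump of errors which may have arrived within a single second looks far smaller once it is averaged over the polling interval.

```python
from datetime import datetime

def error_bursts(samples):
    """Return (start, end, delta, per_second) for each pair of consecutive samples.

    samples: list of (HH:MM:SS string, cumulative input error count), in time order.
    """
    fmt = "%H:%M:%S"
    rows = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        seconds = (datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)).total_seconds()
        delta = c1 - c0
        rate = delta / seconds if seconds > 0 else float("nan")
        rows.append((t0, t1, delta, rate))
    return rows

# Hypothetical readings of the input error counter, polled once a minute.
samples = [("09:00:00", 0), ("09:01:00", 0), ("09:02:00", 120000), ("09:03:00", 120000)]

for start, end, delta, rate in error_bursts(samples):
    print(f"{start} -> {end}: +{delta} errors ({rate:.0f}/s averaged over the interval)")
```

With one-minute polling, the interval that contains the lump averages out to 2,000 errors per second even if every one of them happened in the same second, which is why short bursts are easy to miss in rate counters.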
08-25-2011 06:18 AM
Hi Rick,
Can you provide the output of:
clear interface
and 3 outputs of show interface
clear asp drop
and
then show asp drop (3 outputs)
What effects have these errors had on your network?
Thanks,
Varun
08-25-2011 06:51 AM
Ok, so I ran the clear int and then did show int three times at one minute intervals. I followed that with a clear asp drop followed by show asp drop at three minute intervals. I finished with two more show int.
The impact this issue is having on our network is horrible. Internet connectivity is horrific, and remote desktop usage is all but impossible.
I am noticing the 20k pps output rate from time to time. This is somewhat concerning because we do not have anything that should be generating that level of traffic. (Still, the ASA is supposedly rated for 190k pps, so this shouldn't be an issue). Further, the outside interface does not report anything near that level of traffic, which is even more puzzling. That interface (inside) does have several sub-interfaces each on their own VLAN.
(will post output in next message, getting error about "message cannot be displayed due to its content")
08-25-2011 06:55 AM
08-25-2011 08:30 AM
Hi Rick,
These numbers are very high. I would recommend checking the number of connections being built on the firewall; overruns only occur when traffic hits the firewall far faster than the ASA can process the packets.
What do show conn and show conn count tell you? Are those numbers also very high?
What we need to identify is which traffic is being dropped by the firewall. There is also a new feature introduced in version 8.2.5, flow control, which is disabled by default. It was introduced to improve performance under high traffic. Flow control is the process of managing the pacing of data transmission between two nodes to prevent a fast sender from outrunning a slow receiver. We can try enabling it on the ASA; the configuration guide is here:
http://www.cisco.com/en/US/docs/security/asa/asa82/configuration/guide/intrface.html
I am not sure whether this would alleviate the issue completely, but overruns are only encountered when the ASA is overwhelmed by incoming traffic: it fails to process those packets and reports overruns on the interface.
Thanks,
Varun
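The overrun mechanism described above can be sketched with a toy model: a fixed-size receive ring that is filled by arriving packets and drained at a fixed rate. Everything here is an assumption made for illustration, including the function name `simulate_rx_ring`, the ring size and the drain rate; none of the numbers are ASA 5510 specifications.

```python
def simulate_rx_ring(arrivals_per_tick, ring_size, drained_per_tick):
    """Count packets dropped because the receive ring had no free slot.

    arrivals_per_tick: packets arriving in each time slice.
    ring_size: slots in the (hypothetical) receive ring.
    drained_per_tick: packets the forwarding path removes per time slice.
    """
    queued = 0
    overruns = 0
    for arriving in arrivals_per_tick:
        free = ring_size - queued
        accepted = min(arriving, free)
        overruns += arriving - accepted          # no free slot: counted as an overrun
        queued += accepted
        queued -= min(queued, drained_per_tick)  # forwarding path empties part of the ring
    return overruns

# Illustrative numbers only: 1 ms ticks, so 3 packets per tick is 3000 pps.
steady = [3] * 1000
burst = [3] * 500 + [2000] + [3] * 499           # same baseline plus one 2000-packet burst

print(simulate_rx_ring(steady, ring_size=256, drained_per_tick=100))  # 0
print(simulate_rx_ring(burst, ring_size=256, drained_per_tick=100))   # 1744
```

In this model the steady load never produces an overrun, while a single burst overflows the ring even though the drain rate is far above the average arrival rate. That matches the pattern of low average pps with large lumps of overruns.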
08-25-2011 08:38 AM
Varun-
I will take a look at the link. Also, check out the attached image! It seems that things chug along nicely and then, suddenly, there is a HUGE traffic spike on the "inside" interface. When I say huge, I mean we go from an average of less than 250pps to over 11k pps and then immediately back down to normal. At the same time, this is when the input/overrun errors are logged. I have the graphs side by side.
So now the question becomes... what can possibly be generating this traffic? It is "outbound" from the inside interface. There is NO corresponding traffic on the outside interface in either direction. If the traffic didn't come in to the outside interface and it didn't come in to the inside interface, then how the heck is it being sent OUT of the inside interface? I.e., where is it coming from?
Perhaps I need to set up some packet caps and see if I can figure it out. Any other ideas? Is it possible that this is a failing NIC on the ASA?
Rick
08-25-2011 08:42 AM
Just a quick follow up. As we continue to watch, we have seen two spikes above 25k pps, and one as high as 60k pps. Again, no corresponding traffic on the outside interface.
This is all OUTBOUND from the inside interface.
It is almost as if the traffic is originating from the ASA itself!
Rick
08-25-2011 08:47 AM
Hi Rick,
We definitely need to verify what this traffic is: whether it is normal network traffic or malicious broadcast packets from a rogue machine. Captures would be the right option, along with the logs on the ASA, so I think we should go that way. Check whether the packets are broadcasts.
Thanks,
Varun
08-25-2011 09:30 AM
Hi Rick,
As Varun said, you will need to use logs and captures to see what is generating that amount of traffic on the inside of your network. You can also enable "ip verify reverse-path" on the ASA so that it drops traffic generated from the inside network with bogus source IPs.
Also, verify the speed/duplex settings on the ASA's inside interface and on the device connected to it. I can see in your output that speed/duplex is hard-coded on the ASA.
Manish
08-25-2011 10:26 AM
Hi
May I suggest that you set up a sniffer and mirror the switch port that connects to the ASA and, if you have the hardware, other ports in the network as well. Wireshark is a free, well-established sniffer.
I am just concerned that you might be getting a network loop somewhere from time to time, and that this is causing the traffic to spike like that.
Good luck
HTH
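Once a capture has been saved, a quick way to check for a broadcast source is to count broadcast frames per source MAC address. The following is a minimal sketch using only the Python standard library. It assumes a classic pcap file (not pcapng) with Ethernet link type and microsecond timestamps, and the function name `broadcast_senders` is made up for this example.

```python
import struct
from collections import Counter

BROADCAST = b"\xff" * 6

def broadcast_senders(pcap_bytes):
    """Count Ethernet broadcast frames per source MAC in a classic pcap file."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"                     # file written little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"                     # file written big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    linktype = struct.unpack_from(endian + "I", pcap_bytes, 20)[0]
    if linktype != 1:                    # 1 = Ethernet
        raise ValueError("only Ethernet captures are handled in this sketch")

    counts = Counter()
    offset = 24                          # skip the 24-byte global header
    while offset + 16 <= len(pcap_bytes):
        # Per-packet header: ts_sec, ts_usec, captured length, original length
        _sec, _usec, incl_len, _orig = struct.unpack_from(endian + "IIII", pcap_bytes, offset)
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        # Destination MAC is bytes 0-5, source MAC is bytes 6-11
        if len(frame) >= 12 and frame[:6] == BROADCAST:
            counts[frame[6:12].hex(":")] += 1
    return counts
```

Reading a capture file in binary mode and calling `broadcast_senders(data).most_common(5)` would list the five busiest broadcast sources. Wireshark can show the same thing interactively; the script is only handy when the capture is large.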
08-25-2011 03:45 PM
Varun, others-
First I wanted to extend my sincere gratitude for your help in getting to the bottom of this issue. Capturing the packet data, I was able to identify THREE servers which were inexplicably sending out occasional, but very large, broadcast storms. The broadcasts were all Windows Browser Host Announcements. They were sent at 4 min, 8 min, and 12 min after bootup of the device and then again at 12 minute intervals. This is by design. However, instead of sending a single Host Announcement as they are supposed to, they were each sending some random number of packets in excess of 12,000. At times they would send as many as 60,000 packets! I suppose they REALLY wanted to make their presence known. LOL
The only thing in common with the three servers is that they are all Windows 2003 based. All have been up and in production for many years. Two were physical, one was virtual (a P2V). They all seemed to get sick at precisely the same time, which to me anyway, would indicate some sort of bug.
The ultimate fix was to simply disable NetBIOS on those boxes. Since then, no more traffic spikes, no errors on the ASA either.
I think the part that really confused me, and perhaps someone can shed some light on this, is that the ASA only showed outbound traffic on the inside interface. Why not inbound traffic, since the packets were broadcasts originating from somewhere else and destined for the inside network's broadcast address? Had the ASA shown these as "inbound" packets, I would have never suspected the ASA in the first place. I am clearly misunderstanding something...so always willing to learn something new if someone cares to explain it to me!
Thanks again and kudos to all for the help!
Rick
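For anyone who wants to check their own captures for the same symptom, the idea is to group each host's announcement timestamps into bursts and look at the burst sizes. A healthy host should show one packet at each scheduled time (4, 8 and 12 minutes after boot, then every 12 minutes). This is a rough sketch: the function name `burst_sizes`, the one-second gap threshold and the sample timestamps are assumptions for illustration, not values from the actual capture.

```python
def burst_sizes(timestamps, max_gap=1.0):
    """Group packet timestamps (seconds) into bursts; return (start, count) per burst.

    A packet joins the current burst if it follows the previous packet
    by at most max_gap seconds.
    """
    bursts = []  # each entry: (start_time, last_time, packet_count)
    for t in sorted(timestamps):
        if bursts and t - bursts[-1][1] <= max_gap:
            start, _last, count = bursts[-1]
            bursts[-1] = (start, t, count + 1)
        else:
            bursts.append((t, t, 1))
    return [(start, count) for start, _last, count in bursts]

# Hypothetical data, in seconds since boot.
healthy = [240, 480, 720, 1440]
sick = [240 + i * 0.0001 for i in range(12000)] + [480]

print(burst_sizes(healthy))   # one packet per scheduled announcement
print(burst_sizes(sick))      # first announcement is a 12000-packet burst
```

Any burst with a count far above 1 at a scheduled announcement time points at a host behaving like the three servers described above.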
08-26-2011 12:24 AM
Hi Rick,
That's really awesome that you were able to nail down the issue. Those graphs with an exactly periodic spike looked suspicious to me, so yes, capturing the traffic was a good job on your part.
Coming back to your confusion: if those servers are located on the internal LAN, then this traffic would definitely be outbound for the interface (leaving the inside interface), so you might be seeing it as outbound traffic. Or maybe I am wrong, because I am not really sure which data you are pointing to. Can you shed some more light on it, with the help of the data?
Thanks,
Varun