Network: Router A ------ WAN ------ Router B
Router A's location hosts the SAP servers, and clients access them from Router B's location. There is a GRE tunnel between Router A and Router B over the service provider WAN. Some other locations across the WAN also connect to the SAP servers over GRE tunnels.
Problem: Two days ago, clients at some locations were not able to log in to some of the servers; the session would hang while entering the user credentials. It was still working for some servers, and ping communication was OK to all the servers. I suspected an MTU issue and took the following steps:
1. Gave the command ip tcp adjust-mss on the GRE tunnel on both sides (problem resolved, as the MSS was reduced and big datagrams were avoided on the network)
2. Instead of the above command, I increased the GRE tunnel interface MTU from the default of 1476 to 1500 (problem resolved, as fragmentation now started)
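The arithmetic behind both fixes can be sketched in Python (a minimal sketch only; the 24-byte overhead assumes plain GRE over IPv4 with no tunnel key or options):

```python
# MTU/MSS arithmetic for a plain GRE-over-IPv4 tunnel.
PHYSICAL_MTU = 1500
GRE_OVERHEAD = 20 + 4          # assumed: 20-byte outer IP header + 4-byte GRE header
IP_TCP_HEADERS = 20 + 20       # inner IP + TCP headers

# IOS derives the default tunnel MTU by subtracting the GRE overhead.
tunnel_mtu = PHYSICAL_MTU - GRE_OVERHEAD       # 1476

# Fix 1: clamp the TCP MSS so full segments fit the tunnel with no fragmentation.
clamped_mss = tunnel_mtu - IP_TCP_HEADERS      # 1436

# Fix 2: raise the tunnel MTU to 1500, so a full-size inner packet is accepted
# by the tunnel; the resulting GRE packet then exceeds the physical MTU and
# gets fragmented on the outgoing interface instead.
encapsulated_size = 1500 + GRE_OVERHEAD        # 1524, which is > 1500

print(tunnel_mtu, clamped_mss, encapsulated_size)
```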
My queries:
1. If PMTUD is enabled on all the servers, why were big datagrams not traversing the WAN? (MTU issues should be rectified automatically by PMTUD.)
2. Also, a PMTUD failure due to blocking of the ICMP message ("packet needs to be fragmented but DF set") should not happen, because the GRE tunnel clears the DF bit and fragmentation can happen all along the way.
3. Why was the MTU issue there for a few servers but not for all? And why did the issue start suddenly, even though my network had been working fine for two months?
Was it some Cisco bug or a server issue?
As I understand it, enabling PMTUD on the servers is not sufficient when using a GRE tunnel: "By default a router doesn't do PMTUD on the GRE tunnel packets that it generates."
Can you tell if there was any change in the traffic patterns when the problem occurred?
Have a look at the document below:
IP Fragmentation and PMTUD
Let me start with the easy question. It is almost certainly not a Cisco bug. It is highly unlikely to be a server issue. It is most likely that someone made a change which broke PMTUD.
Your description of the problem indicates that there are two locations but then becomes ambiguous about whether there might be more. I cannot tell from the description whether users at one site were impacted but not users at other sites, or whether users at all sites were impacted.
The problem arises from the fact that when traffic goes through a GRE tunnel, the GRE encapsulation adds an additional header to the datagram from the client or server. So if the client or server has sent a maximum-size datagram, adding the GRE header produces a frame that is too large and requires fragmentation. But frequently the header has the DF (Don't Fragment) bit set, so the frame cannot be fragmented. The fact that using adjust-mss to reduce the frame size, or increasing the MTU on the tunnel interface, resolved the problem confirms that it was the GRE MTU problem.
If PMTUD was working there should not be an issue. And if the network was working for two months, I think that is an indication that PMTUD was working. I believe that someone made a change which prevented PMTUD from working. Your post includes this statement:
should not happen as GRE tunnel clears the DF bit
This is actually not true. GRE does not clear the DF bit by default; GRE copies the control bits from the original frame into the new header. So if the original frame had DF set, the GRE frame will also have DF set.
Your third question is why some servers were impacted and some were not. I can only assume that some servers were negotiating an MTU smaller than 1500 and some were not.
The Cisco documentation says that by default the DF bit is not copied from the inner IP header to the outer (GRE + IP) header. You need the tunnel path-mtu-discovery command on the GRE interface to enable that, and this command is not present in my scenario.
So PMTUD was only working up to the GRE tunnel interfaces. After that, fragmentation should happen.
And my solution to the problem, i.e. changing the tunnel interface MTU to 1500, also relies on fragmentation. So what is the difference between the two scenarios (the one which was working for two months, and the present one after the change)?
So PMTUD failure should not be the cause of the problem.
The root cause of the problem is still not clear.
Your interpretation of the documentation is clearly different from mine. The link that you included is a good writeup, and I quote from it:
If the router participates as the forwarder of a host packet it will do the following:
* Check whether the DF bit is set.
* Check what size packet the tunnel can accommodate.
* Fragment (if packet is too large and DF bit is not set), encapsulate fragments and send; or
* Drop the packet (if packet is too large and DF bit is set) and send an ICMP message to the sender.
* Encapsulate (if packet is not too large) and send.
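The decision steps quoted above can be sketched as a small function (a sketch only; the function name is illustrative, and the sizes assume the default 24-byte GRE-over-IPv4 overhead):

```python
TUNNEL_MTU = 1476   # default GRE tunnel MTU on a 1500-byte link (1500 - 24)

def forward_into_tunnel(packet_size, df_set, tunnel_mtu=TUNNEL_MTU):
    """Role 1: the router acting as the forwarder of a host packet."""
    if packet_size <= tunnel_mtu:
        # Fits the tunnel: encapsulate and send.
        return "encapsulate and send"
    if df_set:
        # Too large and DF set: drop the packet and notify the sender,
        # which is the ICMP message that drives the host's PMTUD.
        return "drop, send ICMP 'fragmentation needed' to sender"
    # Too large but DF clear: fragment, then encapsulate the fragments.
    return "fragment, encapsulate fragments, send"

print(forward_into_tunnel(1500, df_set=True))
print(forward_into_tunnel(1500, df_set=False))
print(forward_into_tunnel(1400, df_set=True))
```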
It seems to me pretty clear that if the original packet has DF set, the router doing GRE will discard the packet.
I continue to believe that the cause of your problem is that someone changed something that broke PMTUD.
I agree with your quote. Your quote comes into the picture when the router is acting in its first role, as a forwarder.
Now see the quote in the document which says:
Router is acting in the second role as a sending host with respect to PMTUD and in regards to the tunnel IP packet. This role comes into play after the router has encapsulated the original IP packet inside the tunnel packet.
Note: By default a router doesn't do PMTUD on the GRE tunnel packets that it generates. The tunnel path-mtu-discovery command can be used to turn on PMTUD for GRE-IP tunnel packets. (Check example-3)
So the above quote comes into the picture when the GRE interface accepts the packet and then encapsulates it, making it bigger; the outgoing interface then fragments it (as the DF bit is not copied).
So the conclusion is: if the received packet is bigger than 1476 bytes, an ICMP error message is sent to the sender of the packet. If the received packet is 1476 bytes or smaller, the GRE encapsulation makes it bigger and clears the DF bit (sets it to 0). This packet can then be fragmented at the outgoing-interface level if required.
And if PMTUD is broken somewhere, then it should affect all the servers, not just a few.
I quote again from that link:
The router has two different PMTUD roles to play when it is the endpoint of a tunnel.
* In the first role the router is the forwarder of a host packet. For PMTUD processing, the router needs to check the DF bit and packet size of the original data packet and take appropriate action when necessary.
* The second role comes into play after the router has encapsulated the original IP packet inside the tunnel packet. At this stage, the router is acting more like a host with respect to PMTUD and in regards to the tunnel IP packet.
So the router has two different roles and it must play role 1 before it plays role 2. And in role 1 if the packet is max size and has DF set then the packet is dropped.
While I do not know enough about your specific situation to know exactly what the problem is, I do speak from experience with symptoms very similar to what you describe. I have had situations (several times) where PCs from a site could access some servers and could not access other servers. It became apparent that PMTUD was working to some sites and not working to other sites (or some servers had their MSS configured lower). Treating the problem as an MTU problem and reducing the MSS allowed all clients to access all servers.
I suggest that you configure the router with ip tcp adjust-mss, specifying a smaller size (exactly how much smaller depends on the specifics of your environment; in my case we were compensating for both GRE and IPsec and used a frame size of 1375, which works well for us). If you try it, you may find the problem fixed. If it does not resolve the problem, then remove the configuration command.
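As a rough sketch of that arithmetic (illustrative only; the IPsec overhead varies with mode and cipher, which is why the target frame size sits well below 1500):

```python
# Working backwards from a target packet size of 1375 bytes, the MSS value
# to configure with ip tcp adjust-mss is the target minus the inner IP and
# TCP headers. IPsec adds a variable overhead (roughly 50-70+ bytes
# depending on mode and cipher) on top of GRE, hence the conservative target.
TARGET_PACKET_SIZE = 1375
IP_HEADER = 20
TCP_HEADER = 20

mss_to_configure = TARGET_PACKET_SIZE - IP_HEADER - TCP_HEADER
print(mss_to_configure)  # 1335
```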
My problem has already been resolved by increasing the tunnel MTU to 1500. Now any big packet gets fragmented, because the GRE tunnel accepts the packet, adds the 24-byte header, and clears the DF bit (because tunnel path-mtu-discovery is not configured on the GRE tunnel). After this it reaches the outside interface and gets fragmented if required.
As per your quote below: -
The router has two different roles and it must play role 1 before it plays role 2. And in role 1 if the packet is max size and has DF set then the packet is dropped
My experience says: in role 1, if the router gets a big packet with the DF bit set, it drops the packet and sends an ICMP message to the sender. The sender reduces the packet size and sends it again. The router accepts it and adds the GRE header. This makes the packet bigger and clears the DF bit. This packet can now be fragmented at a later stage if required. This scenario was also happening in my case for two months.
So the present scenario with tunnel MTU 1500 bytes and the earlier working scenario were similar (both doing fragmentation if required).
And as per your earlier quotes that the GRE tunnel copies the DF bit (doesn't clear it), then how come the present scenario is working? I mean, if a big packet hits the router with tunnel MTU 1500, GRE adds the header to make it bigger than 1500 bytes with the DF bit copied, making fragmentation not allowed.
You might want to have a look at the global command "ip icmp rate-limit unreachable". The default rate is 2 per second. This might not be enough for your network and could thus break PMTUD for some servers/clients. The default rate is set by Cisco to avoid DoS attacks and is, in my opinion, very low. To my knowledge there are no show or debug commands you can use to see whether you are affected by the rate limit. You might do some packet debugging with ACLs and look for steady hits around 2 packets/second.
Your problem description seems to match this scenario, so maybe it's worth a try.
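The effect can be sketched with a toy rate limiter (purely illustrative; it only models "at most one unreachable per interval", not the real IOS implementation):

```python
def icmp_unreachables_sent(arrival_times_ms, interval_ms=500):
    """Toy model: at most one ICMP unreachable is sent per interval_ms.
    Triggers that arrive inside a window are silently dropped, so the
    corresponding senders never learn the path MTU (a PMTUD black hole)."""
    sent, last_sent = [], None
    for t in arrival_times_ms:
        if last_sent is None or t - last_sent >= interval_ms:
            sent.append(t)
            last_sent = t
    return sent

# Ten oversized-packet events in one second: only two unreachables go out,
# so most senders keep retransmitting full-size packets that keep being dropped.
events = list(range(0, 1000, 100))     # an event every 100 ms
print(icmp_unreachables_sent(events))  # [0, 500]
```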
In case anyone is reading this in the present: Rick was incorrect, and the OP nailed it...
"When PMTUD is disabled, the DF bit of an external (encapsulated) IP packet is set to zero even if the encapsulated packet has a DF bit set to one."