
PMTUD vs Handshake question

scott.bridges
Level 1

Hello,

This is more of a fundamental question:

I understand that Path MTU Discovery (PMTUD) relies on setting the DF bit on packets and relying on ICMP Unreachables to scale down the MTU accordingly.

I also understand that the MTU of a given TCP stream is negotiated via the SYN/SYNACK/ACK three-way handshake between the two hosts.

 

I'm trying to grasp how it all interoperates. Is the PMTUD method only for UDP, or for TCP too? Given that so many security devices block *all* ICMP, including Unreachables, thereby disabling PMTUD, how does the Internet as a whole function? Just on the raw horsepower of routers fragmenting all of our transfers?

I've been doing some labs between two Windows hosts with a couple Riverbeds and a couple Routers in between. I see the TCP handshake and I see the Riverbeds optimize with the reduced MTU (and MSS) that I set, but I'm just not seeing how that value was communicated.

 

I'm going to do "no ip unreachables" and see what happens.

If anyone can help clarify, I'd greatly appreciate it.

Thanks

Accepted Solutions

Scott,

I understand the question now. The fact that the MSS is being rewritten on the routers and not on the Riverbeds doesn't affect the outcome. If you want to see how the MSS is being communicated, capture a SYN or SYN/ACK on both sides of the router and compare the MSS option in the two captures. The capture closest to the server will show the MSS that the server itself is advertising, i.e. the largest TCP segment it is able to receive. The capture on the other side of the router will show the MSS that you set on the router interface. MSS rewrite like you are doing is usually needed when you are doing some form of tunneling on the WAN links, to ensure no packets larger than the MTU are sent.

MSS rewrite changes the largest segment size that the server advertises it can receive, so that the resulting packets are at or below the MTU of the path between the two hosts. If a packet is never larger than the MTU, you won't see any ICMP unreachable messages. If you want to see the ICMP unreachable messages, set the MTU on the router interface to a very low value and send a packet that is larger than it with the DF bit set. I would enable unreachables on the interfaces first.

Rewriting the MSS on the path is proactive: it makes each server think the other server advertised an MSS that fits within the path MTU. This reduces the chance of PMTUD being invoked.
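To make the rewrite concrete, here is a minimal Python sketch of what an in-path device conceptually does to the TCP options of a SYN or SYN/ACK. It is not how any particular router or Riverbed implements it; the function name `clamp_mss_option` is hypothetical, and the sketch assumes the value is only ever lowered, never raised. A real device would also have to update the TCP checksum, which is omitted here.

```python
import struct

MSS_KIND = 2  # TCP option kind for Maximum Segment Size (length is always 4)


def clamp_mss_option(options: bytes, limit: int) -> bytes:
    """Return a copy of the TCP options with the MSS lowered to `limit` if it is larger.

    Hypothetical helper for illustration only. Checksum recalculation is omitted.
    """
    out = bytearray(options)
    i = 0
    while i < len(out):
        kind = out[i]
        if kind == 0:      # End of Option List
            break
        if kind == 1:      # NOP is a single byte with no length field
            i += 1
            continue
        if i + 1 >= len(out):
            break          # truncated option, stop parsing
        length = out[i + 1]
        if length < 2 or i + length > len(out):
            break          # malformed option, stop parsing
        if kind == MSS_KIND and length == 4:
            (mss,) = struct.unpack_from("!H", out, i + 2)
            if mss > limit:
                struct.pack_into("!H", out, i + 2, limit)
        i += length
    return bytes(out)


if __name__ == "__main__":
    # MSS 1460, then NOP, then a window-scale option (kind 3, length 3, shift 7)
    syn_options = bytes([2, 4]) + struct.pack("!H", 1460) + bytes([1, 3, 3, 7])
    rewritten = clamp_mss_option(syn_options, 1360)
    print(struct.unpack_from("!H", rewritten, 2)[0])
```

Comparing the options before and after the call mirrors comparing the two captures on either side of the router.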

PMTUD is reactive and uses ICMP to inform the sending server to reduce its packet size. It can do this multiple times until the packet finally fits within the MTU.
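The "multiple times" part can be illustrated with a small simulation. This is only a sketch under assumptions: the path is modelled as a list of link MTUs, every router is assumed to report its next-hop MTU in the ICMP message, and the function names are made up for the example.

```python
from typing import List, Optional, Tuple


def first_blocking_mtu(packet_size: int, link_mtus: List[int]) -> Optional[int]:
    """Return the MTU of the first link the packet does not fit on, or None if it fits everywhere."""
    for mtu in link_mtus:
        if packet_size > mtu:
            return mtu
    return None


def discover_path_mtu(initial_size: int, link_mtus: List[int]) -> Tuple[int, int]:
    """Simulate a sender with DF set shrinking its packet size after each ICMP message.

    Returns (final packet size, number of ICMP 'fragmentation needed' messages received).
    Assumes each ICMP message carries the next-hop MTU and is never filtered.
    """
    size = initial_size
    icmp_count = 0
    while True:
        blocking_mtu = first_blocking_mtu(size, link_mtus)
        if blocking_mtu is None:
            return size, icmp_count
        icmp_count += 1
        size = blocking_mtu  # the sender lowers its packet size to the reported MTU


if __name__ == "__main__":
    # Hypothetical path with two successively smaller links
    print(discover_path_mtu(1500, [1500, 1400, 1476, 1300]))
```

With the hypothetical path above, the sender is told to shrink twice, once at the 1400-byte link and again at the 1300-byte link. If the ICMP messages are filtered, the sender never learns the smaller value, which is the failure case raised in the original question.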

HTH
Mark


MARK BAKER
Level 4

Hi Scott,

The Riverbed is rewriting the MSS that was advertised by the server. When you set the MTU on the Riverbed, it rewrites the MSS option in the TCP handshake SYN and SYN/ACK packets. So if the server tries to advertise to the remote device that it can accept 1400-byte segments, the Riverbed between the two devices will overwrite that with the value you configured, say 1360, and forward the packet on to the destination with the new value in the TCP header. The remote device doesn't know the value was rewritten. I also believe the two end devices will end up using the lowest MSS value that was advertised, or believed to be advertised. You might have some issues if you don't rewrite it in both directions and PMTUD is not supported.

Personally, I always allow ICMP unreachable (fragmentation needed but DF set), time exceeded (TTL expired), and echo-reply. I don't think the Internet itself blocks ICMP pass-through traffic, but the networks you communicate with over the Internet might.

You seem to already have a good understanding of it, so I'm not actually sure what you are missing.

 

serverA -MSS 1400 -> serverB (SYN)

serverB -MSS 1300 -> serverA (SYN/ACK)

The maximum segment size would be 1300 in both directions, I believe. I don't think it would be 1300 from serverA to serverB and 1400 from serverB to serverA.

 

serverA -MSS 1400 -> RB -MSS 1200 -> serverB (SYN)

serverB -MSS 1300 -> RB -MSS 1200 -> serverA (SYN/ACK)

The maximum segment size would be 1200 in both directions.
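The two scenarios above can be checked with a few lines of Python. This is a sketch, not a statement of how every TCP stack behaves: it assumes each sender is limited both by the MSS it received from its peer and by the MSS it advertised itself (which many stacks derive from their own interface MTU), and it assumes the in-path device only lowers the value. All names are hypothetical.

```python
from typing import Optional, Tuple


def seen_by_peer(advertised: int, clamp_to: Optional[int]) -> int:
    """MSS value that arrives at the peer after an optional in-path rewrite (assumed to only lower it)."""
    if clamp_to is None:
        return advertised
    return min(advertised, clamp_to)


def segment_limits(mss_a: int, mss_b: int, clamp_to: Optional[int] = None) -> Tuple[int, int]:
    """Return (largest segment A sends to B, largest segment B sends to A)."""
    a_receives = seen_by_peer(mss_b, clamp_to)  # what serverA sees in the SYN/ACK
    b_receives = seen_by_peer(mss_a, clamp_to)  # what serverB sees in the SYN
    # Assumption: a sender never exceeds its own advertised MSS either.
    a_to_b = min(a_receives, mss_a)
    b_to_a = min(b_receives, mss_b)
    return a_to_b, b_to_a


if __name__ == "__main__":
    print(segment_limits(1400, 1300))                 # no rewrite in the path
    print(segment_limits(1400, 1300, clamp_to=1200))  # rewrite to 1200 in the path
```

Under these assumptions the first call gives 1300 in both directions and the second gives 1200 in both directions, matching the two examples.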

 

HTH

Mark

Hi Mark,

Thanks for taking the time to respond.

I apologize, but I believe I worded my question confusingly. In my setup, I did *not* edit any MTU values within the Riverbeds.

ServerA > Riverbed1 > Router1 > WAN Emulator < Router2 < Riverbed2 < ServerB

I've been setting "ip mtu" and "ip tcp adjust-mss" on the two Router interfaces facing the Emulator. 
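For reference, the two values normally differ by the size of the IP and TCP headers. A minimal sketch of that arithmetic, assuming IPv4 with no IP options and no TCP options (20 bytes each); the MTU values shown are illustrative and not necessarily the ones used in this lab:

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options


def mss_for_mtu(ip_mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet of size ip_mtu."""
    return ip_mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    for mtu in (1500, 1400):
        print(f"ip mtu {mtu} -> MSS {mss_for_mtu(mtu)}")
```

So an "ip mtu" of 1400 would pair with an adjusted MSS of 1360 under these assumptions.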

Since then I've also added "no ip unreachables". I've rebooted the servers and they still complete the TCP handshake at the correct minimum MTU/MSS, as you said.

I guess I'm just confused about how the reduced MTU is still being communicated with unreachables disabled. I was expecting to see a TCP handshake with the ICMP Packet Too Big packets being dropped, but I'm just not seeing them.

 

I believe I understand the Theory.  I'm just trying to see it with my own eyes.

 

Thanks again for your time!

Scott


Joseph W. Doherty
Hall of Fame

PMTUD applies to IP packets in general, i.e. not just UDP/TCP packets. (Generally, only TCP sets the DF bit; other IP protocols usually don't set it, so their packets can simply be fragmented.)

I believe the original PMTUD RFC 1063 depended on the source probing (once a fragmentation-needed ICMP was received) to determine the maximum MTU, but I also believe the later RFC 1191 (next-hop MTU) lets the router tell the sender the maximum MTU that's supported.

How does the Internet function when security devices block some or all ICMP? Not as well as it should, and some flows can be broken because of security devices blocking ICMP. Things often work in spite of this because many IP protocols don't bump into MTU issues (i.e. they don't send that much data per packet), or because the Internet often supports the maximum MTU for standard Ethernet. Of course, fragmentation is possible, and then, indeed, you rely heavily on the performance of the router doing the fragmentation (not subsequent routers) and the performance of the destination doing the reassembly.
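To show what the fragmenting router has to do when DF is not set, here is a rough Python sketch of IPv4 fragmentation arithmetic. It assumes a 20-byte IP header with no options, and the function name is made up for the example. Fragment offsets are counted in 8-byte units, so every fragment except the last carries a payload that is a multiple of 8 bytes.

```python
from typing import List, Tuple


def fragment(total_length: int, mtu: int, header_len: int = 20) -> List[Tuple[int, int, bool]]:
    """Split one IPv4 packet into fragments that fit the given MTU.

    Returns a list of (fragment offset in 8-byte units, fragment total length, more-fragments flag).
    Assumes a fixed header length with no IP options.
    """
    if total_length <= mtu:
        return [(0, total_length, False)]  # fits as is, no fragmentation needed
    payload = total_length - header_len
    max_chunk = (mtu - header_len) // 8 * 8  # payload per fragment, rounded down to 8 bytes
    if max_chunk <= 0:
        raise ValueError("MTU too small to carry any payload")
    fragments = []
    offset = 0
    while offset < payload:
        chunk = min(max_chunk, payload - offset)
        more = offset + chunk < payload
        fragments.append((offset // 8, header_len + chunk, more))
        offset += chunk
    return fragments


if __name__ == "__main__":
    # A full-size Ethernet packet crossing a hypothetical 1400-byte link
    for frag in fragment(1500, 1400):
        print(frag)
```

Every oversized packet becomes at least two packets on the wire, each with its own header, and the destination has to hold the pieces until all of them arrive.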

PS:

If you're looking at Riverbed devices, realize they might "play games" to optimize performance.

 
