[c-nsp] transport path-mtu-discovery - ME3600....too unpredictable to use?

Wed Feb 24 07:47:09 EST 2016

On Wed, Feb 24, 2016 at 10:36 PM, Dragan Jovicic <draganj84 at gmail.com> wrote:
> Large packet reaching smaller MTU is simply discarded.

This is a very simplified view of the world.

The following is a description for IPv4. IPv6 is similar, except that
IPv6 routers never fragment, and the error is an ICMPv6 "Packet Too
Big" message.

A router will try to forward a packet towards the destination. If the
"DF" bit is set, and the packet is too large to fit onto the
outgoing interface, the packet will be dropped, and an ICMP message
(Type 3, Code 4) will be sent back to the originator telling the
originator that the MTU on this path needs to be reduced to no more
than the MTU of the interface onto which the router tried to forward
the traffic.
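
As a rough sketch of the decision described above (not how any
particular platform implements it), the logic looks something like
this. The names forward_decision and Verdict are made up for
illustration; only the ICMP type/code values come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ICMP_DEST_UNREACHABLE = 3  # ICMP type
ICMP_FRAG_NEEDED = 4       # ICMP code: fragmentation needed and DF set


@dataclass
class Verdict:
    """Hypothetical result type, for illustration only."""
    action: str  # "forward", "fragment" or "drop"
    # (type, code, next-hop MTU) of the ICMP error sent to the originator
    icmp: Optional[Tuple[int, int, int]] = None


def forward_decision(packet_len: int, df: bool, egress_mtu: int) -> Verdict:
    # The packet fits on the outgoing interface: nothing special to do.
    if packet_len <= egress_mtu:
        return Verdict("forward")
    # Too big and DF set: drop it and report the outgoing MTU.
    if df:
        return Verdict(
            "drop", (ICMP_DEST_UNREACHABLE, ICMP_FRAG_NEEDED, egress_mtu)
        )
    # Too big and DF clear: the router is expected to fragment.
    return Verdict("fragment")
```

For example, forward_decision(1500, True, 1400) yields a drop with
(3, 4, 1400) as the ICMP error to send back.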

Unfortunately, this ICMP message does not always make its way back to
the originator, as many security "administrators" drop all ICMP
messages because they are evil.

However, if the "DF" bit is not set, the router should fragment the
packet as required. This will generally cause the packet to be punted
to the CPU unless the hardware has special fragmentation support.
Relying on fragmentation in the forwarding path is generally a very
bad idea, and should be avoided at all costs.
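
To make "fragment the packet as required" a bit more concrete, here is
a minimal sketch of how the fragment sizes work out. It assumes a
20-byte IPv4 header with no options, and fragment_sizes is a made-up
helper name, not anything from a real stack.

```python
from typing import List, Tuple

IP_HEADER = 20  # bytes; assumes an IPv4 header without options


def fragment_sizes(total_len: int, mtu: int) -> List[Tuple[int, int, bool]]:
    """Return (offset in 8-byte units, payload bytes, more-fragments flag)."""
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    per_frag = (mtu - IP_HEADER) // 8 * 8
    if per_frag <= 0:
        raise ValueError("MTU too small to carry any payload")
    payload = total_len - IP_HEADER
    frags = []
    offset = 0
    while offset < payload:
        size = min(per_frag, payload - offset)
        more = offset + size < payload
        frags.append((offset // 8, size, more))
        offset += size
    return frags
```

A 1500-byte packet going out of a 1400-byte MTU interface becomes two
fragments: fragment_sizes(1500, 1400) gives
[(0, 1376, True), (172, 104, False)].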

> If you establish BGP session over large MTU links, and then session
> reroutes over smaller MTU link, you might see BGP Updates not passing
> trough - while session is established due to smaller sized keepalives.
> PMTU works only when establishing session. Therefore it is crucial to know
> all possible paths your session might pass trough.

PMTUD works for the entire duration of a TCP session, not only during
TCP establishment.
What you are probably referring to is the MSS, which is exchanged
during the three-way handshake.

> On Wed, Feb 24, 2016 at 12:13 PM, Dan Peachey <dan at peachey.co> wrote:
>
>> On 24/02/2016 10:33, Nick Cutting wrote:
>>
>>> Im an enterprise guy, so this is what I see for our clients.
>>> The exact path mtu discovery technique doesn't seem to be well documented
>>> - the RFC is 26 years old - and unless your hosts are actively "checking"
>>> mtu using DF bit the whole time a session is up (for changes)  - I don't
>>> believe even half of  RFC-1191 is implemented in the real world.

PMTUD is very old, but then again, so is IP.

>> They do appear to set DF bit on every IP packet (at least on a CSR1000v
>> this is true).

You will find that most applications and operating systems set DF on
all UDP and TCP traffic.

>>
>> Packet dump shows DF being set on BGP keepalives:

>>
>> Whether they act on receiving an ICMP Frag Needed packet by decreasing the
>> MSS of the BGP TCP session is another question. When I get time I'll try
>> and test it.

As said earlier, MSS is only exchanged during the three-way handshake.
On receiving the ICMP frag needed message, the local OS will cache the
reduced MTU to this host for a period of time (RFC 1191 sets a minimum
of 5 minutes and recommends 10). During this time, it will send TCP
segments with no more than (new MTU - IP header size - TCP header
size) bytes of payload to the destination. Once this timer expires,
the MTU will be increased again, and the process starts all over.
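
The caching behaviour above can be sketched roughly as follows. This
is only an illustration, not how any particular OS does it: PmtuCache
is a made-up class, the 600 second expiry is an assumption (RFC 1191
recommends 10 minutes, real stacks differ), and both headers are
assumed to be 20 bytes with no options. A real stack also never sends
more than the MSS the peer advertised, which is left out here.

```python
IP_HEADER = 20   # bytes; assumes no IP options
TCP_HEADER = 20  # bytes; assumes no TCP options


class PmtuCache:
    """Hypothetical per-destination path MTU cache, for illustration."""

    def __init__(self, link_mtu, expiry=600.0):
        self.link_mtu = link_mtu  # MTU of the local outgoing interface
        self.expiry = expiry      # seconds a learned MTU stays valid (assumed)
        self._entries = {}        # dest -> (mtu, time learned)

    def path_mtu(self, dest, now):
        entry = self._entries.get(dest)
        if entry is None:
            return self.link_mtu
        mtu, learned_at = entry
        if now - learned_at >= self.expiry:
            # Timer expired: forget the entry and probe with the full MTU.
            del self._entries[dest]
            return self.link_mtu
        return mtu

    def on_frag_needed(self, dest, next_hop_mtu, now):
        # An ICMP frag needed message may only lower the estimate.
        mtu = min(next_hop_mtu, self.path_mtu(dest, now))
        self._entries[dest] = (mtu, now)

    def max_tcp_payload(self, dest, now):
        return self.path_mtu(dest, now) - IP_HEADER - TCP_HEADER
```

With a 1500-byte local MTU the sender starts at 1460 bytes of TCP
payload, drops to 1360 after a frag needed message advertising 1400,
and goes back to 1460 once the entry has expired.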

-- Andrew