[j-nsp] MTU changes affecting BGP sessions

Harry Reynolds harry at juniper.net
Wed Jul 26 10:56:17 EDT 2006

Sounds like the problem session is exceeding some path MTU, resulting in discards (and session loss) when large route tables are transferred. IIRC, the BGP mtu-discovery option should result in packets with the DF bit set, but due to a bug this behavior only occurred on the initiator side of the connection; the receiver would use its interface MTU and not set the DF bit, which prevented accurate PMTU discovery and resulted in fragmented packets, which in this case were tossed by a firewall filter. I imagine an L2 switch would just chuck the jumbo frames with similar results, but I am not sure how PMTU discovery is supposed to work in the absence of explicit ICMP fragmentation-required error messages... Anyway, this problem is described in PR 67373.
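Worth noting that PMTU discovery depends on ICMP fragmentation-needed messages (type 3, code 4) making it back to the sender, so any RE firewall filter in the path has to let them through. A rough sketch of such a term (the filter and term names here are invented for illustration, not taken from your config):

    set firewall family inet filter protect-re term allow-pmtud from protocol icmp
    set firewall family inet filter protect-re term allow-pmtud from icmp-type unreachable
    set firewall family inet filter protect-re term allow-pmtud then accept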

Also, I am curious whether you are using "system internet-options path-mtu-discovery"? This knob is used in conjunction with the BGP mtu-discovery option.
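In case it helps, the two knobs together look something like the following (the group name "ebgp-peers" is a placeholder for illustration; this is a sketch, not your exact config):

    set system internet-options path-mtu-discovery
    set protocols bgp group ebgp-peers mtu-discovery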

I would confirm the MSS negotiated for the problem session and go from there. With a 9000-byte interface MTU you would expect an MSS around 8960 (9000 minus 20 bytes of IP header and 20 bytes of TCP header), versus 1460 for a standard 1500-byte MTU:

show system connections extensive | find <bgp-problem-session-address> | match mss

A fix for PR 67373 went into 7.5R3. Any chance you can upgrade to see if that resolves the problem?

FYI: the PR indicates that a learned MSS can persist for longer than 5 minutes, and that if you revert the interface MTU, the problem session may need to remain deactivated for more than 5 minutes before it will come up correctly. This delay allows the previous MSS value to age out so that the new interface MTU value will again be used to set the MSS.
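If you do revert the interface MTU, a workaround sketch would be something like the following (the group name and neighbor address are placeholders; substitute your own):

    deactivate protocols bgp group ebgp-peers neighbor 192.0.2.1
    commit
    ...wait more than 5 minutes for the old MSS to age out...
    activate protocols bgp group ebgp-peers neighbor 192.0.2.1
    commit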


> -----Original Message-----
> From: juniper-nsp-bounces at puck.nether.net 
> [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of 
> Raniery Pontes
> Sent: Wednesday, July 26, 2006 7:15 AM
> To: juniper-nsp at puck.nether.net
> Subject: [j-nsp] MTU changes affecting BGP sessions
> Hi everybody,
> I'd like to know if someone has seen the following issue.
> I've got an M320 (Junos 7.5R2.8) with a few BGP peerings. 
> These peerings run on vlans over a L2 infrastructure built on 
> Ethernet switches (Cisco and Foundry). Some of these switches 
> are configured to support jumbo frames, around 9000 bytes, 
> some are not yet.
> Then I changed the physical MTU on the M320 GigE interface to 
> 9000 bytes.  
> Most peerings just flapped and came back up fine, but one started 
> to flap continuously. This one stays up for about 90 sec (BGP 
> holdtime?) and goes down again.
> I've tried the "mtu-discovery" BGP command on my side, but it didn't 
> help.  No clue about the other side, so far.
> Did anyone see this kind of thing? Hints about a solution?
> Another interesting fact: there is another BGP peer 
> (multihop 2) in this same problematic vlan, and it works fine 
> after the MTU change. But I'm almost sure it's a Juniper, so no 
> interop issues ;)
> Raniery Pontes
> -----------------------------------------
> Log example:
> Jul 26 10:57:30  jm320_sp rpd[3001]: bgp_event: peer xxx (External AS
> xxx) old state OpenConfirm event RecvKeepAlive new state Established
> Jul 26 10:59:00  jm320_sp rpd[3001]: bgp_traffic_timeout: 
> NOTIFICATION sent to xxx (External AS xxx): code 4 (Hold 
> Timer Expired Error),
> Reason: holdtime expired for xxx (External AS xxx), socket 
> buffer sndcc: 
> 0 rcvcc: 0 TCP state: 4, snd_una: 3621665251 snd_nxt: 3621665251
> snd_wnd: 16200 rcv_nxt: 3247700880 rcv_adv: 3247759208, 
> keepalive timer 0
> Jul 26 10:59:00  jm320_sp rpd[3001]: bgp_event: peer xxx (External AS
> xxx) old state Established event HoldTime new state Idle
> _______________________________________________
> juniper-nsp mailing list juniper-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
