[c-nsp] Random BGP Drops

Catalin Dominte catalin.dominte at nocsult.net
Fri Jul 24 06:49:23 EDT 2015


Hi everyone,

Over the past two weeks we have been experiencing a few instances where
some BGP sessions drop randomly.

The router on our side is a 6500 with a Sup 2T XL, carrying 1 x Full BGP
Transit, a few downstream customers and 30 BGP sessions at LINX, with OSPF
as the IGP. The setup has been very stable for the last couple of years
without any issues.

The logs on our side show that the hold timer expired.

On the customer side the logs show the following messages; note in
particular the "hold timer remain" value. The customer's logs come from a
Juniper router, and they clearly show that there was still plenty of hold
time remaining on their side when the session was torn down.

Jul 24 00:33:04  rt1 rpd[1396]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer
A.B.C.D (External AS *****) changed state from Established to Idle (event
RecvNotify) (instance master)
Jul 24 00:33:04  rt1 rpd[1396]: bgp_read_v4_message:10656: NOTIFICATION
received from A.B.C.D (External AS *****): code 4 (Hold Timer Expired
Error), socket buffer sndcc: 57 rcvcc: 0 TCP state: 4, snd_una: 3040466763
snd_nxt: 3040466801 snd_wnd: 16194 rcv_nxt: 3738492361 rcv_adv: 3738508724,
hold timer out 90s, hold timer remain 1:07.779687s
Jul 24 00:33:12  rt1 rpd[1396]: bgp_pp_recv: rejecting connection from
A.B.C.D (External AS *****), peer in state Idle
Jul 24 00:33:12  rt1 rpd[1396]: bgp_pp_recv:3286: NOTIFICATION sent to
A.B.C.D+29266 (proto): code 6 (Cease) subcode 5 (Connection Rejected)
Jul 24 00:33:36  rt1 rpd[1396]: RPD_BGP_NEIGHBOR_STATE_CHANGED: BGP peer
A.B.C.D (External AS *****) changed state from OpenConfirm to Established
(event RecvKeepAlive) (instance master)
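
To put numbers on the NOTIFICATION line above, here is a minimal Python
sketch (my own helper, not output from either router). It assumes the
"hold timer remain" value is formatted as M:SS.ffffff, and that sndcc,
snd_una and snd_nxt carry their usual BSD socket meanings (bytes in the
send buffer, oldest unacknowledged sequence number, next sequence number to
send). The function name and dictionary keys are made up for illustration.

```python
import re

# Counter portion of the NOTIFICATION line from the Juniper log above.
LOG = (
    "socket buffer sndcc: 57 rcvcc: 0 TCP state: 4, snd_una: 3040466763 "
    "snd_nxt: 3040466801 snd_wnd: 16194 rcv_nxt: 3738492361 "
    "rcv_adv: 3738508724, hold timer out 90s, "
    "hold timer remain 1:07.779687s"
)

KEEPALIVE_BYTES = 19  # a BGP KEEPALIVE is just the 19-byte header (RFC 4271)


def summarize(line):
    """Extract the hold timer and TCP send counters from one log line."""
    counters = {
        name: int(value)
        for name, value in re.findall(
            r"(sndcc|snd_una|snd_nxt): (\d+)", line
        )
    }
    # Assumption: "remain" is printed as minutes:seconds.fraction.
    hold = re.search(
        r"hold timer out (\d+)s, hold timer remain (\d+):(\d+(?:\.\d+)?)s",
        line,
    )
    hold_out = float(hold.group(1))
    hold_remain = int(hold.group(2)) * 60 + float(hold.group(3))
    return {
        # Seconds since the Juniper last restarted its hold timer.
        "elapsed": hold_out - hold_remain,
        # Bytes sent by the Juniper but not yet acknowledged by the peer.
        "unacked_bytes": counters["snd_nxt"] - counters["snd_una"],
        # Bytes still sitting in the Juniper's send buffer.
        "sndcc": counters["sndcc"],
    }


if __name__ == "__main__":
    s = summarize(LOG)
    print(f"elapsed since last message from peer: {s['elapsed']:.1f}s")
    print(f"unacknowledged bytes: {s['unacked_bytes']}")
    print(f"send buffer bytes: {s['sndcc']}")
```

With the values above, the Juniper had restarted its hold timer about 22.2
seconds before the NOTIFICATION arrived, while 38 bytes were sent but
unacknowledged and 57 bytes were in its send buffer. Both are multiples of
the 19-byte KEEPALIVE size, which might mean the customer's keepalives were
not being acknowledged by our side, but that is only a guess at this point.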

Has anyone else seen the same error before?

Kind regards,

Catalin Dominte
Senior Network Consultant
+44(0)1628302007
Nocsult Ltd
www.nocsult.net

