[j-nsp] MX10 - BGP and LDP sessions flapping without a reason

Tue Nov 8 16:40:49 EST 2016

Have you run 'show krt queue' and 'show krt state' during times of outage?

The control plane hardware on MX5 (or whatever you license it as) is puny - a Freescale e500v2 CPU @ 1.33GHz, 2.4DMIPS/MHz, therefore 
3192 DMIPS (single core).
We deployed a couple of MX80s in our network and had problems with route convergence, BGP stability, etc. in a full-mesh iBGP 
scenario.
As L3 devices we found them unusable. As L2 devices they were fine.
In particular, MX5/MX80 with MS-MIC is a bad combination - this would eventually peg the CPU and drop sessions, or crash the TFEB and 
core dump.

Another issue we encountered was a cosd memory leak that we never got to the bottom of.

Junos also runs some redundant processes by default which are not even applicable on MX5/80.
You can disable these and gain some extra RAM:

[edit system]
     +   processes {
     +       cfm disable;
     +       send disable;
     +       ethernet-connectivity-fault-management disable;
     +       ddos-protection disable;
     +       ppp disable;
     +       sonet-aps disable;
     +       link-management disable;
     +       iccp-service disable;
     +   }
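
If you would rather paste these as "set" commands than edit the hierarchy by hand, the diff above maps onto one line per process. A minimal Python sketch that prints them follows; the process names are copied from the diff, while the helper name disable_commands is purely illustrative and not part of any Juniper tooling. Whether every name is accepted depends on platform and release, so treat the output as a candidate config and run "commit check" before committing.

```python
# Process names copied verbatim from the [edit system] diff above.
PROCESSES = [
    "cfm",
    "send",
    "ethernet-connectivity-fault-management",
    "ddos-protection",
    "ppp",
    "sonet-aps",
    "link-management",
    "iccp-service",
]


def disable_commands(names):
    """Return one 'set system processes <name> disable' line per name.

    Hypothetical helper for illustration only; it just formats strings
    and does not talk to a router.
    """
    return [f"set system processes {name} disable" for name in names]


if __name__ == "__main__":
    # Print the commands so they can be reviewed and pasted in config mode.
    print("\n".join(disable_commands(PROCESSES)))
```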

See also https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR1099523 - rpd.record file size/rotation issue if Junos < 
14.1R6.
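
To check a fleet against the "Junos < 14.1R6" condition, something along these lines could work. It is a rough Python sketch that assumes version strings of the simple form major.minorRrelease[.build] (for example 14.1R7.4); other release naming styles are deliberately not handled, and the function names are illustrative only. Read this way, the 14.1R7.4 quoted below would not fall into the affected range.

```python
import re

# Simplified pattern: <major>.<minor>R<release>[.<build>], e.g. 14.1R7.4.
# Assumption: service releases and other naming styles are out of scope.
_VERSION_RE = re.compile(r"^(\d+)\.(\d+)R(\d+)(?:\.(\d+))?$")


def parse_junos_version(version):
    """Turn '14.1R7.4' into a comparable tuple (14, 1, 7, 4)."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unsupported version format: {version!r}")
    major, minor, release, build = match.groups()
    # A missing build number is treated as 0, so '14.1R6' == (14, 1, 6, 0).
    return (int(major), int(minor), int(release), int(build or 0))


def is_older_than(version, threshold):
    """True if 'version' sorts strictly before 'threshold'."""
    return parse_junos_version(version) < parse_junos_version(threshold)


if __name__ == "__main__":
    for candidate in ("14.1R5.4", "14.1R7.4"):
        print(candidate, is_older_than(candidate, "14.1R6"))
```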

In our particular use case we removed the L3 terminations from the box and instead used l2circuits to haul them to nearby MX960 
routers which could handle the control plane load.
As L2 termination devices they were fine, but I would be very reluctant to touch MX5/80 if I could avoid it!

Kind regards,
Niall

> -----Original Message-----
> From: juniper-nsp [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of Alexandre Guimaraes
> Sent: 08 November 2016 13:32
> To: juniper-nsp at puck.nether.net
> Subject: [j-nsp] MX10 - BGP and LDP sessions flapping without a reason
>
> Hi all,
>
> 	Has anyone experienced something like this on MX10 without an obvious
> reason? There have been no changes in network topology, all the interfaces are
> up, and no configuration changes have been made. There isn't anything useful in the
> "show log messages" output. If I check the updates sent by BGP
> peers, there is no excessive flood from any of the peers; BGP sessions flap
> randomly.
>
> 	Has anyone seen such behavior before, where RPD has high CPU utilization without a
> clear reason? Is it somehow possible to trace the updates going to RPD in
> order to better understand what exactly RPD is doing at the time when the CPU
> utilization is high?
>
> 	JTAC is already working on it to try to find the issue, but so far I
> think they have no clue what's going on.
>
>
>
> Model: mx10-t
> Junos: 14.1R7.4
> Hardware inventory:
> Item             Version  Part number  Serial number     FRU model number
> Midplane         REV 08   711-038213   xxxxxxxx          CHAS-MX10-T-S