[j-nsp] MX10 - BGP and LDP sessions flapping without a reason

Niall Donaghy niall.donaghy at geant.org
Tue Nov 8 17:31:28 EST 2016


>Ah, glad you found some clues and hopefully have eradicated the problem.

Regarding Dragan's email:

>From: Dragan Jovicic [mailto:draganj84 at gmail.com] 
>Sent: 08 November 2016 22:02
>To: Niall Donaghy <niall.donaghy at geant.org>
>Cc: Alexandre Guimaraes <alexandre.guimaraes at ascenty.com>; juniper-nsp at puck.nether.net
>Subject: Re: [j-nsp] MX10 - BGP and LDP sessions flapping without a reason

>>We deployed a couple of MX80s in our network and had problems with route convergence, BGP stability, etc. in a full-mesh iBGP
>>scenario.
>>As L3 devices we found them unusable.

>I beg to differ. They are fine single-PFE L3 boxes once deficiencies such as the poor BGP RIB-to-FIB behaviour are taken into consideration. As L3 boxes we found them very usable in a well-designed topology.
>But, to the @OP, you may want to read the following:
>https://kb.juniper.net/InfoCenter/index?page=content&id=KB26261&actp=search

>Also I found this useful (don't forget to turn it off).
>> set task accounting on
>> show task accounting
>> set task accounting off
>BR

Indeed, YMMV and clearly they're suitable in some applications. Not ours, in the end. 2.3M routes + MS-MIC + Netflow v9 were fine in lab conditions but not in production.

> -----Original Message-----
> From: Alexandre Guimaraes [mailto:alexandre.guimaraes at ascenty.com]
> Sent: 08 November 2016 22:28
> To: Niall Donaghy <niall.donaghy at geant.org>; juniper-nsp at puck.nether.net
> Subject: RES: MX10 - BGP and LDP sessions flapping without a reason
> 
> Niall,
> 	Thank you for your help. I will carefully review your suggestions, apply them in a maintenance window, and keep looking...
> 
> 	But today I may have found the problem, which is related to one customer's large broadcast-domain network. I saw some tfeb
> information about heavy UDP/IGMP flooding coming in on one interface. I had seen it before, but didn't imagine that this behavior was bad for the
> MX. After moving the customer to an MX480, the flaps stopped.
> 
> 	Again, thank you.
> 
> 
> Alexandre
> 
> -----Original Message-----
> From: Niall Donaghy [mailto:niall.donaghy at geant.org]
> Sent: Tuesday, 8 November 2016 19:41
> To: Alexandre Guimaraes <alexandre.guimaraes at ascenty.com>; juniper-nsp at puck.nether.net
> Subject: RE: MX10 - BGP and LDP sessions flapping without a reason
> 
> Have you run 'show krt queue' and 'show krt state' during times of outage?
> 
> The control plane hardware on MX5 (or whatever you license it as) is puny - a Freescale e500v2 CPU @ 1.33GHz, 2.4DMIPS/MHz, therefore
> 3192 DMIPS (single core).
> We deployed a couple of MX80s in our network and had problems with route convergence, BGP stability, etc. in a full-mesh iBGP
> scenario.
> As L3 devices we found them unusable. As L2 devices they were fine.
> In particular, MX5/MX80 with MS-MIC is a bad combination - this would eventually peg the CPU and drop sessions, or crash the TFEB and
> core dump.
> 
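The DMIPS figure above is simply clock rate multiplied by per-MHz throughput. A quick check of the arithmetic, as a sketch only (the variable names are illustrative, and 1.33 GHz is assumed to mean 1330 MHz, which is what reproduces the quoted total):

```python
# Sanity check of the quoted estimate: 1.33 GHz at 2.4 DMIPS/MHz, single core.
# Assumption: 1.33 GHz is taken as 1330 MHz, which matches the quoted total.
clock_mhz = 1330
dmips_per_mhz_tenths = 24  # 2.4 DMIPS/MHz, kept in tenths to avoid float rounding

total_dmips = clock_mhz * dmips_per_mhz_tenths // 10
print(total_dmips)
```
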
> Another issue we encountered was a cosd memory leak that we never got to the bottom of.
> 
> Junos also runs some redundant processes by default which are not even applicable on MX5/80.
> You can disable these and gain some extra RAM:
> 
> [edit system]
>      +   processes {
>      +       cfm disable;
>      +       send disable;
>      +       ethernet-connectivity-fault-management disable;
>      +       ddos-protection disable;
>      +       ppp disable;
>      +       sonet-aps disable;
>      +       link-management disable;
>      +       iccp-service disable;
>      +   }
> 
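For pasting onto several boxes, the same change can be expressed as set commands. A minimal Python sketch that renders them from the process names in the diff above (the function name is hypothetical, and whether each knob exists depends on the platform and Junos release, so a `commit check` before committing is prudent):

```python
# Process names copied from the "[edit system] processes" diff above.
PROCESSES = [
    "cfm",
    "send",
    "ethernet-connectivity-fault-management",
    "ddos-protection",
    "ppp",
    "sonet-aps",
    "link-management",
    "iccp-service",
]


def render_set_commands(names):
    """Return one 'set system processes <name> disable' line per process name."""
    return [f"set system processes {name} disable" for name in names]


if __name__ == "__main__":
    # Print the commands so they can be pasted in configuration mode.
    print("\n".join(render_set_commands(PROCESSES)))
```

Keeping the list in one place makes it easy to drop a process that turns out to be needed on a particular box, for example ddos-protection.
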
> See also https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR1099523 - rpd.record file size/rotation issue if Junos <
> 14.1R6.
> 
> In our particular use case we removed the L3 terminations from the box and instead used l2circuits to haul them to nearby MX960
> routers which could handle the control plane load.
> As L2 termination devices they were fine, but I would be very reluctant to touch MX5/80 if I could avoid it!
> 
> Kind regards,
> Niall
> 
> > -----Original Message-----
> > From: juniper-nsp [mailto:juniper-nsp-bounces at puck.nether.net] On Behalf Of Alexandre Guimaraes
> > Sent: 08 November 2016 13:32
> > To: juniper-nsp at puck.nether.net
> > Subject: [j-nsp] MX10 - BGP and LDP sessions flapping without a reason
> >
> > Hi all,
> >
> > 	Has anyone experienced something like this on an MX10 without an obvious
> > reason? There have been no changes in network topology, all the interfaces are
> > up, and no configuration changes have been made. There isn't anything useful in the
> > "show log messages" output. If I check the updates sent by BGP
> > peers, there is no excessive flooding from any of the peers, yet BGP sessions flap
> > randomly.
> >
> > 	Has anyone seen such behavior before, where RPD has high CPU utilization without a
> > clear reason? Is it somehow possible to trace the updates going to RPD in
> > order to understand better what exactly RPD is doing at the time when the CPU
> > utilization is high?
> >
> > 	JTAC is already working on it to try to find the issue, but so far I
> > think they have no clue what's going on.
> >
> >
> >
> > Model: mx10-t
> > Junos: 14.1R7.4
> > Hardware inventory:
> > Item             Version  Part number  Serial number     FRU model number
> > Midplane         REV 08   711-038213   xxxxxxxx          CHAS-MX10-T-S

