[j-nsp] Slow performance of the KRT queue

Vincent Bernat bernat at luffy.cx
Fri Feb 5 15:47:11 EST 2016


Hey!

Jeff helped me off-list and suggested disabling damping. I had the
following bits in my configuration:

set policy-options policy-statement v4-PUBLIC-DAMPING term 1 from route-filter 0.0.0.0/0 upto /16 damping damp-timid
set policy-options policy-statement v4-PUBLIC-DAMPING term 1 from route-filter 0.0.0.0/0 upto /21 damping damp-short
set policy-options policy-statement v4-PUBLIC-DAMPING term 1 from route-filter 0.0.0.0/0 upto /23 damping damp-medium
set policy-options policy-statement v4-PUBLIC-DAMPING term 1 from route-filter 0.0.0.0/0 orlonger damping damp-long
set policy-options damping damp-long half-life 30
set policy-options damping damp-long reuse 1640
set policy-options damping damp-long suppress 6000
set policy-options damping damp-long max-suppress 60
set policy-options damping damp-medium half-life 15
set policy-options damping damp-medium reuse 1500
set policy-options damping damp-medium suppress 6000
set policy-options damping damp-medium max-suppress 45
set policy-options damping damp-short half-life 10
set policy-options damping damp-short reuse 1000
set policy-options damping damp-short suppress 6000
set policy-options damping damp-short max-suppress 30
set policy-options damping damp-timid half-life 5
set policy-options damping damp-timid reuse 500
set policy-options damping damp-timid suppress 6000
set policy-options damping damp-timid max-suppress 20
set routing-instances public protocols bgp damping
set routing-instances public protocols bgp group v4-TRANSIT-ASXXXX-UPSTREAM import v4-BOGONS
set routing-instances public protocols bgp group v4-TRANSIT-ASXXXX-UPSTREAM import v4-PUBLIC-DAMPING
set routing-instances public protocols bgp group v4-TRANSIT-ASXXXX-UPSTREAM import v4-FROM-UPSTREAM
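
To get a feel for what these four profiles imply, here is a rough Python
sketch of the arithmetic. It assumes the standard exponential-decay model
of RFC 2439 (the figure of merit halves every half-life, and its ceiling
is reuse * 2^(max-suppress / half-life)). The function names are made up
for illustration, and the exact Junos accounting (penalty per event,
decay granularity) is not modelled.

```python
import math

# Values copied from the "set policy-options damping" lines above.
# half_life and max_suppress are in minutes; reuse and suppress are
# figure-of-merit thresholds.
PROFILES = {
    "damp-long":   {"half_life": 30, "reuse": 1640, "suppress": 6000, "max_suppress": 60},
    "damp-medium": {"half_life": 15, "reuse": 1500, "suppress": 6000, "max_suppress": 45},
    "damp-short":  {"half_life": 10, "reuse": 1000, "suppress": 6000, "max_suppress": 30},
    "damp-timid":  {"half_life": 5,  "reuse": 500,  "suppress": 6000, "max_suppress": 20},
}


def merit_ceiling(profile):
    """Highest figure of merit a route can reach (RFC 2439 model)."""
    return profile["reuse"] * 2 ** (profile["max_suppress"] / profile["half_life"])


def minutes_to_reuse(profile, merit):
    """Minutes for `merit` to decay down to the reuse threshold, with no new flaps."""
    if merit <= profile["reuse"]:
        return 0.0
    return profile["half_life"] * math.log2(merit / profile["reuse"])


for name, profile in PROFILES.items():
    ceiling = merit_ceiling(profile)
    wait = minutes_to_reuse(profile, profile["suppress"])
    print(f"{name}: ceiling={ceiling:.0f}, "
          f"minutes from suppress to reuse={wait:.1f}")
```

Under this model, a route that has just crossed the suppress threshold
stays suppressed until its figure of merit decays back to the reuse
value, capped by max-suppress.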

Disabling the damping part improved the situation noticeably. The default
redistributed in OSPF was updated 15 seconds earlier, which is quite
nice.

I also had a similar blackhole problem when re-enabling the upstream BGP
session. I didn't investigate this part much, as I assumed it was
related to my first problem, but disabling damping made it disappear
entirely.
-- 
She is not refined.  She is not unrefined.  She keeps a parrot.
		-- Mark Twain

 ――――――― Original Message ―――――――
 From: Vincent Bernat <bernat at luffy.cx>
 Sent:  3 February 2016 22:21 +0100
 Subject: [j-nsp] Slow performance of the KRT queue
 To: juniper-nsp at puck.nether.net

> Hey!
>
> I have a pair of MX104. Each one is receiving a full view and a default
> through an external BGP session. They share an iBGP session. They
> redistribute the default in OSPF (with a higher metric when the default
> comes through the iBGP session). Nothing fancy.
>
> If I shut the upstream port of one of the MX, the session goes down and
> the RIB is quickly updated. Unfortunately, the KRT is quite slow to be
> updated. A "show krt queue" shows there are many
> deletion/addition/changes queued and they take about 2 minutes to be
> processed.
>
> Unfortunately, during this time, I have a lot of more specific routes
> still pointing to a non-existent hop:
>
> vbe at net-edge004.dk2# run show route 138.231.136.1 extensive table public.inet.0 | no-more
>
> public.inet.0: 571546 destinations, 996364 routes (425305 active, 321183 holddown, 571058 hidden)
> 138.231.0.0/16 (2 entries, 1 announced)
> TSI:
> KRT queued (pending) change
>   138.231.0.0/16 -> {1.1.1.1}=>{indirect(1048578)}
> Page 0 idx 1, (group v4-IBGP type Internal) Type 3 val 22b9ccb8 (grp rto)
>    Advertised metrics:
>      No metrics
>      (Queued)
>    Enqueued metrics 1: (for peers 00000001 3.3.3.3)
>      Flags: Nexthop Change
>      Nexthop: Self
>      MED: 10
>      Localpref: 100
>      AS path: [61098] 25091 2200 2426 I
>      Communities: 25091:22413 25091:24115
> [...]
> Path 138.231.0.0 from 159.100.255.231 Vector len 4.  Val: 1
>         *BGP    Preference: 140/-101
>                 Next hop type: Indirect
>                 Address: 0x177743a0
>                 Next-hop reference count: 877603
>                 Source: 3.3.3.3
>                 Next hop type: Router, Next hop index: 1048577
>                 Next hop: 2.2.2.2 via xe-2/0/3.100
>                 Session Id: 0x18
>                 Next hop: 2.2.2.0 via xe-2/0/2.100, selected
>                 Session Id: 0x17
>                 Protocol next hop: 3.3.3.3
>                 Indirect next hop: 0x19ec4b2c 1048578 INH Session ID: 0x1b
>                 State: <Active Int Ext>
>                 Age: 16:57      Metric: 10      Metric2: 0
>                 Validation State: unverified
>                 Task: BGP_61098_61098.3.3.3.3+50640
>                 Announcement bits (3): 2-KRT 3-BGP_RT_Background 4-Resolve tree 2
>                 AS path: 8218 2200 2426 I
>                 Communities: 8218:102 8218:20000 8218:20110
>                 Accepted
>                 Localpref: 100
>                 Router ID: 3.3.3.3
>                 Indirect next hops: 1
>                         Protocol next hop: 3.3.3.3
>                         Indirect next hop: 0x19ec4b2c 1048578 INH Session ID: 0x1b
>                         Indirect path forwarding next hops: 2
>                                 Next hop type: Router
>                                 Next hop: 2.2.2.2 via xe-2/0/3.100
>                                 Session Id: 0x18
>                                 Next hop: 2.2.2.0 via xe-2/0/2.100
>                                 Session Id: 0x17
>                         3.3.3.3/32 Originating RIB: public.inet.0
>                           Node path count: 1
>                           Forwarding nexthops: 2
>                                 Nexthop: 2.2.2.2 via xe-2/0/3.100
>
> So, I have three questions:
>
> Is it expected for a route to be flagged "active" while it is still
> queued to KRT?
>
> Is there a way to delete those invalid routes more quickly, to let
> packets use the default route during the convergence time?
>
> Is there some way to not advertise the default route in OSPF during the
> convergence time? Like a criterion: don't advertise this route when the
> KRT queue has 1000+ elements and until it reaches 0 (to avoid flapping).
>
> I am running 13.3R8.7.
>
> Thanks!
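
The criterion in the third question (stop advertising the default once
the KRT queue holds 1000+ elements, and resume only when it is back to 0)
is a hysteresis rule. The Python sketch below only illustrates that
logic: the class name and thresholds are hypothetical, it is not a Junos
feature, and how the queue length would be obtained (for instance from
"show krt queue") is left out.

```python
class DefaultRouteGate:
    """Hysteresis gate: withdraw at `high`, re-advertise only at `low`."""

    def __init__(self, high=1000, low=0):
        self.high = high        # queue length at which the default is withdrawn
        self.low = low          # queue length at which it may be advertised again
        self.withdrawn = False

    def update(self, queue_len):
        """Feed the current queue length; return True if the default may be advertised."""
        if not self.withdrawn and queue_len >= self.high:
            self.withdrawn = True
        elif self.withdrawn and queue_len <= self.low:
            self.withdrawn = False
        return not self.withdrawn


gate = DefaultRouteGate()
for queue_len in [0, 500, 1000, 800, 1, 0, 500]:
    state = "advertise" if gate.update(queue_len) else "withdraw"
    print(queue_len, state)
```

Using two thresholds rather than one keeps the advertisement from
flapping while the queue drains.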

