[j-nsp] KRT Queue issue (was: Re: bfd = busted failure detection :)

David Ball davidtball at gmail.com
Wed Jan 6 11:26:03 EST 2010


  Ran into this again on a T640 running a 9.2SR release a couple of days
ago.  The KRT queue had 2-3k operations queued yesterday and has over 52k
queued now.  An RPD core dump was apparently the trigger
the LAST time this happened to us, but there was no core dump this time,
so it's starting to look a little more like the apparently random
occurrence the rest of you have been experiencing.  The case is with ATAC.

David B


2009/12/16 Richard A Steenbergen <ras at e-gerbil.net>:
> On Tue, Dec 15, 2009 at 11:03:08PM -0600, Kevin Day wrote:
>>
>> I went back and forth on this forever (pestering you while doing it),
>> because it was affecting us badly on old M20s. My "lab" boxes would
>> never ever show the problem, but it would happen on the production
>> routers. I finally gave up and decided to figure out what the
>> difference was between my production configuration and the lab
>> simulation by slowly changing my production config to match the nearly
>> identical lab config.
>>
>> The problem went away when I removed a BGP session with a peer that
>> was extremely slow to accept routes, and we were exchanging full
>> tables with each other. I think it was some kind of deadlock where the
>> peer wasn't accepting routes because it was blocked trying to send me
>> stuff, and I was in the same boat. Snooping at the TCP layer, I didn't
>> see anything unusual except both peers ended up in a state where they
>> were advertising 0 window size to each other. The moment the KRT queue
>> cleared up, they finished exchanging routes and all was happy.
>>
>> I can't say for certain that was the problem, but shutting down that
>> peer was a pretty reliable way to clear the KRT queue problem whenever
>> it happened.
>
> What code was this? In theory shouldn't the routes be in a bgp queue
> regardless of what's happening with the tcp layer? Should see if we can
> reproduce this with modern hardware and code.
>
> --
> Richard A Steenbergen <ras at e-gerbil.net>       http://www.e-gerbil.net/ras
> GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)
> _______________________________________________
> juniper-nsp mailing list juniper-nsp at puck.nether.net
> https://puck.nether.net/mailman/listinfo/juniper-nsp
>
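
For anyone trying to picture the mutual zero-window state Kevin describes
above, here is a toy model of it. This is only a sketch under stated
assumptions: each peer is assumed to do a blocking write of its whole table
and to read nothing until that write completes. It is not a model of rpd,
the KRT queue, or any real BGP implementation, and the names (Peer,
simulate, read_while_blocked) are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Peer:
    name: str
    pending: int        # route updates this peer still has to send
    capacity: int       # receive buffer size, counted in updates
    buffered: int = 0   # updates received but not yet read

    @property
    def window(self) -> int:
        # Advertised receive window: free space left in the buffer.
        return self.capacity - self.buffered


def simulate(routes, capacity, read_while_blocked, max_ticks=10_000):
    """Return (status, window_a, window_b) for two peers swapping tables."""
    a = Peer("A", routes, capacity)
    b = Peer("B", routes, capacity)
    for _ in range(max_ticks):
        progress = False
        for me, other in ((a, b), (b, a)):
            if me.pending > 0:
                # Try to send as much as the other side's window allows.
                n = min(me.pending, other.window)
                if n > 0:
                    me.pending -= n
                    other.buffered += n
                    progress = True
                elif read_while_blocked and me.buffered > 0:
                    # Blocked on send, but still draining our own buffer.
                    me.buffered = 0
                    progress = True
            elif me.buffered > 0:
                # Finished sending, so now we read what the peer sent us.
                me.buffered = 0
                progress = True
        if a.pending == a.buffered == b.pending == b.buffered == 0:
            return ("done", a.window, b.window)
        if not progress:
            return ("deadlock", a.window, b.window)
    return ("timeout", a.window, b.window)


if __name__ == "__main__":
    print(simulate(1000, 64, read_while_blocked=False))
    print(simulate(1000, 64, read_while_blocked=True))
```

In this model, a table larger than the receive buffer stalls with both
windows at 0 when neither side reads while blocked. A table that fits in
the buffer, or a peer that keeps reading while its send is blocked, runs
to completion.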

