[Outages-discussion] [outages] CenturyLink peering issues?

Matthew Petach matt at petach.org
Sun Aug 30 14:36:28 EDT 2020


On Sun, Aug 30, 2020 at 10:06 AM Charles Sprickman <spork at bway.net> wrote:

> (on -discuss)
>

[...]


> Second thing, someone better versed in the intricacies of BGP better than
> I am, can you explain a bit about how we end up in a situation where L3
> continues advertising withdrawn routes for more than an hour?
>
> Thanks,
>
> Charles
>
>
Without naming a particular vendor, I'll relate a story from a few years
ago.

It involved BGP communities; at a particular company, we'd made *very*
extensive use of BGP communities for identifying route type, route
origin, backbone nodes the route had passed through, propagation scope,
special handling rules; all the usual sort of stuff you really want to
track through your network for controlling where and when routes are
used.

That was all fine and good.  But, to make it all automatable and
scriptable, we'd organized the BGP community number space as a bit
field vector, with each digit having a hierarchical meaning; think
ASN.1 notation, but encapsulated within BGP community values.
Wonderful for machine parsing and generating.  Tended to make for
highly dense clusters within the BGP community number space, however.
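
To make the "each digit has a meaning" idea concrete, here is a rough
Python sketch.  The field names, digit layout, and the private-use ASN
64512 are all made up for illustration; the actual scheme we used is
not described here.  It only shows how hierarchical fields can be
packed into the low 16 bits of a standard 32-bit community, and why the
resulting values sit close together in the number space.

```python
from typing import NamedTuple

ASN = 64512  # private-use ASN, chosen only for illustration


class Tag(NamedTuple):
    """Hypothetical per-digit fields; not the real layout."""
    kind: int    # route type, one decimal digit, 0-5
    region: int  # route origin, one decimal digit, 0-9
    node: int    # backbone node, two decimal digits, 0-99
    scope: int   # propagation scope, one decimal digit, 0-9


def encode(tag: Tag, asn: int = ASN) -> int:
    """Pack the fields into a 32-bit community (ASN in the high half)."""
    if not (0 <= tag.kind <= 5 and 0 <= tag.region <= 9
            and 0 <= tag.node <= 99 and 0 <= tag.scope <= 9):
        raise ValueError("field out of range")
    # Largest possible value is 59999, which fits in 16 bits.
    low = tag.kind * 10000 + tag.region * 1000 + tag.node * 10 + tag.scope
    return (asn << 16) | low


def decode(community: int) -> Tag:
    """Recover the fields from the low 16 bits."""
    low = community & 0xFFFF
    return Tag(low // 10000, (low // 1000) % 10, (low // 10) % 100, low % 10)


def render(community: int) -> str:
    """Format as the usual ASN:value notation."""
    return f"{community >> 16}:{community & 0xFFFF}"
```

With this layout, render(encode(Tag(3, 1, 42, 7))) gives "64512:31427",
and every community the network generates shares the same high 16 bits.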

Turns out that wasn't great from the router vendor's perspective; they
had (erroneously) assumed BGP communities would generally be well
distributed through the number space, so doing a top level hash for
storing communities, with a linked list in each hash bucket on the off
chance more than one value hashed into the same bucket seemed
reasonable.

And then it came face to face with our densely-clustered bit field
vector structure, and that hash function meant that there were a small
number of hash buckets with *very* long linked-lists of BGP communities
in them which had to be traversed for *every* route update.  It was
taking *hours* for BGP updates to get processed, as the router was
applying the hash function, hitting a particular bucket...and then
walking a *very* long linked list of entries within that bucket.
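
The effect is easy to reproduce in a toy model.  The Python sketch
below is not the vendor's code; their hash function is unknown to me,
so weak_hash is a stand-in that simply keeps the low 8 bits, and the
clustered input is 256 made-up communities that differ only in bits
8..15.  Python lists stand in for the per-bucket linked lists.  For
comparison, mixing_hash uses multiplicative (Fibonacci) hashing, one
common way to spread clustered keys.

```python
N_BUCKETS = 256


def weak_hash(value: int) -> int:
    # Hypothetical stand-in for a hash that assumes well-spread input:
    # it looks only at the low 8 bits of the community value.
    return value % N_BUCKETS


def mixing_hash(value: int) -> int:
    # Multiplicative (Fibonacci) hashing: multiply by a large odd
    # constant and keep the top 8 bits of the 32-bit product.
    return ((value * 2654435769) & 0xFFFFFFFF) >> 24


class ChainedTable:
    """Hash table with one chain (list) per bucket."""

    def __init__(self, hash_fn):
        self.hash_fn = hash_fn
        self.buckets = [[] for _ in range(N_BUCKETS)]

    def insert(self, value: int) -> None:
        chain = self.buckets[self.hash_fn(value)]
        if value not in chain:
            chain.append(value)

    def lookup_steps(self, value: int) -> int:
        """Entries examined to find value (chain length if absent)."""
        chain = self.buckets[self.hash_fn(value)]
        for steps, entry in enumerate(chain, start=1):
            if entry == value:
                return steps
        return len(chain)

    def longest_chain(self) -> int:
        return max(len(chain) for chain in self.buckets)


def clustered_communities(asn: int = 64512) -> list:
    # Made-up cluster: same ASN, low byte always zero, so the values
    # differ only in bits 8..15.
    return [(asn << 16) | (field << 8) for field in range(256)]


def build(hash_fn) -> ChainedTable:
    table = ChainedTable(hash_fn)
    for community in clustered_communities():
        table.insert(community)
    return table
```

With weak_hash, all 256 values land in bucket 0, so finding the last
one inserted means walking the entire chain; build(mixing_hash) spreads
the same values across many buckets and keeps every chain short.
Multiply that per-lookup walk by every community on every route update
and the hours start to make sense.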

The vendor in question captured a bunch of data, realized they needed
a different hash function that could deal with highly-clustered data like
this without overloading a few buckets, and built a new version of code
for us to deploy.

BGP update propagation went from hours down to minutes again.

I have no idea what triggered today's CenturyLink issue; that will
take a post mortem deep dive on their part to identify.

But *how* can it happen?

Well, this story is one example of how it could (and did) happen in
a fairly large network a few years ago.   ^_^;;

Thanks!

Matt