[c-nsp] NPE G1, CEF and ACLs and high CPU

David Hawthorne dhawth at bitgravity.com
Tue Sep 9 02:06:28 EDT 2008


On Sep 8, 2008, at 9:32 PM, Adrian Chadd wrote:

> Bill is practically right. The semantics for Cisco ACLs aren't  
> "here's a set
> of IP ranges, apply this behaviour", they're a linear walk of rules  
> from
> top to bottom applying behaviour at each step. Collapsing that into  
> the
> smallest set of possible operations is -not- taught at first/second  
> year
> computer science. Eg:
>
> * permit ssh 1.2.3.0/24
> * deny ssh 1.2.4.8/30
> * permit ssh 1.2.0.0/16
> * deny ssh 1.0.0.0/8
>
> Please yank the first year computer science curriculum bit which  
> provides
> the student with the clue required to algorithmically determine the  
> smallest
> set of permit/deny's keeping the above semantics correct. Then do  
> some basic
> analysis to find out what the resource bounds are on determining that.
> Oh, then prove that you can evaluate it in "more or less O(1) time".  
> Go on,
> I dare you :) (Note: I -think- the TCAM ACL rule programming in  
> later IOSes
> can do that? Perhaps Rodney or someone else from Cisco can  
> comment.. :)
>
> There's some rule folding (and compiling?) stuff that I've heard  
> about in
> later software-forwarding IOSes which attempt to mitigate this  
> somewhat but
> I've never really sat down and (ab)used it.
>

Nice.

I actually worked at $big_company during and after Bill's tenure  
there, and I had to step into his rather large boots developing the  
ACL system after he left.  By the time I was done, we actually had all  
of that, because we has run into issues with filling up TCAM on those  
6500s and needed to get some aggregation and cruft-removal done.  TCAM  
is used for quite a few things in IOS, as you're probably aware.  I  
couldn't get it any better than O(n!), and it took forever and a day  
to run against all of the ACLs, although that was due in part to the  
fact that Junipers allow multiple source and destination subnets and  
port ranges and protocols to have to test against.  The ACLs were also  
mind-bogglingly huge.  Cisco rules were like very skinny Juniper  
terms, so they went pretty quick.  It's surprisingly simple once you  
can get the ACLs parsed into a consistent data structure, assuming  
you're dealing primarily with the common case as given above and not  
dealing with special actions.

So it *has* been done.  Just not at Cisco.  Necessity is the mother of  
invention.  :p

btw, one of the surprising tricks we learned was that the range  
start_port end_port specification won't fill up TCAM on the 6500/7600  
IFF your port ranges fall on bit boundaries just like networks do.


More information about the cisco-nsp mailing list