[c-nsp] Protecting a sup2/msfc2 in 2010: CoPP the "hard way"
Anton Kapela
tkapela at gmail.com
Sun Mar 21 11:22:30 EDT 2010
Dear Cisco,
I've implemented my own form of "CoPP" on sup2/msfc2, using features of 12.2(18)SXF in a manner somewhat like the sup720. Could you please give us the same functionality a tad more directly? Clearly, there's stuff here that can mimic the essential behavior of formal CoPP (post-routing, pre-mls exception rate-limiter), without needing further shims, acls, or policers *after* receive adjacencies are matched. After all, cisco.com reveals that SXF will be supported until 2012 for the sup2; I think this is ample time!
-Tk
--------------
Dear List,
Find below a basic outline of how I approached the issue of inbound CoPP on the sup2. I hope folks minimally find it amusing, perhaps even useful in their environments. Let me know if this works for you, or completely destroys your border/core/dist/agg layers upon application. As always, use at your own risk, YMMV, AMF YOYO, BYOB, etc.
Best,
-Tk
--------------
Fellow geeks,
We thought the sup2 was doomed to a life of uninteresting or mundane tasks, and was stuck with an anemic, "woe-is-me!" control-plane which couldn't *really* be fixed or improved in any substantive way. I figured there had to be a way around this. Indeed, PFC2 supports some mls special-case rate limiters--unfortunately, they're pretty darn broad/general, which will cause trouble if you're running protocols in/out of the MSFC... and heck, who wouldn't be? I also like TELNET and SSH to work when the box is under attack.
I started with this basic mls config:
mls qos
mls rate-limit unicast cef receive 500
mls rate-limit unicast cef glean 1000
mls rate-limit unicast acl ingress 1000
mls rate-limit unicast acl egress 1000
mls rate-limit unicast l3-features 1000
But, as you likely know, this stomps on * and 0.0.0.0/0 heading towards the MSFC; "bad."
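To put a number on "bad," here's a toy calculation of my own (invented rates, not measurements--the real limiter is in hardware and considerably less polite): with one aggregate 500 pps budget for *all* receive-adjacency punts, your BGP hellos compete head-to-head with the flood, and they lose.

```python
# Toy model of the aggregate "mls rate-limit unicast cef receive 500"
# behavior: every punted packet -- attack flood and BGP hellos alike --
# shares one 500 pps budget. All numbers here are invented.
LIMIT_PPS = 500        # cef receive rate limiter
ATTACK_PPS = 100_000   # flood aimed at a receive adjacency
HELLO_PPS = 10         # BGP hellos we actually need to survive

# Assuming the limiter admits arrivals uniformly, each flow gets a
# share of the budget proportional to its offered rate.
share = LIMIT_PPS / (ATTACK_PPS + HELLO_PPS)
hellos_through = HELLO_PPS * share

print(f"expected hellos admitted per second: {hellos_through:.3f}")
```

Well under one hello per second makes it through--the session is dead.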
Since this happens after the PFC and EARL have had their chance to examine the packet, I figured it was simply too late to expect any sort of clever post-routing fix to happen. Frustrated, I thought to look earlier in the forwarding chain, perhaps finding some way to get 'selected' packets to not hit the MSFC/EARL at all--and instead punt them to the RP cpu directly (or as close as possible to input-phase). This would then deliver packets to the RP before routing and before cef receive adjacencies could be matched & processed by the overly-broad mls rate-limiter. I figured if I could do that, I'd selectively avoid the mls stomp-stomp for specific, critical packets, while still retaining the "everything else == 500 pps!" mls rate-limiting functionality.
I began reading here:
http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.2SXF/native/configuration/guide/qos.html#wp1671353
...and noticed that indeed, the pfc can create punted paths, well ahead of any routing activity. That got me thinking, "what will punt a packet for sure? ACL LOGGING!"
I then tried a simple ingress ACL on an uplink-facing port, like so:
Extended IP access list proto-punt
10 permit ip host xx.xx.128.65 any log
20 permit ip host yy.yy.12.254 any log
30 permit ip host yy.yy.12.253 any log
40 permit tcp any any eq telnet log
50 permit ip any any
...and sure as heck, several configured BGP neighbors' updates and telnet session traffic were nicely bypassing the MSFC and MLS rate limits when the acl was activated inbound on an uplink port; hello, pre-routing punts. But, this is fairly sub-optimal--I wanted something a tad cleaner, ideally with specific policers + acl's in the path towards the RP's precious 300 mhz mips.
To that end, I needed a way to redirect packets sooner. What follows is a great list of what you ought never do on a 6500, which will most assuredly drain packets straight into parts of the router you ought not frequently traffic:
http://www.cisco.com/en/US/products/hw/switches/ps708/products_tech_note09186a00804916e0.shtml#acl
I'd like to highlight one line: "Policy-routed traffic, with use of match length, set ip precedence, or other unsupported parameters"
I banged this out to test the theory:
route-map puntwhat permit 10
match ip address puntzilla
set ip precedence internet
Extended IP access list puntzilla
10 permit ip host xx.xx.128.65 any
20 permit ip host yy.yy.12.253 any
30 permit ip host yy.yy.12.254 any
35 permit tcp any any eq 22 (4 matches)
40 permit tcp any any eq telnet (6 matches)
...after application of "ip policy route-map" on the inpu(n)t interface, we see that the PFC acl tcam was indeed getting programmed as we requested:
test6509#sh tcam interface fastEthernet 6/1 acl in ip
* Global Defaults shared
permit ip any 224.0.0.0 15.255.255.255
redirect tcp any any fragments
redirect tcp any any eq 22 (match)
redirect tcp any any eq telnet (match)
redirect ip host xx.xx.128.65 any (match)
redirect ip host yy.yy.12.253 any (match)
redirect ip host yy.yy.12.254 any (match)
permit ip any any fragments
permit ip any any (match)
Perfect, mostly.
I fired up several UDP and ICMP generators, aimed them at control plane addresses (bound to loopbacks, and other interfaces). After a half hour of ~90 megabits and 100 kpps, it was pretty clear the formula was working.
While the attack was underway, ip input saw about 8% cpu load--recall that our mls configuration is permitting 500 pps of adjacency punts. I also noticed that my bgp sessions were still fine, even with 2 second hellos and 10 second dead intervals, mind you. Of course they should be fine: the port wasn't at line rate, and there was no way to discard or miss the hello packets. Next step, clearly: build narrower policer statements and couple them with a policy map to gain better "oh lordy, the NMS is DDoSing the MSFC" capability. Note the "microflow" policer--much nicer than stomping on all control-plane-matched sources at once.
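If you want the intuition behind microflow policing, here's a toy sketch of mine--per-source token buckets standing in for the PFC's flow policer, with invented addresses, packet counts, and sizes. The point is that a flooding source only exhausts its *own* bucket:

```python
# Sketch of why a microflow (per-source) policer beats an aggregate
# one: each source gets its own token bucket, so a chatty NMS or an
# attacker only burns its own budget. One burst interval, no refill,
# for simplicity; the real policer refills at the configured bps rate.
from collections import defaultdict

BURST_BYTES = 64_000   # the "64000" burst in "police flow 1000000 64000"

buckets = defaultdict(lambda: {"tokens": BURST_BYTES})

def police(src, nbytes):
    """Return True (transmit) if this source still has tokens,
    else False (exceed-action drop)."""
    b = buckets[src]
    if b["tokens"] >= nbytes:
        b["tokens"] -= nbytes
        return True
    return False

# A flooding host offers ~1.5 MB of 1500-byte packets; a quiet BGP
# peer sends five 60-byte keepalives. Addresses are made up.
flood_ok = sum(police("10.0.0.66", 1500) for _ in range(1000))
bgp_ok = sum(police("10.0.0.1", 60) for _ in range(5))

print(flood_ok, bgp_ok)
```

The flooder gets 42 packets through before its 64 KB burst allowance runs dry; the BGP peer's bucket is untouched and all five keepalives pass.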
Here's my present working config. Please note the efficient re-use of copp-specific ACLs in the route-map and class-maps:
ip access-list extended bgp
permit tcp host xx.xx.128.65 any eq bgp
permit tcp host yy.yy.12.254 any eq bgp
permit tcp host yy.yy.12.253 any eq bgp
ip access-list extended snmp
permit udp host xx.xx.131.77 any eq snmp
ip access-list extended vty
permit tcp host qq.qq.195.69 any eq telnet
permit tcp host qq.qq.195.69 any eq 22
class-map match-any snmp
match access-group name snmp
class-map match-any bgp
match access-group name bgp
class-map match-any vty
match access-group name vty
policy-map fake-copp
class vty
police flow 1000000 64000 conform-action transmit exceed-action drop
class bgp
police flow 5000000 64000 conform-action transmit exceed-action drop
class snmp
police flow 1000000 64000 conform-action transmit exceed-action drop
route-map fake-copp permit 10
match ip address bgp vty snmp
set ip precedence internet
...finally, apply 'em to the interface with upstream/transit on it:
interface FastEthernet6/1
description Uplink
ip address vv.vv.96.146 255.255.255.252
no ip redirects
no ip unreachables
no ip proxy-arp
ip policy route-map fake-copp
load-interval 30
speed 100
duplex full
service-policy input fake-copp
hold-queue 1024 in
hold-queue 1024 out
...then, to see if it's working, summon the floods:
# cat /dev/zero | nc -u vv.vv.96.132 1234 &
...from host xx.xx.134.82, which is not listed in any punt acls... what we get is ~500 pps of mls-limited replies, which look nice and lossy as you start flooding the control plane of the router:
64 bytes from vv.vv.96.146: icmp_seq=23198 ttl=241 time=74.032 ms
64 bytes from vv.vv.96.146: icmp_seq=23199 ttl=241 time=91.883 ms
64 bytes from vv.vv.96.146: icmp_seq=23200 ttl=241 time=44.694 ms
64 bytes from vv.vv.96.146: icmp_seq=23201 ttl=241 time=45.871 ms
Request timeout for icmp_seq 370
Request timeout for icmp_seq 371
Request timeout for icmp_seq 372
Request timeout for icmp_seq 373
Request timeout for icmp_seq 374
Request timeout for icmp_seq 375
Request timeout for icmp_seq 376
Request timeout for icmp_seq 377
64 bytes from vv.vv.96.146: icmp_seq=378 ttl=241 time=54.345 ms
Request timeout for icmp_seq 379
64 bytes from vv.vv.96.146: icmp_seq=380 ttl=241 time=55.394 ms
Request timeout for icmp_seq 381
64 bytes from vv.vv.96.146: icmp_seq=382 ttl=241 time=54.155 ms
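That loss pattern squares with a quick sanity check (my arithmetic, assuming the limiter picks its 500 survivors uniformly from everything punted--the flood rate is the RXPS figure from the interface counters further below):

```python
# Rough check: our ~1 pps ping competes with the ~7900 pps UDP flood
# for the 500 pps cef-receive budget. Uniform competition between the
# two flows is my own simplifying assumption.
FLOOD_PPS = 7908   # RXPS on Fa6/1 during the test
PING_PPS = 1
LIMIT_PPS = 500

survival = LIMIT_PPS / (FLOOD_PPS + PING_PPS)
print(f"expected ping survival: {survival:.1%}")
```

Only a small fraction of pings should survive, which is about what the mostly-timeouts output above shows.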
Meanwhile, BGP and telnet/snmp are A-ok. Check out the nice ingress traffic going nowhere:
test6509#sis
Interface IHQ IQD OHQ OQD RXBS RXPS TXBS TXPS TRTL
-------------------------------------------------------------------------
* FastEthernet6/1 0 26 0 0 88734000 7908 2000 3 0
* Loopback10 0 0 0 0 0 0 0 0 0
...and a sampling of non-zero cpu time, sorted processes:
test6509#ps
CPU utilization for five seconds: 19%/8%; one minute: 31%; five minutes: 28%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
116 49520 104643 473 7.91% 9.70% 5.91% 0 IP Input
6 380252 19344 19657 2.63% 1.69% 1.31% 0 Check heaps
303 167940 131174 1280 0.39% 1.56% 4.01% 0 BGP Router
152 53032 8063 6577 0.07% 0.14% 0.44% 0 IP RIB Update
36 12408 1034 12000 0.00% 0.03% 0.00% 0 Per-minute Jobs
81 140084 37847 3701 0.00% 1.10% 2.51% 2 Virtual Exec
151 67600 1905 35485 0.00% 0.20% 0.18% 0 IP Background
76 4016 6230 644 0.00% 0.01% 0.00% 0 Compute load avg
166 11792 42288 278 0.00% 0.02% 0.01% 0 CEF process
181 436 7090 61 0.00% 0.01% 0.00% 0 TTFIB XDR msgQ T
304 11988 52418 228 0.00% 0.07% 0.18% 0 BGP I/O
305 2572192 130472 19714 0.00% 10.28% 8.59% 0 BGP Scanner
Now we're looking pretty *solid* ...that bgp ain't going nowhere but ESTABLISHED!
Note ip input chewing on some cpu, about 500 pps worth of udp datagrams, which clearly don't get handled by anything. The policy map is doing its thing, too:
test6509#sh policy-map interface
FastEthernet6/1
Service-policy input: fake-copp
Class-map: vty (match-any)
927 packets, 61846 bytes
30 second offered rate 2000 bps, drop rate 0 bps
Match: access-group name vty
927 packets, 61846 bytes
30 second rate 2000 bps
Class-map: bgp (match-any)
7830 packets, 9471692 bytes
30 second offered rate 28000 bps, drop rate 0 bps
Match: access-group name bgp
7830 packets, 9471692 bytes
30 second rate 28000 bps
Class-map: snmp (match-any)
0 packets, 0 bytes
30 second offered rate 0 bps, drop rate 0 bps
Match: access-group name snmp
0 packets, 0 bytes
30 second rate 0 bps
Class-map: class-default (match-any)
321362 packets, 448351908 bytes
30 second offered rate 11235000 bps, drop rate 0 bps
Match: any
...finally, a review of the port acl tcam programming appears as we'd expect:
test6509#sh tcam interface fastEthernet 6/1 acl in ip
* Global Defaults shared
permit ip any 224.0.0.0 15.255.255.255
redirect tcp host xx.xx.128.65 any fragments
redirect tcp host yy.yy.12.254 any fragments
redirect tcp host yy.yy.12.253 any fragments
redirect tcp host qq.qq.195.69 any fragments
redirect udp host xx.xx.131.77 any fragments
redirect tcp host xx.xx.128.65 any eq bgp (match)
redirect tcp host yy.yy.12.254 any eq bgp (match)
redirect tcp host yy.yy.12.253 any eq bgp (match)
redirect tcp host qq.qq.195.69 any eq telnet (match)
redirect tcp host qq.qq.195.69 any eq 22
redirect udp host xx.xx.131.77 any eq snmp
permit ip any any fragments (match)
permit ip any any (match)
With that, enjoy your newly CoPP'd sup2.