[c-nsp] CoPP not catching software-switched CEF
Blake Willis
cnsp at 2112.net
Tue Dec 18 16:51:13 EST 2007
On Tue, 18 Dec 2007, Saku Ytti wrote:
Thanks for getting back so quickly with useful info, Ytti.
> How much is the 31% above baseline? If your baseline is above 0%, only mystery
> from my point of view we need to figure out is why you are software switching.
Baseline is usually around 5% or less. The vast majority of it is
usually IPSec AH, and I can understand why the PFC can't forward that & needs
to punt it. Perhaps the PFC3CXL is better at this? (not holding my breath though)
> match-all is not supported.
The config is loaded as "class-map copp-ip" and the "match-all" is added by
the mucular QoS conflaguraterator by itself. The docs (and most other examples
I've seen) seem to use "match-all". In general the CoPP filter in place has
usefully blocked plenty of other stuff in the past (mostly ICMP & UDP floods)
while preserving protocol traffic as normal:
suffering-bxl#sh mls qos ip | incl CPP|Dir
 Int  Mod  Dir  Class-map  DSCP  Agg  Trust  Fl  AgForward-By  AgPoliced-By
 CPP    5  In   pingme        0    3  dscp    0      39385972             0
 CPP    5  In   copp-ip       0    4  dscp    0       1073926         20623
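
To put those counters in proportion, here is a small sketch that works out what share of the offered traffic each class actually policed. It assumes the "AgForward-By" and "AgPoliced-By" columns are byte counts; the dictionary layout and the policed_share helper are just illustration, not anything the box provides.

```python
# Counters copied from the "sh mls qos ip" output above.
# Assumption: the "-By" columns are aggregate byte counts.
counters = {
    "pingme": {"forwarded": 39385972, "policed": 0},
    "copp-ip": {"forwarded": 1073926, "policed": 20623},
}


def policed_share(forwarded, policed):
    """Return the fraction of offered bytes dropped by the aggregate policer."""
    offered = forwarded + policed
    return policed / offered if offered else 0.0


for name, c in counters.items():
    share = policed_share(c["forwarded"], c["policed"])
    print(f"{name}: {share:.2%} of offered bytes policed")
```

So copp-ip has policed a little under 2% of what it has seen, and pingme nothing at all.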
> Do you run same CoPP config, in the previous non-affected PFC3x box?
Yep, CoPP (and the rest of the config) is as unified as possible on every box.
> remote command switch show tcam interface vlan <CoPP vlan> qos type2 ip
Thanks, that's really useful. The CoPP policy is loaded properly.
I've also found this to be helpful in the past (4087 being my CoPP vlan):
suffering-bxl-sp#show qm-sp index2label | incl 4087
index[4087] value[0x6001]
suffering-bxl-sp#sh qm-sp label 6001
SP policing bucket statistics:
2 min overruns/ag_id/bucket: lo[4/2/2920] hi[4/2/2920]
Labels:
Flow Policers:
Be careful with the qm-sp stuff though: the output is not paginated, so
you can sit there for a while waiting for it to go by... It's too bad one can't
see something like "show qm-sp port-data" for the EOBC/IBC or CoPP as it's
really nice to be able to actually see the buffer sizes on physical ports.
BTW, after digging up the "7600 cheat-sheet" from the archives, the
answer is that soft-switched CEF is punted to the MSFC via the IBC interface,
and CoPP seems to be applied to the EOBC:
suffering-bxl#sh ibc | incl rate
5 minute rx rate 2082000 bits/sec, 2356 packets/sec
5 minute tx rate 4400000 bits/sec, 2473 packets/sec
suffering-bxl#sh eobc | incl rate
rx rate (bits/sec) = 2486000 rx rate (packets/sec) = 235
tx rate (bits/sec) = 119000 tx rate (packets/sec) = 228
Unlike the EOBC, the IBC doesn't seem to have an ifindex so you can't
graph it. Perhaps there's a MIB for this...
> I think it does, but personally, I don't care if it does or if it does not,
> since if you're software switching in MSFC3 you're dead with or without CoPP,
Shouldn't CoPP limit punts from the PFC before they hit the CPU?
Obviously, for software-switched CEF punted via the IBC, it doesn't.
> if you are not dead, buy cheaper faster software switching box. So at this
> time, be be more interested to find out, why it was software switched.
That's certainly the ideal way to avoid this specific incident
recurring, but I blocked the attack long ago and have little idea as to what
was causing it to be punted in the first place. The only thing that would
seem to cause traffic to be PFC-forwarded in one box and punted in another is
TTL=1, which is rate-limited to 500pps and would be process-switched by the "IP
Input" process anyway, so it could be seen occupying buffers etc. BTW this is
what the bugger looks like:
Flow Record:
Flags = 0x00000000
size = 52
mark = 0
srcaddr = <snip>
dstaddr = <snip>
first = 1197914012 [2007-12-17 18:53:32]
last = 1197914095 [2007-12-17 18:54:55]
msec_first = 635
msec_last = 599
dir = 0
tcp_flags = 0x10 .A....
prot = 0
tos = 0
input = 56
output = 40
srcas = <snip>
dstas = <snip>
srcport = 0
dstport = 0
dPkts = 4582555
dOctets = 384934620
Unless of course the L3 etherchannel or POS card couldn't handle IP
protocol 0 traffic in hardware. BTW I'm set to "port-channel load-balance
src-dst-ip" and "mls ip cef load-sharing full simple". My original idea was
that "IP proto 0 srcport 0 dstport 0" traffic gets punted because the
etherchannel hash goes bananas when that comes in. That may not apply, as the
hash shouldn't be looking further into the header than the src/dest IP, but it
may apply to MLS CEF. Only one way to test it, but my spare POS card is in
another country... I could probably try with two outbound LAN ints set at a
smaller MTU than the inbound ints instead, just to see what happens.
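
For a sense of scale, here is a rough sketch that turns the flow record above into rates. The numbers are copied straight from the record; the flow_rates helper and its field names are only illustrative, and it assumes first/last are epoch seconds with msec_first/msec_last as their millisecond parts.

```python
# Values copied from the flow record above.
FIRST_S, FIRST_MS = 1197914012, 635
LAST_S, LAST_MS = 1197914095, 599
D_PKTS = 4582555
D_OCTETS = 384934620


def flow_rates(first_s, first_ms, last_s, last_ms, pkts, octets):
    """Derive duration, packet rate, bit rate and mean packet size for one flow."""
    # Assumption: the msec fields are the sub-second part of first/last.
    duration = (last_s - first_s) + (last_ms - first_ms) / 1000.0
    return {
        "duration_s": duration,
        "pps": pkts / duration,
        "bps": octets * 8 / duration,
        "avg_pkt_bytes": octets / pkts,
    }


rates = flow_rates(FIRST_S, FIRST_MS, LAST_S, LAST_MS, D_PKTS, D_OCTETS)
for key, value in rates.items():
    print(f"{key}: {value:,.1f}")
```

That comes out at roughly 55 kpps and 37 Mbit/s of 84-byte packets for about 83 seconds, all in a single flow, against the couple of thousand pps the IBC shows at rest and the 500pps TTL=1 rate-limiter.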
Only this one flow was soft-switched so it's likely not resource
exhaustion, especially as we make very light usage of acl/qos tcam on core
boxes.
suffering-bxl#sh tcam counts ip
            Used     Free   Percent Used   Reserved
            ----     ----   ------------   --------
 ACL_TCAM
 --------
   Masks:      3     4087         0            72
 Entries:      7    32719         0           576
 QOS_TCAM
 --------
   Masks:     19     4074         0            18
 Entries:     56    32685         0           144
suffering-bxl#sh mls qos free-agram
Total Number of Available AG RAM indices : 1023
Module [5]
Free AGIDs : 1018
Adjacencies look OK too (including in the CWAN card):
suffering-bxl#sh mls cef adj entr 311314 det
Index: 311314 smac: 00d0.0337.9c00, dmac: 0000.0910.0000
mtu: 4488, vlan: 4060, dindex: 0x0, l3rw_vld: 1
format: MAC_TCP, flags: 0x2000008408
delta_seq: 0, delta_ack: 0
packets: 3847042483, bytes: 733883009662
suffering-bxl#sh mls cef adj special | incl 0x2000008408
format: MAC_TCP, flags: 0x2000008408 (receive)
- - - - -
What I'm really interested in is finding out why CoPP didn't limit this
stuff & what I can do about it; we need a rate-limiter for general soft-switched
CEF netint traffic so we don't have to care too much about the trash-of-the-week
that gets forwarded in the netint path. I think that the CAR solution is
looking like the answer...
Thanks again for your help.
---
Blake Willis
Network Engineer
blake at 2112 dot net