[f-nsp] L4 CAM exhaustion

Fabio Mendes fabio.mendes at bsd.com.br
Fri May 21 09:09:23 EDT 2010


On Fri, May 21, 2010 at 4:36 AM, Jeroen Wunnink <jeroen at easyhosting.nl> wrote:

> Very familiar behaviour: the CAM runs out, the CPU kicks in to handle
> packets, something the BigIron really isn't good at (read: sucks at), and
> connections start dropping, VLANs flap on and off, sessions reset, etc.
>
> If I look at the output you attached (it shows plenty of info to debug
> this :-), L4 CAM is low on several blades/slots (the number after 'free'
> is the magic number here; you do not want to see it at or near 0):
>
> L4 pool 0 index range:
>  (sw) 48640 - 49151    (0x0be00 - 0x0bfff), free 4 (0x00004)
>  (hw) 32256 - 32767    (0x07e00 - 0x07fff)
>
> Monitor whether this stays stable at 4 or hops up and down between 0 and
> some higher value. You have this on several slots/blades for the various
> L4 parts (there's even one at 0):
>
> L4 pool 3 index range:
>  (sw) 40960 - 43519    (0x0a000 - 0x0a9ff), free 0 (0x00000)
>  (hw) 24576 - 27135    (0x06000 - 0x069ff)
>
>
> So it seems you do a lot with ACLs and these are using up a major part
> of your CAM, yet I see L2 and L3 CAM happily in the 1000s free, so
> I suggest you repartition your CAM space to assign more L4 CAM and less
> L2+L3 CAM (this will require a reload of your BigIron).
>
> Currently 25% of your CAM is designated for L4 purposes (Layer4 = 8192
> (0.5Mbits)    (25%)), so you need to increase this.
>
> You can do this globally on your device when in config mode with:
> cam-partition l2 5 l3 20 l4 75
>
> This will assign 5% to L2, 20% to L3 and 75% to L4, see if this resolves
> your problems (make sure the L2 and L3 don't get too low either, you need to
> balance some magic between these numbers)
>
>
>
> On 5/20/10 9:04 PM, Fabio Mendes wrote:
>
>> Hello Guys,
>>
>> We are facing a strange situation on a customer.
>>
>> During random periods, BigIron syslogs this kind of message:
>>
>> May 20 15:24:13:I:System: CPU protection action2 deactivated at 000b7e00
>> May 20 15:23:48:I:System: CPU protection action2 activated at 000b7d00
>> May 20 15:23:22:I:System: CPU protection action2 deactivated at 000b7c00
>> May 20 15:22:57:I:System: CPU protection action2 activated at 000b7b00
>> May 20 15:22:31:I:System: CPU protection action2 deactivated at 000b7a00
>> May 20 15:22:05:I:System: CPU protection action2 activated at 000b7900
>> May 20 15:21:40:I:System: CPU protection action2 deactivated at 000b7800
>> May 20 15:21:14:I:System: CPU protection action2 activated at 000b7700
>>
>
> --
>
> Kind regards,
>
> Jeroen Wunnink,
> EasyHosting B.V. Systeembeheerder
> systeembeheer at easyhosting.nl
>
> telefoon:+31 (035) 6285455              Postbus 48
> fax: +31 (035) 6838242                  3755 ZG Eemnes
>
> http://www.easyhosting.nl
> http://www.easycolocate.nl
>
>
>
Thanks for confirming I'm not crazy !

This is exactly what I've been saying to this customer: you're using too many
L4 entries, so resize the CAM to leave plenty of room for L4.
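
To put rough numbers on the resize, here is a small Python sketch. It starts
from the figure quoted above (Layer4 = 8192 entries at 25%) and applies the
suggested "cam-partition l2 5 l3 20 l4 75" split. It assumes entry counts
scale linearly with the percentage, which is an illustration only; the
hardware may round to block boundaries, and the function name is made up.

```python
def cam_entries(total, l2_pct, l3_pct, l4_pct):
    """Split a total CAM entry count by percentage (hypothetical helper)."""
    if l2_pct + l3_pct + l4_pct != 100:
        raise ValueError("percentages must add up to 100")
    return {
        "l2": total * l2_pct // 100,
        "l3": total * l3_pct // 100,
        "l4": total * l4_pct // 100,
    }

# 8192 L4 entries are reported as 25% of the CAM, so the implied total is:
total = 8192 * 100 // 25

# Entry counts under the proposed split (linear scaling assumed).
proposed = cam_entries(total, l2_pct=5, l3_pct=20, l4_pct=75)
for layer, entries in proposed.items():
    print(f"{layer}: {entries} entries")
```

Under that assumption L4 would go from 8192 to about three times as many
entries, while L2 would shrink considerably, which is why the L2 and L3 free
counts need checking before committing to the change.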

But this customer (a VERY stubborn one, BTW) claims something like "but this
never happened before, how come this happens now ?"

I'll go back there and try to take a more "psychological" than technical
approach this time.

BTW, why doesn't the output show how many L4 entries in hardware are being
used?

When it displays:

L4 pool 3 index range:
 (sw) 40960 - 43519    (0x0a000 - 0x0a9ff), free 0 (0x00000)
 (hw) 24576 - 27135    (0x06000 - 0x069ff)

sw means software and hw means hardware (duh!), but what is their practical
meaning anyway ?
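
In the meantime, a used count can at least be estimated from what the output
does show. The Python sketch below parses the pool lines and computes
size = end - start + 1 and used = size - free. It assumes that "free" counts
unused entries within the (sw) index range, which is a guess and not something
confirmed in this thread. The function name is made up for illustration.

```python
import re

POOL_RE = re.compile(r"L4 pool (\d+) index range:")
SW_RE = re.compile(r"\(sw\)\s+(\d+)\s*-\s*(\d+).*?free\s+(\d+)")

def l4_pool_usage(text):
    """Return {pool: (size, free, used)} parsed from the pool output."""
    pools = {}
    current = None
    for line in text.splitlines():
        header = POOL_RE.search(line)
        if header:
            current = int(header.group(1))
            continue
        sw = SW_RE.search(line)
        if sw and current is not None:
            start, end, free = (int(g) for g in sw.groups())
            size = end - start + 1          # index range is inclusive
            pools[current] = (size, free, size - free)
            current = None
    return pools

sample = """\
L4 pool 0 index range:
 (sw) 48640 - 49151    (0x0be00 - 0x0bfff), free 4 (0x00004)
 (hw) 32256 - 32767    (0x07e00 - 0x07fff)
L4 pool 3 index range:
 (sw) 40960 - 43519    (0x0a000 - 0x0a9ff), free 0 (0x00000)
 (hw) 24576 - 27135    (0x06000 - 0x069ff)
"""

usage = l4_pool_usage(sample)
for pool, (size, free, used) in sorted(usage.items()):
    print(f"pool {pool}: size {size}, free {free}, used {used}")
```

For the two pools shown, the (sw) and (hw) ranges span the same number of
indexes (512 and 2560), so the estimate does not depend on which range is used.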


-- 

CCNA - Cisco Certified Network Associate
CCNP - Cisco Certified Network Professional

"A bird that you set free may be caught again, but a word that escapes your
lips will not return." Jewish Proverb