[c-nsp] TCAM troubles on 3750 stack

Sat Apr 29 02:24:13 EDT 2006

On (2006-04-28 21:49 +0200), Alexander Gall wrote:

> swiCP2#sh platform ip unicast route 
> Dumping IOS-HL3U Fib info
> Fib 0.0.0.0/0 Tbl:0 Bucket:0
>         Path(0)AdjIP:130.59.36.9 Vl:1006 000a.f330.1d80 RWI:0x2
>         HL3UFlags:0x28 COVERING FIB ADJ Failed 
>         SFT Entry:hdl:0x3C  HwFL:0x4
> [...]

Anything strange for 130.59.36.9? Did you have ARP for it? Was the MAC
programmed correctly? This appears to be an L3 interface, not an SVI,
right?

> I don't know what exactly that means, but the effect was that all
> traffic to destinations reached by the default route (this router
> doesn't do BGP and uses a OSPF default route) was forwarded in
> software.  We're using the "desktop IPv4 and IPv6 default" SDM
> template and none of the TCAMs is full. I finally figured out that

How did you confirm that TCAM was not full?

> However, the other stack member has the same problem.  There would be
> some other arbitrary prefix (sometimes several of them) that got stuck
> in this state, e.g.
> 
> swiCP2#remote command 2 sh platform ip unicast failed route
> Total of 0 covering fib entries
> Entries covered by Actual default route(0.0.0.0/0)
>                   195.176.224.0/19 Tbl:0 : Cover:0.0.0.0/0 Tbl:0
>         Total of 1 entries covered by 0.0.0.0/0 Tbl:0

Is this really a different route, or just an optimization for TCAM?
Could this be just a 'link' in hardware, so that the more specific
prefix really uses the default route?

> switch.  No amount of mucking around with (d)CEF helped.  Anybody seen
> this or has any sort of clue what's going on?

Nope, never. I've had only one issue with 3750 stacking, and I'm running
several 3750 stacks. I also run IPv6, BGP and ACLs heavily, with up to 4
3750s stacked. The issue I'm talking about was broken unknown unicast
flooding for some VLANs; this happened after we had caused a loop in the
L2 network (with the help of Extreme). This was easy to confirm with
'show platform forward ..', which immediately showed that the switch
would not flood an unknown frame. If traffic originated from the other
direction, learning worked. TAC investigated the issue as 602520591,
but couldn't really help me with it. After some toying around
I found out that the problem was fixed by adding and removing unknown
unicast flooding, so perhaps it was a feature that was triggered during
the L2 loop, combined with a bug that never allowed the switch to
remove the protection.

> For example, the MAC address of one of the affected hosts is
> 0003.ba9b.07bb.  The ethernet header of a packet captured with tcpdump
> shows 
> 
> ETHER:  ----- Ether Header -----
> ETHER:
> ETHER:  Packet 78 arrived at 18:33:6.55
> ETHER:  Packet size = 98 bytes
> ETHER:  Destination = 0:6:3:c8:a5:f8,
> ETHER:  Source      = 0:12:d9:ba:40:ca,
> ETHER:  Ethertype = 0800 (IP)
> ETHER:

Odd indeed. Any chance that you made a mistake in capturing or
displaying the capture? 0:60:3e would be Cisco. Do you have any
0060.3ec8.a5f8 in the network?
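
When comparing the capture against switch output it is easy to misread
an address, because the capture drops leading zeros in each octet while
IOS prints dotted groups of four hex digits. A minimal Python sketch for
normalizing the capture notation follows; the function name `to_cisco`
is made up for illustration and is not part of any tool.

```python
def to_cisco(mac: str) -> str:
    """Convert a colon-separated MAC (leading zeros optional) to dotted form."""
    octets = mac.strip().split(":")
    if len(octets) != 6:
        raise ValueError(f"expected 6 octets, got {len(octets)}: {mac!r}")
    values = [int(o, 16) for o in octets]
    if any(v > 0xFF for v in values):
        raise ValueError(f"octet out of range in {mac!r}")
    # Zero-pad each octet to two hex digits, e.g. "6" -> "06".
    digits = "".join(f"{v:02x}" for v in values)
    # Regroup the 12 hex digits as xxxx.xxxx.xxxx.
    return ".".join(digits[i:i + 4] for i in range(0, 12, 4))


# The captured destination and the 0:60:3e variant normalize differently.
print(to_cisco("0:6:3:c8:a5:f8"))
print(to_cisco("0:60:3e:c8:a5:f8"))
```

This makes it obvious that 0:6:3 and 0:60:3e are different prefixes, so
the two addresses should be searched for separately.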

> Where the heck does 0:6:3:c8:a5:f8 come from?  This address doesn't
> exist anywhere in our network.  The arp cache and mac address table on
> the switch (master and slave) are OK

Did you try 'show mac-address-table' or 'show platform mac-address-table'
to locate the 0006.03c8.a5f8?

> swiCP2#sh platform tcam table mac-address | inc BB
> 7      B0090003 BA9B07BB

I believe this command only displays L3 information. Use 'show platform
mac-address-table', not the TCAM table.

> But there's no trace of 0003.ba9b.07bb in the mac TCAM of switch #2
> 
> swiCP2#remote command 2 sh platform tcam table mac-address | inc BB
> Switch : 2 :
> ------------

I guess there isn't a routed port for it in SW2? Do you see it in 'show
platform mac-address-table'?

> swiCP2#remote command 2 sh platform tcam table mac-address | inc 0006
> Switch : 2 :
> ------------
> 2      F00C0006 03C8C818
> 3      C00D0006 03C8C678
> 4      90090006 03C8AFB8
> 6      90090006 03C8B638
> 7      A0090006 03C8A5F8
> 9      F0090006 03C8C338
> 10     A0090006 03C89C38
> 11     A0090006 03C8B978
> 13     A0040006 03C8A2B8
> 14     90040006 03C89758
> 17     A0040006 03C8BCB8
> 18     90090006 03C89F78

Can you see these in 'sh arp'?
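
To compare these against 'sh arp' it may help to turn the dump into
dotted MACs first. The sketch below assumes that the low 48 bits of the
two hex words are the MAC address, which matches the 0003.ba9b.07bb
entry quoted earlier (B0090003 BA9B07BB). The meaning of the leading 16
bits is not known to me, so they are simply dropped. The function name
`macs_from_tcam` is hypothetical.

```python
import re

# An index followed by two 32-bit hex words, e.g. "7      A0090006 03C8A5F8".
ROW = re.compile(r"^\s*(\d+)\s+([0-9A-Fa-f]{8})\s+([0-9A-Fa-f]{8})\s*$")


def macs_from_tcam(text: str) -> dict:
    """Map TCAM index -> dotted MAC, assuming the low 48 bits hold the MAC."""
    result = {}
    for line in text.splitlines():
        # Tolerate mail quoting ("> ") in front of pasted output.
        match = ROW.match(line.lstrip("> "))
        if not match:
            continue  # banner lines such as "Switch : 2 :" are skipped
        word = (match.group(2) + match.group(3)).lower()
        mac = word[4:]  # drop the leading 16 bits (meaning unknown, assumption)
        result[int(match.group(1))] = ".".join(
            mac[i:i + 4] for i in range(0, 12, 4)
        )
    return result


sample = """Switch : 2 :
------------
7      A0090006 03C8A5F8
9      F0090006 03C8C338
"""
for index, mac in macs_from_tcam(sample).items():
    print(index, mac)
```

The resulting addresses can then be searched for in the ARP and MAC
address table output.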

> You sure get a lot of complexity for your money with these boxes!
> BTW, the flash memory in one of these switches failed.  IOS appears to
> be unable to map out a single bad block and since the flash is
> on-board, we had to replace the entire box...

-- 
  ++ytti