[c-nsp] Sup720 software forwarding

Peter Rathlev peter at rathlev.dk
Fri Mar 8 05:04:12 EST 2013


On Fri, 2013-03-08 at 09:08 +0000, Phil Mayers wrote:
> On 03/08/2013 07:46 AM, Peter Rathlev wrote:
> > Theoretically, if one would happen to have a Sup720 that does software
> > forwarding, how is it that one can check what the reason for punts is?
> 
> An excellent question. In the past, TAC have found stuff like this with 
> ELAM captures, but it's never been obvious to me how they inferred the 
> "cause" from the ELAM output; I assume they have a cheat sheet somewhere 
> with a bunch of hex values on it!

Yeah, I was hoping someone knew exactly what TAC does. :-) We've planned
a reload in 30 minutes and haven't had the time to involve our Cisco
partner (we're on shared support).

> Typically a linecard or chassis reboot is the only way to clear them; no 
> amount of "clear blah" will help, which is extraordinarily irritating...

We contemplated "clear cef table", but since we'd like to upgrade anyway
we're going with the reload.

> > ------- dump of incoming inband packet -------
> > interface Vl212, routine mistral_process_rx_packet_inlin, timestamp 00:00:33
> > dbus info: src_vlan 0xD4(212), src_indx 0x142(322), len 0x47(71)
> >    bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
> >    3C020400 00D40400 01420000 47080000 00060050 860FFF7C 00000000 03800000
> > mistral hdr: req_token 0x0(0), src_index 0x142(322), rx_offset 0x76(118)
> >    requeue 0, obl_pkt 0, vlan 0xD4(212)
> > destmac 00.00.0C.07.AC.02, srcmac 00.50.56.8A.49.15, protocol 0800
> > protocol ip: version 0x04, hlen 0x05, tos 0x98, totlen 53, identifier 6448
> >    df 1, mf 0, fo 0, ttl 128, src 10.83.12.125, dst 10.83.3.16
> >      tcp src 57615, dst 1531, seq 3896459602, ack 2194696301, win 509 off 5 checksum 0x849A ack psh
> 
> I assume Vl212 is unremarkable in configuration? And that 10.83.3.16 
> doesn't point back out of Vl212?

All interfaces are bog standard (to us) and similar to what we use on
other devices without the problem.

  interface Vlan212
   description AAR Wireless
   vrf forwarding RM03318
   bandwidth 2000000
   ip address 10.83.12.2 255.255.255.0
   ip helper-address 10.85.0.11
   ip helper-address 10.83.0.11
   no ip redirects
   no ip proxy-arp
   ip flow ingress
   ntp disable
   standby 2 ip 10.83.12.1
   standby 2 timers 1 3
   standby 2 priority 140
   standby 2 preempt delay minimum 20 reload 300
   standby 2 authentication *
   standby 2 track 1 decrement 50
   standby 2 track 5 decrement 50
   hold-queue 256 in
  end
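As a sanity check on the dumps: the destination MAC 00.00.0C.07.AC.02
fits the HSRP version 1 virtual MAC pattern 0000.0c07.acXX, where XX is
the group number in hex, so it lines up with "standby 2" above. A
minimal Python sketch of that check follows; the function name
hsrp_v1_group is made up for illustration, and it assumes HSRPv1 (no
"standby version 2" is configured on this interface).

```python
import re

# HSRPv1 virtual MACs are 0000.0c07.acXX, XX being the group number in hex.
HSRP_V1_PREFIX = "00000c07ac"


def hsrp_v1_group(mac):
    """Return the HSRPv1 group number encoded in mac, or None if it is
    not an HSRPv1 virtual MAC. Accepts '.', ':' or '-' as separators."""
    digits = re.sub(r"[.:\-]", "", mac).lower()
    if not re.fullmatch(r"[0-9a-f]{12}", digits):
        return None
    if not digits.startswith(HSRP_V1_PREFIX):
        return None
    return int(digits[10:], 16)


if __name__ == "__main__":
    print(hsrp_v1_group("00.00.0C.07.AC.02"))  # destmac from the dump
    print(hsrp_v1_group("00.50.56.8A.49.15"))  # srcmac from the dump
```

So the punted frames are addressed to the virtual gateway MAC, i.e. they
are ordinary traffic to be routed.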

The destination (10.83.3.16) is in another VRF and is forwarded towards
a firewall (via MPLS). Other packets staying within the same VRF are
also affected:

  interface Vl222, routine mistral_process_rx_packet_inlin, timestamp 00:36:12
  dbus info: src_vlan 0xDE(222), src_indx 0x142(322), len 0x5EE(1518)
    bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
    84020400 00DE0400 01420005 EE080000 00060050 860FFF7C 00000000 03800000 
  mistral hdr: req_token 0x0(0), src_index 0x142(322), rx_offset 0x76(118)
    requeue 0, obl_pkt 0, vlan 0xDE(222)
  destmac 00.00.0C.07.AC.02, srcmac 00.50.56.8A.00.86, protocol 0800
  protocol ip: version 0x04, hlen 0x05, tos 0x98, totlen 1500, identifier 2360
    df 1, mf 0, fo 0, ttl 128, src 10.83.22.7, dst 10.32.250.140
      tcp src 445, dst 49159, seq 991382863, ack 3459092047, win 63205 off 5 checksum 0xB698 ack
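To see what several punted packets have in common, the decoded fields
can be pulled out of the dumps and compared. The following is only a
rough Python sketch: it assumes the dump layout is exactly as pasted in
this thread, and the names parse_punts and common_fields are
hypothetical helpers, not part of any Cisco tool.

```python
import re

# Fields printed as "name 0xHEX(DEC)", e.g. "src_vlan 0xD4(212)".
HEX_FIELD = re.compile(r"(\w+) 0x[0-9A-Fa-f]+\((\d+)\)")
IP_PAIR = re.compile(r"src (\d+\.\d+\.\d+\.\d+), dst (\d+\.\d+\.\d+\.\d+)")
TOTLEN = re.compile(r"totlen (\d+)")
START = re.compile(r"interface (\S+), routine")

# The two dumps from this thread, trimmed to the lines that are parsed.
SAMPLE = """\
interface Vl212, routine mistral_process_rx_packet_inlin, timestamp 00:00:33
dbus info: src_vlan 0xD4(212), src_indx 0x142(322), len 0x47(71)
  bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
protocol ip: version 0x04, hlen 0x05, tos 0x98, totlen 53, identifier 6448
  df 1, mf 0, fo 0, ttl 128, src 10.83.12.125, dst 10.83.3.16
interface Vl222, routine mistral_process_rx_packet_inlin, timestamp 00:36:12
dbus info: src_vlan 0xDE(222), src_indx 0x142(322), len 0x5EE(1518)
  bpdu 0, index_dir 0, flood 0, dont_lrn 0, dest_indx 0x380(896)
protocol ip: version 0x04, hlen 0x05, tos 0x98, totlen 1500, identifier 2360
  df 1, mf 0, fo 0, ttl 128, src 10.83.22.7, dst 10.32.250.140
"""


def parse_punts(text):
    """Return one dict of decoded fields per dumped packet."""
    packets, current = [], None
    for raw in text.splitlines():
        line = raw.lstrip("> ")  # tolerate indentation and mail quoting
        start = START.search(line)
        if start:
            current = {"interface": start.group(1)}
            packets.append(current)
        elif current is not None:
            if line.startswith(("dbus info:", "bpdu ")):
                for name, dec in HEX_FIELD.findall(line):
                    current[name] = int(dec)
            elif line.startswith("protocol ip:"):
                m = TOTLEN.search(line)
                if m:
                    current["totlen"] = int(m.group(1))
            elif line.startswith("df "):
                m = IP_PAIR.search(line)
                if m:
                    current["src"], current["dst"] = m.groups()
    return packets


def common_fields(packets):
    """Return the fields whose value is identical in every packet."""
    if not packets:
        return {}
    first = packets[0]
    return {k: v for k, v in first.items()
            if all(p.get(k) == v for p in packets)}


if __name__ == "__main__":
    print(common_fields(parse_punts(SAMPLE)))
```

For the two packets shown here, the only shared fields are src_indx 322
and dest_indx 896; VLAN, length and addresses all differ. What those two
indices map to on this box is not something the dump itself tells us.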

All CEF adjacencies seem sane AFAICT.

  Swouter#show ip cef vrf RM03100 adjacency Gi4/22.997 10.32.250.140 internal 
  IPv4 CEF is enabled for distributed and running
  VRF RM03100:
   3707 prefixes (3700/7 fwd/non-fwd)
   Default network 0.0.0.0/0
   Table id 1
   Database epoch:        2 (3707 entries at this epoch)
  
  10.32.250.140/32, epoch 2, flags attached, refcount 4, per-destination sharing
    sources: Adj 
    feature space:
     NetFlow: Origin AS 0, Peer AS 0, Mask Bits 24
    subblocks:
     Adj source: IP adj out of GigabitEthernet4/22.997, addr 10.32.250.140 54DB72C0
      Dependent covered prefix type adjfib cover 10.32.250.0/24
    ifnums:
     GigabitEthernet4/22.997(1384): 10.32.250.140
    path 52D8A730, path list 54CA603C, share 1/1, type adjacency prefix, for IPv4
    attached to GigabitEthernet4/22.997, adjacency IP adj out of GigabitEthernet4/22.997, addr 10.32.250.140 54DB72C0
    output chain: IP adj out of GigabitEthernet4/22.997, addr 10.32.250.140 54DB72C0

> Are they from a variety of input interfaces? Can you see anything 
> common about the next-hops for things which are punting, versus things 
> that aren't punting?

It seems (though I'm not 100% certain) that only traffic forwarded via
the PFC is affected. We have a single 6708 card in the device and
transit traffic only touching this card doesn't seem to be affected.
To my eyes this points to a hardware programming error on the PFC.

-- 
Peter



