[c-nsp] 7204vxr freeze-up question

Adam Greene maillist at webjogger.net
Fri Aug 31 10:05:48 EDT 2007


Rodney,

Thanks. I appreciate the follow-up.

The show int was from g2/0 because it was originally freezing up while the 
card was in that slot. We moved it to g3/0 and it kept on freezing up. I 
took the show controller reading after we had moved it to that slot.
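
For anyone reading the g3/0 capture quoted below, here is a rough sketch of how
the ring registers in it might be decoded. It assumes 16-byte descriptors (which
is at least consistent with RDLEN0=0x800 lining up with rxring(128)) and the
common convention that descriptors from head up to tail belong to the hardware.
I have not confirmed either assumption against the driver code, and the
function name is just something I made up for illustration.

```python
DESCRIPTOR_BYTES = 16  # assumption: 16-byte descriptors, not confirmed from driver source


def ring_state(head: int, tail: int, length_bytes: int) -> dict:
    """Summarise a descriptor ring from its head, tail and length registers."""
    entries = length_bytes // DESCRIPTOR_BYTES
    # Assumed convention: descriptors from head up to (not including) tail
    # are owned by the hardware; head == tail means it owns none.
    hw_owned = (tail - head) % entries
    return {"entries": entries, "head": head, "tail": tail, "hw_owned": hw_owned}


# Register values copied from the 'sh controller' capture in this thread.
rx = ring_state(head=0x38, tail=0x37, length_bytes=0x800)    # RDH0, RDT0, RDLEN0
tx = ring_state(head=0xBA, tail=0xBA, length_bytes=0x1000)   # TDH, TDT, TDLEN
print(rx)
print(tx)
```

With the captured values this gives a 128-entry rx ring with 127 descriptors on
the hardware side and an empty 256-entry tx ring, which agrees with rxring(128),
txring(256), head=56 and head=186/tail=186 in the internal statistics. If the
convention assumed above is right, the rx ring does not look starved of buffers
in the hung state.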

I can consistently trigger this error by performing the load test for about 
90 seconds or longer (enough time for traffic to build up to pretty high 
levels).
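
Since the hang is reproducible, the before/after 'sh controller' captures
Rodney asked for can be compared mechanically. Below is a minimal sketch of
that, not a finished tool: it only picks up the name=value style counters (not
the two-column FX1000 Statistics table), the function names are my own, and the
"before" values are made up purely for illustration because I do not have a
pre-hang capture yet.

```python
import re

# Matches pairs such as "rx_count=128" or "CTRL  =0x18180005".
PAIR = re.compile(r"(\w+)\s*=\s*(0x[0-9A-Fa-f]+|\d+)")


def parse_counters(text: str) -> dict:
    """Collect name=value pairs from one capture; a repeated name keeps its last value."""
    return {name: value for name, value in PAIR.findall(text)}


def changed_counters(before: str, after: str) -> dict:
    """Return {name: (old, new)} for every counter present in both captures that differs."""
    old, new = parse_counters(before), parse_counters(after)
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}


# HYPOTHETICAL "before" values, invented for illustration only.
before = """
  rx_count=128, throttled=0, enabled=0, disabled=0
  tx_carrier_loss=0, fatal_tx_err=0, tx_stucks_count=0
  rx_stucks_count=0, rdtr_fpd=3
"""
# "after" lines copied from the capture taken during the outage.
after = """
  rx_count=128, throttled=1, enabled=1, disabled=1
  tx_carrier_loss=1, fatal_tx_err=0, tx_stucks_count=1
  rx_stucks_count=2, rdtr_fpd=3
"""

for name, (old, new) in changed_counters(before, after).items():
    print(f"{name}: {old} -> {new}")
```

Values are compared as strings on purpose, so hex registers and decimal
counters are treated the same way and nothing is reinterpreted.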

I have not been able to upgrade the router to 12.4(16) yet because I don't 
have a large enough flash card for the image. I still think that will be a 
good test.

Unfortunately, the router is in production, which limits my ability to 
perform testing. However, since the router was recently acquired, I have 
gotten a replacement 7204VXR from the vendor and will do some tests with 
that to see if the problem reproduces there.

As soon as I have more results to share, I will.

Best regards,
Adam


----- Original Message ----- 
From: "Rodney Dunn" <rodunn at cisco.com>
To: "Adam Greene" <maillist at webjogger.net>
Cc: <cisco-nsp at puck.nether.net>
Sent: Friday, August 31, 2007 9:23 AM
Subject: Re: [c-nsp] 7204vxr freeze-up question


> You did a sh controller for 3/0 but your 'sh int' was from
> 2/0.
>
> It's hard to know all those controller counters without going
> and looking at the code for that driver.
>
> But, suffice to say that the interface should never lock
> up and have to be bounced to forward traffic or receive traffic.
> If it does it's a bug.
>
> Now, we also have to make sure the bounce isn't causing some
> other device to clear rather than this one.
>
> I'd suggest capturing a 'sh controller' a couple of times before
> and then again after we think it's "hung". Capture it multiple times
> after.
>
> Was this in a lab?
> Can you trigger it every time?
>
> Rodney
>
>
>
> On Wed, Aug 22, 2007 at 02:43:56PM -0400, Adam Greene wrote:
>> Here's output from a "sh controller" during the outage state:
>>
>> Interface GigabitEthernet3/0(idb 0x6363B6DC)
>> Hardware is WISEMAN 2.1, network connection mode is auto
>>   network link is up
>>   loopback type is none
>>   startup time: 176602 usec
>>   GBIC type is 1000BaseSX
>>   idb->lc_ip_turbo_fs=0x606372F4, ip_routecache=0x11(dfs=0/mdfs=0), max_mtu=1528
>>   fx1000_ds(tx)=0x6363CE6C(0x6363CE6C), registers(tx)=0x3D800000(0x3D800000), curr_intr=0
>>   rx cache size=2000, rx cache end=1872, rx_nobuffer=0
>>  FX1000 registers:
>>   CTRL  =0x18180005, STATUS=0x0000000F
>>   FCAL  =0x00C28001, FCAH  =0x00000100, FCT   =0x00008808, FCTTV =0x000016E3
>>   RCTL  =0x00428032, RDBAL0=0x2000B000, RDBAH0=0x00000000, RDLEN0=0x00000800
>>   RDH0  =0x00000038, RDT0  =0x00000037, RDTR0 =0x00000000, IMS   =0x000002D6
>>   TCTL  =0x000400FA, TIPG  =0x00A0080A, TQC   =0x00000000, TDBAL =0x2000C000
>>   TDBAH =0x00000000, TDLEN =0x00001000, TDH   =0x000000BA, TDT   =0x000000BA
>>   TXCW  =0xC00001A0, RXCW  =0xCC0041A0, FCRTL =0x80001200, FCRTH =0x0000AFF0
>>   RDFH  =0x000014D7, RDFT  =0x000014D7, TDFH  =0x000003A7, TDFT  =0x000003A7
>>   RX=normal, enabled  TX=normal, enabled
>>   Device status=full-duplex, link up, tx clock, rx clock
>>   AN status=done(RF:0 , PAUSE:3 ), SYNC'ed, rx idle stream, rx invalid symbols, rx idle char
>>  GBIC registers:
>>   Register 0x00:   01  07  01  00  00  00  01  00
>>   Register 0x08:   00  00  00  01  0D  00  00  00
>>   Register 0x10:   32  16  00  00  41  47  49  4C
>>   Register 0x18:   45  4E  54  20  20  20  20  20
>>   Register 0x20:   20  20  20  20  00  00  00  00
>>   Register 0x28:   51  46  42  52  2D  35  36  38
>>   Register 0x30:   39  20  20  20  20  20  20  20
>>   Register 0x38:   30  30  30  30  00  00  00  58
>>   Register 0x40:   00  1A  00  00  30  31  31  30
>>   Register 0x48:   31  36  30  38  32  36  34  31
>>   Register 0x50:   38  36  34  35  30  31  31  30
>>   Register 0x58:   31  36  30  30  00  00  00  D8
>>   PartNumber: QFBR-5689
>>   PartRev: F
>>   SerialNo: 0110160826418645
>>   Options:  0
>>   Length(9um/50um/62.5um): 000/500/220
>>   Date Code: 01101600
>>   Gigabit Ethernet Codes:  1
>>  PCI configuration registers:
>>   bus_no=6, device_no=0
>>   DeviceID=0x1000, VendorID=0x8086, Command=0x0116, Status=0x0200
>>   Class=0x02/0x00/0x00, Revision=0x03, LatencyTimer=0xFC, CacheLineSize=0x10
>>   BaseAddr0=0x49000004, BaseAddr1=0x00000000, MaxLat=0x00, MinGnt=0xFF
>>   SubsysDeviceID=0x1000, SubsysVendorID=0x8086
>>   Cap_Ptr=0x00000000  Retry/TRDY Timeout=0x00000000
>>   PMC=0x00210001  PMCSR=0x00000000
>>  Software MAC address filter(hash:length/addr/mask/hits):
>>  need_af_check = 0
>>   0x00:  0  ffff.ffff.ffff  0000.0000.0000         0
>>   0xC0:  0  0100.0ccc.cccc  0000.0000.0000         0
>>   0xD0:  0  0007.8420.e854  0000.0000.0000         0
>>  FX1000(type=0x98) Internal Statistics:
>>   rxring(128)=0x2000B000, shadow=0x6363D310, head=56, rx_buf_size=512
>>   txring(256)=0x2000C000, shadow=0x6363D53C, head=186, tail=186
>>   tx_int_txdw=0, tx_int_txqe=0, rx_int_rxdmt0=0, rx_int_rxt0=0
>>   tx_count=0, txring_full=0, rx_max=0, filtered_pak=0
>>   rx_overrun=0, rx_seq=0, reg_read=0, reg_write=0
>>   rx_count=128, throttled=1, enabled=1, disabled=1
>>   rx_no_enp=0, rx_discard=0, link_reset=0, pci_rev=3
>>   tbl_overflow=0, chip_state=2, tx_nonint_done=0, tx_limited=0
>>   reset=5(init=0, check=0, restart=4, pci=0), auto_restart=1
>>   tx_carrier_loss=1, fatal_tx_err=0, tx_stucks_count=1
>>   isl_err=0, wait_for_last_tdt=0, ctrl=18800005, ctrl0=18900005
>>   rx_stucks_count=2, rdtr_fpd=3
>>  HW addr filter: 0x6363DD68, ISL disabled, Promiscuous mode multicast
>>   Entry= 0:  Addr=0007.8420.E854
>>   Entry= 1:  Addr=0000.0000.0000
>>   Entry= 2:  Addr=0000.0000.0000
>>   Entry= 3:  Addr=0000.0000.0000
>>   Entry= 4:  Addr=0000.0000.0000
>>   Entry= 5:  Addr=0000.0000.0000
>>   Entry= 6:  Addr=0000.0000.0000
>>   Entry= 7:  Addr=0000.0000.0000
>>   Entry= 8:  Addr=0000.0000.0000
>>   Entry= 9:  Addr=0000.0000.0000
>>   Entry=10:  Addr=0000.0000.0000
>>   Entry=11:  Addr=0000.0000.0000
>>   Entry=12:  Addr=0000.0000.0000
>>   Entry=13:  Addr=0000.0000.0000
>>   Entry=14:  Addr=0000.0000.0000
>>   Entry=15:  Addr=0000.0000.0000
>> FX1000 Statistics (PA3)
>>   CRC error        0             Symbol error     0
>>   Missed Packets   0             Single Collision 0
>>   Excessive Coll   0             Multiple Coll    0
>>   Late Coll        0             Collision        0
>>   Defer            497           Receive Length   0
>>   Sequence Error   0             XON RX           0
>>   XON TX           0             XOFF RX          0
>>   XOFF TX          0             FC RX Unsupport  0
>>   Packet RX (64)   52            Packet RX (127)  289
>>   Packet RX (255)  0             Packet RX (511)  5
>>   Packet RX (1023) 0             Packet RX (1522) 433425
>>   Good Packet RX   949328        Broadcast RX     46180
>>   Multicast RX     32953         Good Packet TX   0
>>   Good Octets RX.H 0             Good Octets RX.L 657160659
>>   Good Octets TX.H 0             Good Octets TX.L 334817282
>>   RX No Buff       0             RX Undersize     0
>>   RX Fragment      0             RX Oversize      0
>>   RX Octets High   0             RX Octets Low    657160659
>>   TX Octets High   0             TX Octets Low    334817282
>>   TX Packet        237515        RX Packet        433771
>>   TX Broadcast     18            TX Multicast     1
>>   Packet TX (64)   31            Packet TX (127)  18042
>>   Packet TX (255)  20            Packet TX (511)  34
>>   Packet TX (1023) 5             Packet TX (1522) 219383
>>
>>
>>
>>
>>
>> ----- Original Message ----- 
>> From: "Adam Greene" <maillist at webjogger.net>
>> To: <cisco-nsp at puck.nether.net>
>> Sent: Wednesday, August 22, 2007 2:14 PM
>> Subject: Re: [c-nsp] 7204vxr freeze-up question
>>
>>
>> > Thanks, Chuck and Rodney.
>> >
>> > CEF is enabled. I'm sending about 95Mbps of 1470-byte UDP packets to the
>> > PA-GE interface, and the dueling gateways test then tries to push that
>> > traffic even higher. The router is connected to a radio that can only do
>> > 100Mbps, so there's no chance of traffic exceeding 100Mbps in either
>> > direction.
>> >
>> > I'll try 12.4(16) and see what happens.
>> >
>> > Here's the 'show int' during the outage condition. Particularly worrisome
>> > to me are the 18 interface resets, all caused by the test. There are also
>> > a lot of output drops (which is understandable), 1 no buffer and 1
>> > throttle:
>> >
>> > r2#sh int g2/0
>> > GigabitEthernet2/0 is up, line protocol is up
>> >  Hardware is WISEMAN, address is 0007.8420.e838 (bia 0007.8420.e838)
>> >  Description: *** Wireless Network Mgmt VLAN ***
>> >  MTU 1500 bytes, BW 1000000 Kbit, DLY 10 usec,
>> >     reliability 255/255, txload 15/255, rxload 24/255
>> >  Encapsulation 802.1Q Virtual LAN, Vlan ID  1., loopback not set
>> >  Keepalive set (10 sec)
>> >  Unknown duplex, Unknown Speed, link type is autonegotiation, media type is SX
>> >  output flow-control is on, input flow-control is on
>> >  ARP type: ARPA, ARP Timeout 04:00:00
>> >  Last input 00:00:00, output 00:00:00, output hang never
>> >  Last clearing of "show interface" counters never
>> >  Input queue: 11/75/0/0 (size/max/drops/flushes); Total output drops: 1173852
>> >  Queueing strategy: fifo
>> >  Output queue: 0/40 (size/max)
>> >  30 second input rate 96026000 bits/sec, 7955 packets/sec
>> >  30 second output rate 59502000 bits/sec, 5143 packets/sec
>> >     29510670 packets input, 2249605461 bytes, 1 no buffer
>> >     Received 1686266 broadcasts, 0 runts, 0 giants, 1 throttles
>> >     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
>> >     0 watchdog, 1513394 multicast, 611 pause input
>> >     0 input packets with dribble condition detected
>> >     30538392 packets output, 3381947542 bytes, 0 underruns
>> >     0 output errors, 0 collisions, 18 interface resets
>> >     0 babbles, 0 late collision, 0 deferred
>> >     3 lost carrier, 0 no carrier, 2 pause output
>> >     0 output buffer failures, 0 output buffers swapped out
>> >
>> >
>> > I don't have a 'show controller' during the outage condition. I'll try to
>> > obtain it.
>> >
>> > I agree, before we swap the router, we should do some more normal tests
>> > and see if the problem persists even then at 80Mbps in / 40Mbps out levels.
>> >
>> > Thanks,
>> > Adam
>> >
>> >
>> >
>> >
>> >
>> > ----- Original Message ----- 
>> > From: "Rodney Dunn" <rodunn at cisco.com>
>> > To: "Adam Greene" <maillist at webjogger.net>
>> > Cc: <cisco-nsp at puck.nether.net>
>> > Sent: Wednesday, August 22, 2007 12:13 PM
>> > Subject: Re: [c-nsp] 7204vxr freeze-up question
>> >
>> >
>> >> Can you get it in that condition and get a 'sh controller' and
>> >> 'show int'?
>> >>
>> >> It sounds like the ingress rx driver is locking up.
>> >>
>> >> Try the latest 12.4 mainline code (12.4(16)) if you have it in the
>> >> lab and see if it's there too.
>> >>
>> >> Rodney
>> >>
>> >> On Wed, Aug 22, 2007 at 11:45:29AM -0400, Adam Greene wrote:
>> >>> Hmm.
>> >>>
>> >>> Upgraded router to 12.3(23).
>> >>>
>> >>> Even after the upgrade, passing 82.3Mbps in / 42.5Mbps out over the
>> >>> 7204VXR's PA-GE interface (plus 1000BaseSX GBIC) causes the interface to
>> >>> stop passing traffic.
>> >>>
>> >>> Reseating the GBIC does not rectify the issue. However, reseating the
>> >>> PA-GE card does.
>> >>>
>> >>> Tried moving the PA-GE card from slot 2 to slot 3 (different PCI bus)
>> >>> and the problem still occurs.
>> >>>
>> >>> Tried with a different PA-GE card. Problem still occurs.
>> >>>
>> >>> I'll try with another GBIC, but that seems unlikely to resolve the
>> >>> issue.
>> >>>
>> >>> It's sounding like I may need to replace this router. Ugh.
>> >>>
>> >>> If anyone has any bright ideas, they are welcome.
>> >>>
>> >>> Thanks,
>> >>> Adam
>> >>>
>> >>>
>> >>> ----- Original Message ----- 
>> >>> From: "Adam Greene" <maillist at webjogger.net>
>> >>> To: "Masood Ahmad Shah" <masood at nexlinx.net.pk>;
>> >>> <cisco-nsp at puck.nether.net>
>> >>> Sent: Friday, August 17, 2007 10:22 AM
>> >>> Subject: Re: [c-nsp] 7204vxr freeze-up question
>> >>>
>> >>>
>> >>> > Masood,
>> >>> >
>> >>> > Thanks for the advice. Current IOS is 12.2(13)T16. We'll look into
>> >>> > upgrading it. I'll have to see what will support the NPE300; we're
>> >>> > running very few features, though, so I don't expect to have an issue...
>> >>> >
>> >>> > The GBIC is plugged into a Bridgewave radio; power cycling the radio
>> >>> > does not resolve the issue, only cycling the router does, so I think
>> >>> > the issue is on the router end. But we'll keep in mind the suggestion.
>> >>> >
>> >>> > Thanks again,
>> >>> > Adam
>> >>> >
>> >>> > ----- Original Message ----- 
>> >>> > From: "Masood Ahmad Shah" <masood at nexlinx.net.pk>
>> >>> > To: "'Adam Greene'" <maillist at webjogger.net>;
>> >>> > <cisco-nsp at puck.nether.net>
>> >>> > Sent: Wednesday, August 15, 2007 9:19 PM
>> >>> > Subject: RE: [c-nsp] 7204vxr freeze-up question
>> >>> >
>> >>> >
>> >>> >> Well, which IOS version are you running?
>> >>> >>
>> >>> >> I know there are some issues with Intel chipsets when they are
>> >>> >> connected to a Cisco GBIC. I strongly suggest updating the NIC driver
>> >>> >> (if there is one), upgrading IOS, or changing your NIC to check it out...
>> >>> >>
>> >>> >>
>> >>> >> Regards,
>> >>> >> Masood Ahmad Shah
>> >>> >>
>> >>> >> -----Original Message-----
>> >>> >> From: cisco-nsp-bounces at puck.nether.net
>> >>> >> [mailto:cisco-nsp-bounces at puck.nether.net] On Behalf Of Adam Greene
>> >>> >> Sent: Wednesday, August 15, 2007 8:43 PM
>> >>> >> To: cisco-nsp at puck.nether.net
>> >>> >> Subject: [c-nsp] 7204vxr freeze-up question
>> >>> >>
>> >>> >> Hi,
>> >>> >>
>> >>> >> I'm running into an issue with a 7204VXR/NPE-300 router with 128MB
>> >>> >> RAM.
>> >>> >>
>> >>> >> A 1000Base-SX GBIC is plugged into one of the slots (not sure of the
>> >>> >> part # of the card into which the GBIC plugs).
>> >>> >>
>> >>> >> We were running some dueling gateways speed tests with the router
>> >>> >> (packet stream is sent via iPerf to router A, which forwards it to
>> >>> >> router B, which forwards it back to router A, which forwards it back
>> >>> >> to router B, until TTL is decremented to 0).
>> >>> >>
>> >>> >> Soon after I start sending 75Mbps - 80Mbps of traffic to the router's
>> >>> >> gig interface via iPerf, the gig interface stops sending / receiving
>> >>> >> any traffic whatsoever. The CLI of the router remains up, the gig
>> >>> >> interface reports it is up / up, memory and cpu utilization remain
>> >>> >> low. No logs are generated. Traffic on other interfaces is unaffected.
>> >>> >> I shut / no shut the gigabit interface, but traffic still refuses to
>> >>> >> pass. Only a reload of the router rectifies the issue.
>> >>> >>
>> >>> >> I wonder if there is a debug command that could provide some insight
>> >>> >> into the problem. At this point I am suspecting a hardware issue
>> >>> >> (GBIC, card, or backplane).
>> >>> >>
>> >>> >> Thanks for any insights ....
>> >>> >>
>> >>> >> Adam
>> >>> >>
>> >>> >> _______________________________________________
>> >>> >> cisco-nsp mailing list  cisco-nsp at puck.nether.net
>> >>> >> https://puck.nether.net/mailman/listinfo/cisco-nsp
>> >>> >> archive at http://puck.nether.net/pipermail/cisco-nsp/