[c-nsp] Cisco 7204VXR with NPE-G1 high CPU and output drops
Yap Chin Hoong -
yapchinhoong at hotmail.com
Thu Apr 22 10:11:59 EDT 2010
Hi Youssef,
Try to capture the output of 'show proc cpu sorted' while the CPU is high, so we can see whether the load is caused by interrupts or by processes. Would you be able to open a Cisco TAC case and ask the TAC engineer to run a CPU profile for this?
regards,
YapCH
http://itcertguides.blogspot.com/
Date: Thu, 22 Apr 2010 15:49:33 +0200
Subject: Re: [c-nsp] Cisco 7204VXR with NPE-G1 high CPU and output drops
From: youssef at 720.fr
To: yapchinhoong at hotmail.com
CC: cisco-nsp at puck.nether.net
Hello,
Really, nothing seems related to the CPU:
LNS1.IX1#sh processes cpu history
2222222223333322222444444444455555333332222222222222222222
6666666666666699999444449999911111666666666688888888888888
100
90
80
70
60
50 **********
40 ***** ********************
30 **********************************************************
20 **********************************************************
10 **********************************************************
0....5....1....1....2....2....3....3....4....4....5....5....
0 5 0 5 0 5 0 5 0 5
CPU% per second (last 60 seconds)
4434344555655555455455555454555555555555555555555555565555
8253536886175400933900221909246456366223331226663324503316
100
90
80
70
60 ****** * ** ** *** ** *
50 * **###########*####################################*#
40 *******#################################################*#
30 ##########################################################
20 ##########################################################
10 ##########################################################
0....5....1....1....2....2....3....3....4....4....5....5....
0 5 0 5 0 5 0 5 0 5
CPU% per minute (last 60 minutes)
* = maximum CPU% # = average CPU%
6567544433244568899987786666543422242347557976575555443422242335455655
1917692522895458476603000342305575449131462755826358236305308599538984
100 *** *
90 * *** *
80 * ****** * **
70 * ***###**** * **** * *
60 ***** *#######****** * ******** ** * ***
50 ####** * ***##############* * ************* *******
40 ######** **################*** * *######*********** * ***####**
30 ########****#################***** ***##############**** * ***########
20 #################################*######################***###########
10 ######################################################################
0....5....1....1....2....2....3....3....4....4....5....5....6....6....7.
0 5 0 5 0 5 0 5 0 5 0 5 0
CPU% per hour (last 72 hours)
* = maximum CPU% # = average CPU%
LNS1.IX1#sh processes cpu sorted | e 0.00
CPU utilization for five seconds: 32%/28%; one minute: 37%; five minutes: 36%
PID Runtime(ms) Invoked uSecs 5Sec 1Min 5Min TTY Process
167 148648548 669722691 221 1.11% 0.94% 0.71% 0 LFDp Input Proc
92 84501540 658096034 128 0.55% 0.48% 0.38% 0 IP Input
158 29882772 19491105 1533 0.47% 0.50% 0.50% 0 CEF: IPv4 proces
44 21898608 2506816 8735 0.31% 0.16% 0.17% 0 Net Background
163 181120 3169702554 0 0.23% 0.15% 0.16% 0 HQF Input Shaper
230 13727516 2560015 5362 0.23% 0.19% 0.18% 0 Compute load avg
238 4050992 17914481 226 0.15% 0.04% 0.05% 0 L2TP mgmt daemon
237 3963624 303695077 13 0.15% 0.14% 0.15% 0 L2X Data Daemon
260 1429340 404591201 3 0.15% 0.08% 0.08% 0 PPP Events
162 210240 3169697698 0 0.15% 0.28% 0.27% 0 HQF Shaper Backg
234 10364 4211 2461 0.07% 0.08% 0.03% 2 Virtual Exec
84 61120 398983803 0 0.07% 0.03% 0.02% 0 ACCT Periodic Pr
254 126412 399597741 0 0.07% 0.06% 0.07% 0 PPP manager
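For reference, in the summary line above the first figure in `32%/28%` is the total CPU over the last five seconds and the second is the share spent at interrupt level (packet switching); the difference is process-level load. A minimal Python sketch of that split follows. The helper name `split_cpu` is made up for illustration and is not part of any Cisco tooling.

```python
import re

def split_cpu(summary_line):
    """Return (total, interrupt, process) percentages from the
    five-second figures of a 'show processes cpu' summary line."""
    match = re.search(r"five seconds:\s*(\d+)%/(\d+)%", summary_line)
    if match is None:
        raise ValueError("not a 'show processes cpu' summary line")
    total = int(match.group(1))
    interrupt = int(match.group(2))
    # Process-level load is whatever is left after interrupt-level work.
    return total, interrupt, total - interrupt

line = ("CPU utilization for five seconds: 32%/28%; "
        "one minute: 37%; five minutes: 36%")
total, interrupt, process = split_cpu(line)
print(f"total={total}% interrupt={interrupt}% process={process}%")
```

With the line pasted above, 28 of the 32 points are interrupt-level and only about 4 are process-level, which is why no single process stands out in the sorted list.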
I have been digging around in the archives, and it looks like I am a victim of microburst traffic:
http://puck.nether.net/pipermail/cisco-nsp/2009-April/060158.html
What can one do about it, except buy bigger / faster boxes with no guarantee that it would work?
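To put the counters from the `show interfaces gi0/2` output quoted further down in perspective, here is a rough Python calculation of the drop rate and the average load over the 00:22:35 since the counters were cleared. It is only a sketch: it assumes that dropped packets are not included in the "packets output" counter, and the variable names are mine.

```python
def hms_to_seconds(hms):
    """Convert an 'HH:MM:SS' string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# Values copied from the quoted 'show interfaces gi0/2' output.
elapsed = hms_to_seconds("00:22:35")   # time since counters were cleared
output_drops = 2712                    # "Total output drops"
packets_out = 27456129                 # "packets output"
out_bps = 115013000                    # 5 minute output rate, bits/sec
out_pps = 20137                        # 5 minute output rate, packets/sec
link_bps = 1_000_000_000               # BW 1000000 Kbit

drops_per_second = output_drops / elapsed
# Assumption: drops are counted separately from transmitted packets.
drop_ratio = output_drops / (output_drops + packets_out)
average_utilization = out_bps / link_bps
average_packet_bytes = out_bps / 8 / out_pps

print(f"drops/s:      {drops_per_second:.2f}")
print(f"drop ratio:   {drop_ratio:.4%}")
print(f"avg tx load:  {average_utilization:.1%}")
print(f"avg pkt size: {average_packet_bytes:.0f} bytes")
```

The averages come out low (roughly two drops per second on a link that is a little over a tenth loaded), which fits short bursts filling the output queue rather than sustained congestion, although averages alone cannot prove it.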
Regards.
Y.
2010/4/22 Yap Chin Hoong - <yapchinhoong at hotmail.com>
Hi Youssef, kindly provide the output of the 'show proc cpu sorted' and 'show proc cpu history'. Thanks.
regards,
YapCH
http://itcertguides.blogspot.com/
> Date: Thu, 22 Apr 2010 15:06:11 +0200
> From: Youssef Bengelloun-Zahr <youssef at 720.fr>
> To: cisco-nsp at puck.nether.net
> Subject: [c-nsp] Cisco 7204VXR with NPE-G1 high CPU and output drops
> Message-ID:
> <x2wcd86f9451004220606rd835ed28w86654612b4c76d69 at mail.gmail.com>
> Content-Type: text/plain; charset=windows-1252
>
> Hello community,
>
> I have a Cisco 7204VXR router with an NPE-G1 that has been acting weird for
> the past 24 hours. This router is dual-attached to two 6k5 routers using
> multimode fiber and SX GBICs.
>
> The router is used as an LNS termination point for PPPoVPDN sessions; we
> have a bunch of them.
>
> Here is a show ver output :
>
> LNS1.IX1#sh version
> Cisco IOS Software, 7200 Software (C7200-ADVENTERPRISEK9-M), Version
> 12.2(33)SRD, RELEASE SOFTWARE (fc2)
> Technical Support: http://www.cisco.com/techsupport
> Copyright (c) 1986-2008 by Cisco Systems, Inc.
> Compiled Thu 23-Oct-08 12:58 by prod_rel_team
>
> ROM: System Bootstrap, Version 12.3(4r)T1, RELEASE SOFTWARE (fc1)
> BOOTLDR: 7200 Software (C7200-KBOOT-M), Version 12.3(5a), RELEASE SOFTWARE
> (fc1)
>
> LNS1.IX1 uptime is 21 weeks, 23 hours, 22 minutes
> System returned to ROM by reload at 13:36:53 UTC Wed Nov 25 2009
> System restarted at 13:39:45 UTC Wed Nov 25 2009
> System image file is "disk0:c7200-adventerprisek9-mz.122-33.SRD.bin"
> Last reload type: Normal Reload
> Last reload reason: Reload command
>
>
>
> This product contains cryptographic features and is subject to United
> States and local country laws governing import, export, transfer and
> use. Delivery of Cisco cryptographic products does not imply
> third-party authority to import, export, distribute or use encryption.
> Importers, exporters, distributors and users are responsible for
> compliance with U.S. and local country laws. By using this product you
> agree to comply with applicable laws and regulations. If you are unable
> to comply with U.S. and local laws, return this product immediately.
>
> A summary of U.S. laws governing Cisco cryptographic products may be found
> at:
> http://www.cisco.com/wwl/export/crypto/tool/stqrg.html
>
> If you require further assistance please contact us by sending email to
> export at cisco.com.
>
> Cisco 7204VXR (NPE-G1) processor (revision B) with 983040K/65536K bytes of
> memory.
> Processor board ID 29498611
> SB-1 CPU at 700Mhz, Implementation 0x401, Rev 0.2, 512KB L2 Cache
> 4 slot VXR midplane, Version 2.7
>
> Last reset from power-on
>
> PCI bus mb1 (Slots 1, 3 and 5) has a capacity of 600 bandwidth points.
> Current configuration on bus mb1 has a total of 0 bandwidth points.
> This configuration is within the PCI bus capacity and is supported.
>
> PCI bus mb2 (Slots 2, 4 and 6) has a capacity of 600 bandwidth points.
> Current configuration on bus mb2 has a total of 0 bandwidth points.
> This configuration is within the PCI bus capacity and is supported.
>
> Please refer to the following document "Cisco 7200 Series Port Adaptor
> Hardware Configuration Guidelines" on Cisco.com <http://www.cisco.com>
> for c7200 bandwidth points oversubscription and usage guidelines.
>
>
> 1 FastEthernet interface
> 3 Gigabit Ethernet interfaces
> 509K bytes of NVRAM.
>
> 1000944K bytes of ATA PCMCIA card at slot 0 (Sector size 512 bytes).
> 1000944K bytes of ATA PCMCIA card at slot 1 (Sector size 512 bytes).
> 62592K bytes of ATA PCMCIA card at slot 2 (Sector size 512 bytes).
> 16384K bytes of Flash internal SIMM (Sector size 256K).
> Configuration register is 0x2102
>
>
> Starting yesterday afternoon, I saw high CPU usage and numerous output
> drops appear. My first instinct was that the GBICs had started dying, so I
> replaced them: no change.
>
> Then I thought we were the victim of a DDoS, but my graphs show no increase
> in the number of packets or anything like that.
>
> I have been debugging this and found the following:
>
> LNS1.IX1#sh interfaces gi0/2 controller
> GigabitEthernet0/2 is up, line protocol is up
> Hardware is BCM1250 Internal MAC, address is 000b.fcdd.c41a (bia
> 000b.fcdd.c41a)
> Description: F=B, E=BB2.IX1, P=Gi9/1
> Internet address is 77.246.80.101/31
> MTU 9216 bytes, BW 1000000 Kbit, DLY 10 usec,
> reliability 255/255, txload 29/255, rxload 34/255
> Encapsulation 802.1Q Virtual LAN, Vlan ID 1., loopback not set
> Keepalive set (10 sec)
> Full Duplex, 1000Mbps, link type is auto, media type is SX
> output flow-control is XON, input flow-control is XON
> ARP type: ARPA, ARP Timeout 04:00:00
> Last input 00:00:00, output 00:00:00, output hang never
> Last clearing of "show interface" counters 00:22:35
> Input queue: 1/150/0/0 (size/max/drops/flushes); Total output drops: 2712
> *Queueing strategy: Class-based queueing*
> Output queue: 311/1000/0 (size/max total/drops)
> 5 minute input rate 133959000 bits/sec, 28885 packets/sec
> 5 minute output rate 115013000 bits/sec, 20137 packets/sec
> 39283651 packets input, 1536817418 bytes, 0 no buffer
> Received 535 broadcasts (0 IP multicasts)
> 0 runts, 0 giants, 0 throttles
> 164 input errors, 0 CRC, 0 frame, 164 overrun, 0 ignored
> 0 watchdog, 539 multicast, 0 pause input
> 27456129 packets output, 2625296538 bytes, 0 underruns
> 0 output errors, 0 collisions, 0 interface resets
> 0 babbles, 0 late collision, 0 deferred
> 0 lost carrier, 0 no carrier, 0 pause output
> 0 output buffer failures, 0 output buffers swapped out
> Interface GigabitEthernet0/2 (idb 0x50098218)
> Hardware is BCM1250 Internal MAC (Revision B2/B3)
> Network connection mode is AUTO
> network link is up
> Config is 1000Mbps, Full Duplex
> Selected media-type is GBIC
> GBIC type is 1000BaseSX
> MAC Registers:
> mac_cfg = 0x000000C8000A0176, mac_thrsh_cfg = 0x0000080400084004
> mac_vlantag = 0x0000000000000000, mac_frame_cfg = 0x241C400000280200
> mac_adfilter_cfg = 0x0000000000000E28, mac_enable = 0x0000000000000C11
> mac_status = 0x0000000000040004, mac_int_mask = 0x00004F0000C300C3
> mac_txd_ctl = 0x000000000000000F, mac_eth_addr = 0x00001AC4DDFC0B00
> mac_fifo_ptrs = 0x241C400000280200, mac_eopcnt = 0x000044001B1B1B1B
> MAC RX is enabled RX DMA - channel 0 is enabled, channel 1 is disabled
> MAC TX is enabled TX DMA - channel 0 is enabled, channel 1 is disabled
> Device status = 1000 Mbps, Full-Duplex
> PHY Registers:
> PHY is Marvell 88E1011S (Rev 1.3)
> Control = 0x1000 Status = 0x796D
> PHY ID 1 = 0x0141 PHY ID 2 = 0x0C62
> Auto Neg Advertisement = 0x01A0 Link Partner Ability = 0x4120
> Auto Neg Expansion = 0x0000 Next Page Tx = 0x2001
> Link Partner Next Page = 0x0000 1000BaseT Control = 0x0000
> 1000BaseT Status = 0x0000 Extended Status = 0xC000
> PHY Specific Control = 0x0008 PHY Specific Status = 0xAD04
> Interrupt Enable = 0x6C00 Interrupt Status = 0x0000
> Ext PHY Spec Control = 0x0C64 Receive Error Counter = 0x0000
> LED Control = 0x4100
> Ext PHY Spec Control 2 = 0x006A Ext PHY Spec Status = 0xA017
> PHY says Link is UP, Speed 1000Mbps, Full-Duplex [AUTONEG Done]
> Physical Interface - GBIC
> AUTONEG - Our ability is 1000M/FD Pause Capable (Asymmetric)
> AUTONEG - Partner ability is 1000M/FD
> GBIC registers:
> Register 0x00: 01 04 01 00 00 00 01 20
> Register 0x08: 40 0C 01 01 0D 00 00 00
> Register 0x10: 37 1E 00 00 4F 45 4D 20
> Register 0x18: 20 20 20 20 20 20 20 20
> Register 0x20: 20 20 20 20 00 00 00 00
> Register 0x28: 47 42 49 43 2D 53 58 20
> Register 0x30: 20 20 20 20 20 20 20 20
> Register 0x38: 00 00 00 00 03 52 00 BA
> Register 0x40: 00 1A 00 00 42 31 30 34
> Register 0x48: 38 31 39 31 20 20 20 20
> Register 0x50: 20 20 20 20 30 39 30 39
> Register 0x58: 32 39 20 20 68 B0 01 5A
> Register 0x60: 20 20 20 20 20 20 20 20
> Register 0x68: 20 20 20 20 20 20 20 20
> Register 0x70: 20 20 20 20 20 20 20 20
> Register 0x78: 20 20 20 20 20 20 20 20
> PartNumber: GBIC-SX
> PartRev: B
> SerialNo: B1048191
> Options: 0
> Length(9um/50um/62.5um): 000/550/300
> Date Code: 090929
> Gigabit Ethernet Codes: 1
> Internal Driver Information:
> lc_ip_turbo_fs = 0x6236BE18, ip_routecache = 0x11 (dfs = 0/mdfs = 0)
> rx cache size = 1000, rx cache end = 15
> max_mtu = 9244
> Software MAC address filter(hash:length/addr/mask/hits):
> need_af_check = 0
> 0x00: 0 ffff.ffff.ffff 0000.0000.0000 0
> 0x2E: 0 0900.2b00.0005 0000.0000.0000 0
> 0x2F: 0 0900.2b00.0004 0000.0000.0000 0
> 0x5C: 0 0100.5e00.0002 0000.0000.0000 0
> 0xC0: 0 0100.0ccc.cccc 0000.0000.0000 0
> 0xD6: 0 0180.c200.0014 0000.0000.0000 0
> 0xD7: 0 0180.c200.0015 0000.0000.0000 0
> 0xE6: 0 000b.fcdd.c41a 0000.0000.0000 0
> ring sizes: RX = 128, TX = 256
> rx_particle_size: 512
> Rx Channel 0:
> dma_config0 = 0x0010002000800888, dma_config1 = 0x002D000000600029
> dma_dscr_base = 0x000000000C218A40, dma_dscr_cnt = 0x0000000000000080
> dma_cur_dscr_a = 0x000010000C29FC82, dma_cur_dscr_b = 0x02D4000000000001
> dma_cur_daddr = 0x000080000C2190E0
> rxring = 0x0C218A40, shadow = 0x50098FA4, head = 26 (0x0C218BE0)
> rx_overrun=78512, rx_nobuffer=0, rx_discard=0
> Error Interrupts: rx_int_dscr = 0, rx_int_derr = 0, rx_int_drop = 53
> Tx Channel 0:
> dma_config0 = 0x0000000001001088, dma_config1 = 0x00B6000000000010
> dma_dscr_base = 0x000000000C219280, dma_dscr_cnt = 0x0000000000000000
> dma_cur_dscr_a = 0x00000F000C27F980, dma_cur_dscr_b = 0x0000000000000000
> dma_cur_daddr = 0x000000000C219430
> txring = 0x0C219280, shadow = 0x657D0A14, head = 164, tail = 165, tx_count
> = 1
> Error Interrupts: tx_int_dscr = 0, tx_int_derr = 0, tx_int_dzero = 0
> chip_state = 2, ds->tx_limited = 0
> throttled = 0, enabled = 0, disabled = 0
> reset=6(init=1, restart=5), auto_restart=1
> tx_underflow = 0, tx_overflow = 0
> rx_underflow = 0, rx_overflow = 0, filtered_pak=0
> descriptor mismatch = 0, fixed alignment = 52530
> bad length = 0 dropped, 0 corrected
> unexpected sop = 0
> Address Filter:
> Promiscuous mode OFF
> Exact match table (for unicast, maximum 8 entries):
> Entry 0 MAC Addr = 000b.fcdd.c41a
> (All other entries are empty)
> Hash match table (for multicast, maximum 8 entries):
> Entry 0 MAC Addr = 0100.0ccc.cccc
> Entry 1 MAC Addr = 0900.2b00.0004
> Entry 2 MAC Addr = 0900.2b00.0005
> Entry 3 MAC Addr = 0180.c200.0014
> Entry 4 MAC Addr = 0180.c200.0015
> Entry 5 MAC Addr = 0100.5e00.0002
> (All other entries are empty)
> Statistics:
> Rx Bytes 19767582054 Tx Bytes
> 19944011802
> Rx Good Packets 27498925 Tx Good Packets
> 27480126
> Rx Multicast 544
> Rx Broadcast 0
>
> Rx Bad Pkt Errors 0 Tx Bad Pkt Errors
> 0
> Rx FCS Errors 0 Tx FCS Errors
> 0
> Rx Runt Errors 0 Tx Runt Errors
> 0
> Rx Oversize Errors 0 Tx Oversize Errors
> 0
> Rx Length Errors 0 Tx Collisions
> 0
> Rx Code Errors 0 Tx Late Collisions
> 0
> Rx Dribble Errors 0 Tx Excessive Collisions
> 0
> Tx Abort Errors
> 0
>
>
> My queueing strategy went from FIFO to class-based queueing!?! How is that
> possible?
>
> Any ideas on what might be causing this?
>
> Thanks.
>
> Regards.
>
> Y.
>
> --
> Youssef BENGELLOUN-ZAHR ………………………………………………
> Ingénieur Réseaux et Télécoms
>
>
> Technopole de l'Aube en Champagne - BP 601 - 10901 TROYES Cedex 9
> Agence Paris : 6, rue Charles Floquet - 92120 MONTROUGE
> Tel +33 (0) 825 000 720
> Tel. direct +33 (0) 1 77 35 59 14
> Tel. portable +33 (0) 6 22 42 63 80
> Email ybz at 720.fr
> ……………………………………………………………………………….....www.720.fr
>
_______________________________________________
cisco-nsp mailing list cisco-nsp at puck.nether.net
https://puck.nether.net/mailman/listinfo/cisco-nsp
archive at http://puck.nether.net/pipermail/cisco-nsp/
--
Youssef BENGELLOUN-ZAHR ………………………………………………
Ingénieur Réseaux et Télécoms
Technopole de l'Aube en Champagne - BP 601 - 10901 TROYES Cedex 9
Agence Paris : 6, rue Charles Floquet - 92120 MONTROUGE
Tel +33 (0) 825 000 720
Tel. direct +33 (0) 1 77 35 59 14
Tel. portable +33 (0) 6 22 42 63 80
Email ybz at 720.fr
……………………………………………………………………………….....www.720.fr