[f-nsp] High CPU MLX-4

Tim Warnock timoid at timoid.org
Wed Apr 19 19:13:34 EDT 2017


What do a `debug packet capture` and `show tasks` look like (rconsole lp X + enable mode)?
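
For anyone collecting that output, a rough sketch of the usual sequence (slot 1 is assumed here; the exact name of the capture command is as given above and may vary by release):

telnet@MLX# rconsole 1            <- attach to the LP console on slot 1
LP-1> enable
LP-1# show tasks                  <- per-task CPU usage on the LP
LP-1# debug packet capture        <- capture of the packets hitting the LP CPU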

-----Original Message-----
From: foundry-nsp [mailto:foundry-nsp-bounces at puck.nether.net] On Behalf Of Joe Lao
Sent: Thursday, 20 April 2017 12:34 AM
To: Eldon Koyle <ekoyle+puck.nether.net at gmail.com>
Cc: foundry-nsp <foundry-nsp at puck.nether.net>
Subject: Re: [f-nsp] High CPU MLX-4

----------
Total Packets Received:                  479429
MPLS uplink packets received:            0
    VPLS packets received:               0
    VLL packets received:                0
    L3 VPN packets received:             0
    Other MPLS packets received:         0
ARP packets received:                    377
    ARP request packets received:        353
    ARP response packets received:       353
IPV4 packets received:                   478284
    IPv4 unicast packets routed:         0
    IPv4 protocol packets received:      32
    GRE tunnel packets received:         478252
    6to4 tunnel packets received:        0
IPV6 packets received:                   755
    IPv6 unicast packets routed:         0
    IPv6 protocol packets received:      700                      
IPv4 multicast packets routed:           0
IPv6 multicast packets routed:           0
L2VPN endpoint packets received:         0
    VPLS endpoint packets received:      0
    VLL endpoint packets received:       0
    Local-VLL endpoint packets received: 0
L2 packets received:                     1075
    L2 known unicast packets forwarded:  0
    L2 unknown unicast packets flooded:  0
    L2 broadcast Packets flooded:        353
    L2 multicast Packets flooded:        722
    Packets received for SA learning:    55
Other packets received:                  0
Total Packets dropped:                   13

Packet drop causes:
        13 (56-Ipv6 protocol drop(PFE))                           
ARP packets captured for DAI:            377
ARP packets failed DAI:                  0
Per port packet counters:
    Packets received on port 1/1:        479409
    Packets received on port 1/2:        20
    Packets received on port 1/3:        0
    Packets received on port 1/4:        0
 
 
After 20 seconds
Sent: Wednesday, April 19, 2017 at 9:22 PM
From: "Eldon Koyle" <ekoyle+puck.nether.net at gmail.com>
To: "Joe Lao" <Joelao8392 at mail.com>
Cc: foundry-nsp <foundry-nsp at puck.nether.net>, "Perrin Richardson" <perrin.richardson at me.com>
Subject: Re: [f-nsp] High CPU MLX-4
Have you checked the output of `dm pstat 1` ?  It resets counters each run, so I usually ignore the output of the first run, wait 10-30 seconds, and run it again.  It shows the kinds of packets, and their counts, that are hitting the LP CPU. 
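
A minimal sketch of that workflow (the prompt, and whether it is run on the MP or via rconsole on the LP, are assumptions to verify on your box):

# dm pstat 1        <- first run: prints and resets the counters, ignore it
  (wait 10-30 seconds)
# dm pstat 1        <- second run: shows what hit the LP CPU during the interval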
 
-- 
Eldon
  
On Apr 19, 2017 6:35 AM, "Joe Lao" <Joelao8392 at mail.com> wrote: 

	On MLX-4-2
	 
	sh int brief
	Port   Link     Port-State   Dupl Speed Trunk Tag Priori MAC            Name           Type              
	1/1    Up       Forward      Full 10G    
	1/2    Up       Forward      Full 10G   
	1/3    Up       Forward      Full 10G  
	1/4    Up       Forward      Full 10G  
	mgmt1  Up       Forward      Full 1G   
	 
	MLX-4-1 (Problematic Unit)
	 
	sh int brief
	Port   Link     Port-State   Dupl Speed Trunk Tag Priori MAC            Name           Type              
	1/1    Up       Forward      Full 10G  
	1/2    Up       Forward      Full 10G  
	1/3    Disabled None         None None
	1/4    Disabled None         None None   
	mgmt1  Up       Forward      Full 1G   
	 
	sh cpu lp 1
	SLOT  #:               LP CPU UTILIZATION in  %:
	             in 1 second:  in 5 seconds:  in 60 seconds: in 300 seconds:
	     1:        95            95             95              95
	 
	 
	 
	Sent: Wednesday, April 19, 2017 at 2:37 PM
	From: "Perrin Richardson" <perrin.richardson at me.com <mailto:perrin.richardson at me.com> >
	To: "Iain Robertson" <iain.robertson at gmail.com <mailto:iain.robertson at gmail.com> >
	Cc: "Joe Lao" <Joelao8392 at mail.com <mailto:Joelao8392 at mail.com> >, foundry-nsp <foundry-nsp at puck.nether.net <mailto:foundry-nsp at puck.nether.net> > 

	Subject: Re: [f-nsp] High CPU MLX-4
	+1 :) 
	 
	  
	On 19 Apr 2017, at 4:40 PM, Iain Robertson <iain.robertson at gmail.com> wrote:
	  
	Are all unused interfaces in the disabled state? 
	 
	I've seen a circumstance where, with some optics, an enabled interface with no remote device connected to it results in high LP CPU on the affected line cards.  Workaround in that case was to ensure that all disused/disconnected interfaces are disabled.
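
	A minimal sketch of that workaround in NetIron-style CLI (the port range is only an example; use whichever ports are actually unused):

	(config)# interface ethernet 1/3 to 1/4
	(config-mif-1/3-1/4)# disable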
	 
	 
	  
	On 19 April 2017 at 14:40, Joe Lao <Joelao8392 at mail.com> wrote: 

		Boot     : Version 5.6.0T165 Copyright
		 
		(config)#sh conf | inc icmp
		no ip icmp redirects
		 
		on both
		 
		 
		  
		Sent: Wednesday, April 19, 2017 at 8:57 AM
		From: "Eldon Koyle" <ekoyle+puck.nether.net at gmail.com <mailto:ekoyle%2Bpuck.nether.net at gmail.com> >
		To: "Joe Lao" <Joelao8392 at mail.com <mailto:Joelao8392 at mail.com> >
		Cc: foundry-nsp <foundry-nsp at puck.nether.net <mailto:foundry-nsp at puck.nether.net> >
		Subject: Re: [f-nsp] High CPU MLX-4
		Have you disabled ICMP redirects?  That is a common cause of unexplained high CPU utilization.  I think the command is: no ip redirect (either interface or global). 
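
		For reference, a sketch of both forms (the global form matches the `no ip icmp redirects` shown earlier in the thread; the per-interface form is as described here and worth verifying against the docs for your release):

		(config)# no ip icmp redirects
		(config)# interface ethernet 1/1
		(config-if-e10000-1/1)# no ip redirect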
		 
		Also, which code version are you running?
		 
		-- 
		Eldon
		  
		On Apr 18, 2017 7:14 PM, "Joe Lao" <Joelao8392 at mail.com> wrote: 

			Hello List
			 
			My colleague posted on this list last month about an LP CPU issue experienced on MLX routers with GRE tunnels
			 
			The issue did not resolve itself; instead, we asked our customers not to send outbound traffic through us
			 
			However, a new issue has arisen
			 
			 
			Our topology is as follows
			 
			CARRIER A -----> MLX-4-1 ---- MLX-4-2 ----> CARRIER B
			The Carrier B connection is specifically designed to absorb attacks; Carrier A backhauls clean/protected traffic
			 
			MLX-4-2 holds our GRE tunnels
			 
			 
			Now we are seeing 95% LP CPU on MLX-4-1, and a packet capture shows only GRE packets from MLX-4-2 destined for the customer's GRE endpoint
			 

			SLOT  #:               LP CPU UTILIZATION in  %:
			             in 1 second:  in 5 seconds:  in 60 seconds: in 300 seconds:
			     1:        94            94             94              94
			 
			 
			LP-1#show tasks
			  Task Name       Pri  State  PC        Stack       Size  CPU Usage(%)  task vid
			  --------------  ---  -----  --------  --------  ------  ------------  --------
			  con              27  wait   0005c710  040c5dc8   32768             0  0
			  mon              31  wait   0005c710  041b7f10    8192             0  0
			  flash            20  wait   0005c710  041c6f40    8192             0  0
			  dbg              30  wait   0005c710  041beec0   16384             0  0
			  main              3  wait   0005c710  23cc6f40  262144             1  101
			  LP-I2C            3  wait   0005c710  27d70ee0    4096             0  101
			  LP-Assist         3  wait   0005c710  29bbef00   32768             0  101
			  LP-FCopy          3  wait   0005c710  29bc3f00   16384             0  101
			  LP-VPLS-Offld     3  wait   0005c710  29bc8f00   16384             0  101
			  LP-OF-Offld       3  wait   0005c710  29bcdf00   16384             0  101
			  LP-TM-Offld       3  wait   0005c710  29bd2f00   16384             0  101
			  LP-Stats          3  wait   0005c710  29bd7f60   16384             0  101
			  LP-IPC            3  wait   0005c710  29c18f00  262144             0  101
			  LP-TX-Pak         3  wait   0005c710  29c21f00   32768             0  101
			  LP-RX-Pak         3  wait   0005c710  29c42f38  131072            97  101
			  LP-SYS-Mon        3  wait   0005c710  29c47f28   16384             0  101
			  LP-RTD-Mon        3  wait   0005c710  29c4cf08   16384             0  101
			  LP-Console        3  ready  20b636c0  29c6df78  131072             0  101
			  LP-CPU-Mon        3  wait   0005c710  29c96f40  163840             0  101
			 
			 
			MLX-4-2       Client GRE endpoint
			xxxxxxxx -> xxxxx [Protocol:47]
			**********************************************************************
			[ppcr_rx_packet]: Packet received
			Time stamp : 00 day(s) 00h 14m 33s:,
			TM Header: [ 8026 2000 0000 ]
			Type: Fabric Unicast(0x00000008) Size: 152 Parity: 2 Src IF: 0
			Src Fap: 0 Dest Port: 0  Src Type: 0 Class: 0x00000000
			**********************************************************************
			Packet size: 146, XPP reason code: 0x00004747
			 
			Traffic levels are very low; the connection to Carrier A shows approximately 40 Mbps
			 
			LP CPU on MLX-4-2 is
			 
			SLOT  #:               LP CPU UTILIZATION in  %:
			             in 1 second:  in 5 seconds:  in 60 seconds: in 300 seconds:
			     1:        1             1              1               1 
			 
			 
			As a test, I shut the port between MLX-4-1 and MLX-4-2; CPU usage on MLX-4-1 immediately dropped to 1%
			 
			 
			No protocols run over the GRE tunnel; we announce /24s and the like and route through the tunnel using static routes
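
			For context, a minimal NetIron-style sketch of that kind of setup (all addresses and the /24 are illustrative placeholders, not values from this thread):

			(config)# interface tunnel 1
			(config-tnif-1)# tunnel mode gre ip
			(config-tnif-1)# tunnel source 192.0.2.1
			(config-tnif-1)# tunnel destination 198.51.100.1
			(config-tnif-1)# ip address 10.0.0.1/30
			(config-tnif-1)# exit
			(config)# ip route 203.0.113.0/24 10.0.0.2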
			 
			 
			 
			Show port on MLX-4-1 to MLX-4-2
			 
			  Port is not enabled to receive all vlan packets for pbr
			  MTU 1548 bytes, encapsulation ethernet
			  Openflow: Disabled, Openflow Index 1
			  Cluster L2 protocol forwarding enabled
			  300 second input rate: 64599535 bits/sec, 61494 packets/sec, 0.74% utilization
			  300 second output rate: 2468 bits/sec, 4 packets/sec, 0.00% utilization
			  82862765 packets input, 10844340289 bytes, 0 no buffer
			  Received 25656 broadcasts, 27667 multicasts, 82809442 unicasts
			  0 input errors, 0 CRC, 0 frame, 0 ignored                       
			  0 runts, 0 giants
			  NP received 82871502 packets, Sent to TM 82860777 packets
			  NP Ingress dropped 10729 packets
			  9484 packets output, 726421 bytes, 0 underruns
			  Transmitted 127 broadcasts, 553 multicasts, 8804 unicasts
			  0 output errors, 0 collisions
			  NP transmitted 9485 packets, Received from TM 48717 packets
			 
			Show port on MLX-4-2 to MLX-4-1
			 
			Port is not enabled to receive all vlan packets for pbr
			  MTU 1548 bytes, encapsulation ethernet
			  Openflow: Disabled, Openflow Index 1
			  Cluster L2 protocol forwarding enabled
			  300 second input rate: 2416 bits/sec, 3 packets/sec, 0.00% utilization
			  300 second output rate: 64189791 bits/sec, 61109 packets/sec, 0.74% utilization
			  5105571056 packets input, 760042160157 bytes, 0 no buffer
			  Received 1874232 broadcasts, 5287030 multicasts, 5098409794 unicasts
			  0 input errors, 0 CRC, 0 frame, 0 ignored                       
			  0 runts, 0 giants
			  NP received 5105571056 packets, Sent to TM 5105113719 packets
			  NP Ingress dropped 457337 packets
			  590086066756 packets output, 81697023432476 bytes, 0 underruns
			  Transmitted 129784095 broadcasts, 208762136 multicasts, 589747520525 unicasts
			  0 output errors, 0 collisions
			  NP transmitted 590086072891 packets, Received from TM 590091974310 packets
			 
			 
			Cheers
			 

	_______________________________________________
	foundry-nsp mailing list
	foundry-nsp at puck.nether.net
	http://puck.nether.net/mailman/listinfo/foundry-nsp


