[j-nsp] EX stack of 4 switches stops routing, switching, ...

Laurent CARON lcaron at unix-scripts.info
Sun Jan 5 11:52:19 EST 2014


Hi,

I am running a chassis composed of 2 EX4200s and 2 EX4500s.

One of the REs rebooted by itself on Dec 26th.

I managed to collect some logs:

Dec 25 12:21:41  swa eventd: sendto: Cannot allocate memory
Dec 25 12:21:44  swa /kernel: rt_pfe_veto: Memory over consumed. Op 1, rtsm_id 47, msg type 2
Dec 25 12:21:59  swa last message repeated 3 times
Dec 25 12:22:49  swa last message repeated 10 times
Dec 25 12:22:54  swa /kernel: rt_pfe_veto: Memory over consumed. Op 1, rtsm_id 47, msg type 2
Dec 25 12:22:59  swa /kernel: rt_pfe_veto: Memory over consumed. Op 1, rtsm_id 47, msg type 2
....
Jan  5 12:50:37  swa /kernel: rt_pfe_veto: Memory over consumed. Op 8, rtsm_id 0, msg type 10
Jan  5 12:50:38  swa rpd[20440]: RPD_KRT_Q_RETRIES: Route Update: No buffer space available
Jan  5 12:50:42  swa /kernel: rt_pfe_veto: Memory over consumed. Op 8, rtsm_id 
...
Jan  5 15:09:17  swa /kernel: rt_pfe_veto: Memory over consumed. Op 8, rtsm_id 0, msg type 10
Jan  5 15:09:17  swa /kernel: rt_pfe_veto: Possible slowest client is pfem2. States processed - 117754359. States to be processed - 27
Jan  5 15:09:22  swa /kernel: rt_pfe_veto: Memory over consumed. Op 8, rtsm_id 0, msg type 10
Jan  5 15:09:22  swa /kernel: rt_pfe_veto: Possible slowest client is pfem2. States processed - 117754359. States to be processed - 27
Jan  5 15:09:26  swa rpd[20440]: RPD_KRT_Q_RETRIES: Route Update: No buffer space available
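
For anyone who wants to quantify an excerpt like the one above, here is a minimal Python sketch that tallies the rt_pfe_veto messages by (Op, rtsm_id, msg type), folds in "last message repeated N times" lines, and counts the reported slowest client and the RPD_KRT_Q_RETRIES events. The function name `summarize` and the `SAMPLE` subset are illustrative assumptions, and the regular expressions are based only on the message formats seen in these logs, so truncated lines are simply skipped.

```python
import re
from collections import Counter

# A few lines copied from the excerpt above (illustrative subset only).
SAMPLE = """\
Dec 25 12:21:41  swa eventd: sendto: Cannot allocate memory
Dec 25 12:21:44  swa /kernel: rt_pfe_veto: Memory over consumed. Op 1, rtsm_id 47, msg type 2
Dec 25 12:21:59  swa last message repeated 3 times
Jan  5 15:09:17  swa /kernel: rt_pfe_veto: Memory over consumed. Op 8, rtsm_id 0, msg type 10
Jan  5 15:09:17  swa /kernel: rt_pfe_veto: Possible slowest client is pfem2. States processed - 117754359. States to be processed - 27
Jan  5 15:09:26  swa rpd[20440]: RPD_KRT_Q_RETRIES: Route Update: No buffer space available
"""

# Patterns assumed from the message formats shown in this post.
VETO = re.compile(
    r"rt_pfe_veto: Memory over consumed\. Op (\d+), rtsm_id (\d+), msg type (\d+)"
)
SLOW = re.compile(r"rt_pfe_veto: Possible slowest client is (\w+)\.")
REPEAT = re.compile(r"last message repeated (\d+) times")


def summarize(lines):
    """Tally veto, slowest-client, and KRT retry messages from syslog lines."""
    vetoes = Counter()        # (op, rtsm_id, msg_type) -> count
    slow_clients = Counter()  # client name -> count
    krt_retries = 0
    last_veto = None          # veto key of the previous line, if it was a veto

    for line in lines:
        repeat = REPEAT.search(line)
        if repeat:
            # Syslog compresses duplicates; credit them to the previous veto.
            if last_veto is not None:
                vetoes[last_veto] += int(repeat.group(1))
            continue

        last_veto = None
        veto = VETO.search(line)
        if veto:
            last_veto = tuple(int(g) for g in veto.groups())
            vetoes[last_veto] += 1
            continue

        slow = SLOW.search(line)
        if slow:
            slow_clients[slow.group(1)] += 1
        elif "RPD_KRT_Q_RETRIES" in line:
            krt_retries += 1

    return {"vetoes": vetoes, "slow_clients": slow_clients,
            "krt_retries": krt_retries}


if __name__ == "__main__":
    print(summarize(SAMPLE.splitlines()))
```

Feeding the full messages file instead of `SAMPLE` (for example `summarize(open(path))`, with `path` being wherever the log was saved) would show whether the Op 1 and Op 8 vetoes cluster around particular times.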

Today the switch kept switching for a while but no longer routed packets.

The ARP table was empty.

Restarting the routing process only made the switch unresponsive, so I
had to reboot it via the console port.

This switch only handles about three dozen LACP aggregates (a few of them
10Gb, the rest Gigabit), a few SVIs, a few pure L2 VLANs, and 100 firewall
rules, with no DHCP snooping. I use OSPF on ~30 interfaces.

The only "fancy" features I use are:
Graceful switchover
RSTP
LLDP
LLDP-MED
NSB
Ethernet storm control

Does anyone have a clue about this?

Thanks

Laurent


