[j-nsp] EX4200 VC PFE crashes

David Siebörger drs at sieborger.nom.za
Thu Jan 17 08:38:48 EST 2013


Hi,

I've experienced something at least slightly similar.  We run VC pairs of
EX4200s at the campus distribution layer, acting as the default gateways for
end-user subnets ranging from /27s to (in one case) a /21, also with OSPF +
OSPFv3 and LAGs down to access switches.

The symptoms I've seen most often are different to yours: one of the VCs
will suddenly stop responding to ARP/NDP requests from some users' PCs at
some point after two weeks of uptime.  Digging in the PFE shows that the
affected PCs have nhdb entries in the "hold" state.  (The other VCs do the
same thing, though much less often.)  Rebooting the master fixes the problem
-- for another two weeks.  I've experienced the same thing while running
JUNOS 10.4R9, 11.1R2, and 12.1R1.

However, on one occasion pfem crashed and left a core dump, as you've
described.  pfem restarted and traffic returned to normal within a minute or
two.  JTAC analysed the core dump, which resulted in PR790201; a fix is
included in recent releases of 12.x:

https://prsearch.juniper.net/InfoCenter/index?page=prcontent&id=PR790201

JTAC have now told me that both sets of symptoms are addressed by that fix. 
I've deployed 12.1R4 on the worst-affected VC and it's now at 28 days uptime
without incident.  I'm not celebrating yet because our university is still
on summer vacation so network usage is lower than normal, but so far so
good....


-- 
David Siebörger
drs at sieborger.nom.za


