[j-nsp] curious optic issue

Saku Ytti saku at ytti.fi
Mon Jan 28 02:50:06 EST 2013


At least twice now I've seen an XFP whose temperature reading starts to
draw a weird saw-tooth, like this: http://ytti.fi/ddm2.png
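
If you already poll DDM temperature, a saw-tooth like the one in the graph
could probably be flagged by counting how often the reading changes
direction. A minimal sketch in Python; the function name, the 1.0 degree
step threshold and both sample series are made up for illustration and are
not taken from a real optic:

```python
def count_reversals(samples, min_step=1.0):
    """Count direction changes between consecutive temperature samples.

    Steps smaller than min_step (an assumed noise floor, in degrees) are
    ignored so that ordinary jitter does not count as a saw-tooth.
    """
    reversals = 0
    last_dir = 0  # 0 = no significant step seen yet, 1 = rising, -1 = falling
    for prev, cur in zip(samples, samples[1:]):
        delta = cur - prev
        if abs(delta) < min_step:
            continue
        direction = 1 if delta > 0 else -1
        if last_dir and direction != last_dir:
            reversals += 1
        last_dir = direction
    return reversals


# Synthetic series, for illustration only.
steady = [40.0, 40.2, 40.1, 40.3, 40.2]
sawtooth = [40.0, 45.0, 40.0, 45.0, 40.0, 45.0]

print(count_reversals(steady))
print(count_reversals(sawtooth))
```

A healthy optic should give a count near zero over a polling window, while
a saw-tooth gives a count close to the number of samples. What count is
worth alerting on is something you would have to tune yourself.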

When this occurs it can be accompanied by messages such as:

- tfeb0 MQchip 0 XE 0 Throttle: %PFE-4: Last 254 seconds have seen interrupt throttling at least once per second
- MQchip 0 XE 0 Throttle: Last 10 seconds have seen interrupt throttling at least once per second
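
To check whether other boxes log the same thing, something like the
following could be run over collected syslog. It is only a sketch: the
regular expression is built from nothing more than the two sample messages
above, so other Junos releases may format the line differently, and the
function name is hypothetical.

```python
import re

# Pattern derived only from the two sample messages quoted above.
# search() is used so that any timestamp/hostname prefix is skipped.
THROTTLE_RE = re.compile(
    r"MQchip (?P<chip>\d+) XE (?P<xe>\d+) Throttle: "
    r"(?:%PFE-\d+: )?Last (?P<seconds>\d+) seconds have seen interrupt throttling"
)


def find_throttle_events(lines):
    """Return (chip, xe, seconds) for every line that matches the pattern."""
    events = []
    for line in lines:
        m = THROTTLE_RE.search(line)
        if m:
            events.append((int(m["chip"]), int(m["xe"]), int(m["seconds"])))
    return events


sample = [
    "tfeb0 MQchip 0 XE 0 Throttle: %PFE-4: Last 254 seconds have seen "
    "interrupt throttling at least once per second",
    "MQchip 0 XE 0 Throttle: Last 10 seconds have seen "
    "interrupt throttling at least once per second",
    "some unrelated log line",
]

print(find_throttle_events(sample))
```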

When it does occur, the interface may flap 10 times an hour, twice a day,
or anything in between.

The scary part is that all other ISIS/LDP on the local router might flap
_AND_ all ISIS/LDP on the far-end (JNPR) router might flap.
The only interface seeing an actual ifdown is the link with the affected
optic; the other interfaces just appear to stop sending ISIS hellos.

Obviously I'm not running JNPR optics, but the two optics I've seen this
on were from different vendors who are not using the same source. I think
I've only seen it on 11.4R3 so far.

The fix is to remove and reinsert the optic, or to reload the router. I've
not yet tried 'test xfp 1 power off|on', but I'm guessing it'll help too.

Has anyone else seen this? My best guess is that for some reason JNPR does
something which causes the optic to do something which raises an
interrupt. And as it propagates to the far end, it should be something
like autonegotiation, maybe? Or clock election? That could cause both
sides to police interrupts, which might explain why ISIS on unrelated
interfaces times out. A highly speculative explanation, but it's all
I've got.


I can't be arsed to open a JTAC case. I tried that with one batch of SFPs
which will crash the PFE on every MX (due to I2C being too slow to
answer), but JNPR wasn't interested in fixing it, as obviously it's not a
bug, since it does not happen on JNPR-stickered optics.
At least now I can go to a DC, plug in my JNPR-crash-tool and crash
competitors' routers with no traces in syslogs.

-- 
  ++ytti

