[j-nsp] MX80 watchdog

Tom Bird tom at marmot.org.uk
Mon Jun 12 11:34:05 EDT 2023


Afternoon,

I've been upgrading some MX80 routers from 15.1, and they consistently 
seem to fall over during periods of strenuous SSD access, or indeed once 
during a "commit check".

We thought this might be due to the uptime (~1500 days), so we have been 
rebooting them prior to the upgrade, which has mostly stopped the problem 
from happening.  Not completely, however: when it does still happen, they 
get stuck for about an hour, after which they reboot and continue to work.


watchdog: scheduling fairness gone for 3540 seconds now.
(da1:umass-sim1:1:0:0): Synchronize cache failed, status == 0x34, scsi status == 0x0
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...


I'd like them to wait rather less than an hour, and I see the watchdog 
can be configured, but I can't find any useful documentation about 
exactly what conditions make it fire and what the defaults are.

Currently there is no configuration under "system processes watchdog", 
and it looks like it can be enabled, disabled and the timeout set up to 
3600 seconds.

So my question is: is it this watchdog that is resetting the thing after 
an hour, and would it be reasonable to set the timeout to, say, 300 
seconds so there is less downtime if it goes wrong?
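
For concreteness, the change I have in mind is something like the 
following.  It is untested, and the statement names are only what the 
CLI appears to offer under that hierarchy, so treat it as a sketch 
rather than known-good config:

    set system processes watchdog enable
    set system processes watchdog timeout 300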

Thanks,
-- 
Tom

:: www.portfast.co.uk / @portfast
:: hosted services, domains, virtual machines, consultancy

