[j-nsp] MX80 watchdog
Tom Bird
tom at marmot.org.uk
Mon Jun 12 11:34:05 EDT 2023
Afternoon,
I've been upgrading some MX80 routers from 15.1, and they consistently
seem to fall over during periods of strenuous SSD access, or indeed once
during a "commit check".
We thought this might be due to the uptime (~1500 days), so we have been
rebooting them prior to the upgrade, which has mostly stopped the problem
from happening. Not completely, however: they still sometimes get stuck
for about an hour, after which they reboot and continue to work.
watchdog: scheduling fairness gone for 3540 seconds now.
(da1:umass-sim1:1:0:0): Synchronize cache failed, status == 0x34, scsi status == 0x0
Automatic reboot in 15 seconds - press a key on the console to abort
Rebooting...
I'd like them to wait rather less than an hour. I see the watchdog can
be configured, but I can't find any useful documentation about exactly
what conditions make it fire or what the defaults are.
Currently there is no configuration under "system processes watchdog",
and it looks like it can be enabled, disabled and the timeout set up to
3600 seconds.
So my question is: is it this watchdog that is resetting the box after
an hour, and would it be reasonable to set the timeout to, say, 300
seconds so that there is less downtime if it goes wrong?
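
For reference, a sketch of the change being considered. This is untested,
and it assumes the statements are named "enable" and "timeout" as the CLI
completion under "system processes watchdog" suggests. Whether this
setting governs the "scheduling fairness" watchdog in the log above is
exactly the open question.

```
# Assumed statement names; 300 is the example value from the question
# (the CLI accepts a timeout of up to 3600 seconds).
set system processes watchdog enable
set system processes watchdog timeout 300
```

Running "commit confirmed" instead of a plain "commit" would roll the
change back automatically if the box misbehaves afterwards.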
Thanks,
--
Tom
:: www.portfast.co.uk / @portfast
:: hosted services, domains, virtual machines, consultancy