[j-nsp] kmd [2371]: JTASK_SCHED_SLIP: 13sec schedular slip
Saku Ytti
saku at ytti.fi
Sun Apr 26 02:56:58 EDT 2020
Hey Andrea,
> Seeing these regularly almost every 50mins on SRX345
>
> In the logs I see repeated VPN IKE negotiation failed events repeating hundred of times within few minutes for a VPN that needs to have its config completed.
>
> Could the many VPN events cause an RPD slip or they are unrelated?
>
> What else should be checked to troubleshoot cause? Unfortunately logs have rotated so cannot go back to see if the last VPN config (partly completed) is the cause of this or not?
Like Jared says, open a JTAC case. This is a bug in the code; these
slips should never happen, although they are sometimes benign.
RPD, like IOS and Windows 3, is cooperatively multitasked, unlike your
desktop and phone, which are pre-emptively multitasked. This means
every RPD task must voluntarily yield control so that another RPD task
may run. The developer therefore has to know when something might take
a long time, and make it yield along the way, and when it is
guaranteed to take a short time, so that yielding needs no thought.
But done right, a cooperatively multitasked system is the most
efficient one.
A common mistake is that something they wrote could, in their mind,
never possibly take a long time, but then a customer config comes
along which causes a pathologically long runtime for that task, and
the task runs too long, blocking other tasks from running.
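
To make that concrete, here is a toy sketch in Python. It is not how
RPD is implemented; the scheduler, the task names and the millisecond
costs are all made up for illustration. Each task is a generator that
reports how long it ran before yielding, and the scheduler records a
"slip" whenever a single run exceeds a threshold.

```python
from collections import deque


def run_cooperative(tasks, slip_threshold_ms):
    """Round-robin over generator tasks; each `yield` hands control back.

    Every task yields the simulated milliseconds it just spent running
    without giving up control. Returns a list of (name, ms) for each run
    that exceeded the threshold. All names here are hypothetical.
    """
    queue = deque(tasks.items())
    slips = []
    while queue:
        name, task = queue.popleft()
        try:
            ran_for_ms = next(task)  # task runs until it chooses to yield
        except StopIteration:
            continue  # task finished, drop it from the queue
        if ran_for_ms > slip_threshold_ms:
            slips.append((name, ran_for_ms))
        queue.append((name, task))
    return slips


def batched(items, cost_ms, batch):
    """Well-behaved task: does a small batch of work, then yields."""
    for start in range(0, items, batch):
        done = min(batch, items - start)
        yield done * cost_ms


def greedy(items, cost_ms):
    """Badly behaved task: walks all items before yielding once."""
    yield items * cost_ms


# Small input would be harmless; a pathological 13000-item input is not.
slips = run_cooperative(
    {
        "keepalive": batched(items=10, cost_ms=10, batch=5),
        "config_walk": greedy(items=13000, cost_ms=1),
    },
    slip_threshold_ms=1000,
)

# The same work, broken into batches, never holds the CPU for long.
fixed = run_cooperative(
    {"config_walk": batched(items=13000, cost_ms=1, batch=500)},
    slip_threshold_ms=1000,
)
```

In this sketch the greedy task is only a problem because of the input
size; with a handful of items the same code would never trip the
threshold, which is exactly why the developer did not think about
yielding.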
If you are able to identify what in your config is atypical and might
cause long run times, perhaps you can break that config up, streamline
it, or otherwise reduce the time that task needs to run.
The outcome of a blocked RPD task is today a bit less bad than it was
in older Junos, because there is a separate process handling
keepalives (PPMd) and RPD now has a few threads, so not everything is
blocked.
--
++ytti