Our MaxJobRetirementTime expression does not work with WANT_HOLD
Since 7.3.X, Condor allows an expression that puts a misbehaving job on hold directly from the startd.
The knob to turn is WANT_HOLD
(see also #2541)
However, when WANT_HOLD evaluates to True, the startd will still wait for
MaxJobRetirementTime for the job to go away.
Our extremely large default does not work well with this;
the desired behaviour is to kick out the job on a short timescale.
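To illustrate the kind of behaviour I mean, here is a minimal sketch, not a tested configuration. It assumes WANT_HOLD is already defined elsewhere in the config, and the macro names HOLD_GRACE_TIME and NORMAL_RETIREMENT_TIME, as well as both numeric values, are hypothetical placeholders rather than our actual settings.

```
# Sketch only: HOLD_GRACE_TIME and NORMAL_RETIREMENT_TIME are hypothetical names,
# and the values below are placeholders.
# Short timescale (seconds) to use when the job is about to be put on hold.
HOLD_GRACE_TIME = 60
# Stand-in for the large retirement time used in all other cases.
NORMAL_RETIREMENT_TIME = 86400
# Assumes WANT_HOLD is defined; if it is True, retire quickly, otherwise keep the long default.
MaxJobRetirementTime = ifThenElse(($(WANT_HOLD)) =?= True, $(HOLD_GRACE_TIME), $(NORMAL_RETIREMENT_TIME))
```

The point of the sketch is only that the retirement time becomes conditional on WANT_HOLD instead of being one large constant.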
We currently define
I think we should use something along the lines of
and have a default of
PS: As usual, we should consider whether the *_GRACE_TIME values should be piecemeal (like START) or defined directly (like PREEMPT_GRACE_TIME is now).