Support #21929

Follow up w/ OSG and HTCondor to allow a clean exit in PBS

Added by Marco Mambelli over 1 year ago. Updated 15 days ago.

Status: Closed
Priority: Normal
Category: -
Target version:
Start date: 02/19/2019
Due date:
% Done: 0%
Estimated time:
Stakeholders:
Duration:

Description

As documented in #21682, when a job submitted to a PBS system is removed, the signal is now forwarded correctly to condor, which receives it and shuts down.

PBS still sends SIGTERM and then SIGKILL only a few milliseconds later.
This is enough time for the trap to forward the first signal, but not for the process to terminate (sending back logs, ...) and clean up.
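
To illustrate the trap mentioned above, here is a minimal Python sketch of a wrapper that forwards signals to a child process. It is not the actual glidein or HTCondor code: the function name run_and_forward and the choice of signals are assumptions made for illustration. It also shows why the short delay matters: the wrapper can relay SIGTERM immediately, but the child still needs time to finish before SIGKILL arrives.

```python
import signal
import subprocess


def run_and_forward(cmd, signals=(signal.SIGTERM, signal.SIGQUIT)):
    """Run cmd as a child process and relay the listed signals to it.

    Hypothetical helper: returns the child's exit code once it has exited.
    Forwarding is instantaneous; the child's own cleanup is what needs the
    extra time between SIGTERM and SIGKILL.
    """
    child = subprocess.Popen(cmd)

    def forward(signum, _frame):
        # Relay the signal only while the child is still running.
        if child.poll() is None:
            child.send_signal(signum)

    # Install the forwarding handlers, remembering the previous ones.
    previous = {s: signal.signal(s, forward) for s in signals}
    try:
        return child.wait()
    finally:
        # Restore the original handlers once the child is gone.
        for s, handler in previous.items():
            signal.signal(s, handler)
```

A SIGKILL cannot be trapped or forwarded, so if it follows within milliseconds, neither the wrapper nor the child gets a chance to send back logs.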

Either (1) a working parameter is found to increase the delay in PBS,
or (2) HTCondorCE/BLAHP or HTCondor takes advantage of qsig, which allows sending a signal to a job, and does so before removing the job.
Solution (2) would have the advantage of controlling which signal is used and distinguishing a quick shutdown (SIGQUIT) from a graceful one (SIGTERM).
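
Solution (2) could look roughly like the following Python sketch. It assumes the PBS client commands qsig and qdel are on the PATH. The function name signal_then_remove, the 30-second default grace period, and the injectable run/sleep parameters are hypothetical and are not taken from BLAHP or HTCondor.

```python
import subprocess
import time


def signal_then_remove(job_id, fast=False, grace_seconds=30.0,
                       run=subprocess.run, sleep=time.sleep):
    """Signal a PBS job with qsig, wait, then remove it with qdel.

    fast=True requests a quick shutdown (SIGQUIT); otherwise a graceful
    one (SIGTERM). grace_seconds is an assumed value, not a PBS default.
    Returns the list of commands that were issued.
    """
    sig = "SIGQUIT" if fast else "SIGTERM"
    commands = [["qsig", "-s", sig, job_id], ["qdel", job_id]]
    # Deliver the chosen signal first so the job can start shutting down.
    run(commands[0], check=True)
    # Give the job time to send back logs and clean up.
    sleep(grace_seconds)
    # Only then remove the job from the batch system.
    run(commands[1], check=True)
    return commands
```

For example, signal_then_remove("123.server", fast=True) would issue qsig -s SIGQUIT 123.server, wait, and then run qdel 123.server. The wait is controlled by the caller rather than by the PBS kill delay.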

The role of GlideinWMS here is to facilitate, coordinate, and verify the solution.
I don't think changes in GWMS would help.

The advantage for GWMS would be receiving glidein log files also for killed jobs.

History

#2 Updated by Marco Mambelli over 1 year ago

  • Target version changed from v3_5 to v3_5_1

#3 Updated by Marco Mambelli about 1 year ago

  • Target version changed from v3_5_1 to v3_6_1

#4 Updated by Marco Mambelli about 1 year ago

  • Target version changed from v3_6_1 to v3_6_2

#5 Updated by Marco Mambelli 10 months ago

  • Target version changed from v3_6_2 to v3_6_3

#6 Updated by Marco Mambelli 6 months ago

  • Target version changed from v3_6_3 to v3_6_4

#7 Updated by Marco Mambelli about 1 month ago

  • Target version changed from v3_6_4 to v3_6_5

#8 Updated by Marco Mambelli 20 days ago

  • Target version changed from v3_6_5 to v3_6_6

#9 Updated by Marco Mambelli 15 days ago

  • Status changed from New to Closed

Closing the ticket

No progress possible from the GlideinWMS side


