Feature #19791
Implement a retry policy in Condor to allow jobs that go held for excess memory usage to have their memory increased and released
Start date:
04/24/2018
Due date:
% Done:
0%
Estimated time:
Stakeholders:
Description
Implement a retry policy in Condor to allow jobs that go held for excess memory usage to have their memory increased and released. Details copy/pasted from SNOW RITM0638375:
After discussing with the HTCondor team, here is what you want in the Condor JDL. I haven't tested it, so someone (Dennis, Marc, or Shreyas) needs to test it.
# Supporting attributes that need to be added to JDL are shown by + sign
# OriginalMemory = <Value of --memory>
# GraceMemory = Increment request_memory by this amount if job was put on
+OriginalMemory = 2000
+GraceMemory = 2000
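# Reference the job attributes directly: $(...) expands submit macros, and no macros named OriginalMemory or GraceMemory are defined here
+MaxAllowedMemory = OriginalMemory + GraceMemory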
request_memory = ifthenelse(isUndefined(MemoryUsage), OriginalMemory, MaxAllowedMemory)
periodic_release = (HoldReasonCode =?= 34) && (RequestMemory < MaxAllowedMemory)
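To make the intended behavior of the two expressions easier to check before testing on a real pool, here is a rough Python sketch of their logic. It is not HTCondor code: the function names and the use of None to stand in for an undefined ClassAd attribute are assumptions made for illustration, and the sketch does not model when HTCondor itself re-evaluates RequestMemory.

```python
from typing import Optional

# Hold reason code that the periodic_release expression in the JDL tests for.
MEMORY_HOLD_CODE = 34


def request_memory(memory_usage_mb: Optional[int],
                   original_mb: int,
                   grace_mb: int) -> int:
    """Mirror ifthenelse(isUndefined(MemoryUsage), OriginalMemory, MaxAllowedMemory).

    None stands in for an undefined MemoryUsage attribute (hypothetical modelling choice).
    """
    max_allowed_mb = original_mb + grace_mb
    if memory_usage_mb is None:
        # No usage recorded yet: request the original amount.
        return original_mb
    # Usage has been recorded: request the original amount plus the grace amount.
    return max_allowed_mb


def should_release(hold_reason_code: Optional[int],
                   request_memory_mb: int,
                   max_allowed_mb: int) -> bool:
    """Mirror (HoldReasonCode =?= 34) && (RequestMemory < MaxAllowedMemory).

    Comparing None to 34 yields False, which matches =?= returning
    false rather than undefined when HoldReasonCode is missing.
    """
    return hold_reason_code == MEMORY_HOLD_CODE and request_memory_mb < max_allowed_mb


if __name__ == "__main__":
    original, grace = 2000, 2000
    print(request_memory(None, original, grace))
    print(request_memory(2500, original, grace))
    print(should_release(34, 2000, original + grace))
    print(should_release(34, 4000, original + grace))
```

With the values from the JDL (2000 + 2000), the release condition only holds while the requested memory is still below 4000, so the job is released at most once. Whether RequestMemory has already been re-evaluated to MaxAllowedMemory at the moment periodic_release is checked is not captured here and would need to be confirmed when the JDL is tested.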
History
#1 Updated by Shreyas Bhat almost 3 years ago
- Target version set to v1.2.8
#2 Updated by Dennis Box over 2 years ago
- Target version changed from v1.2.8 to v1.3
#3 Updated by Dennis Box almost 2 years ago
- Target version changed from v1.3 to v1.3.3
#5 Updated by Dennis Box 3 months ago
- Target version changed from v1.3.3 to v1.3.4