Review the choice of fixed vs partitionable slots
Currently the choice of fixed vs partitionable slots is made in three places:
- in the factory configuration, using slots_layout="partitionable" in the config/submit section of the entry configuration
- in the frontend configuration, setting the attribute SLOT_LAYOUT to partitionable (it should not be passed as a parameter)
- in the frontend configuration, setting the attribute FORCE_PARTITIONABLE to True
GLIDEIN_CPUS is involved as well: the script creation/web_base/smart_partitionable.sh sets the layout back to fixed if GLIDEIN_CPUS is 1 or is unset and FORCE_PARTITIONABLE is not True.
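To make the fallback easier to discuss, here is a minimal Python sketch of the rule as described above. It is not the code in smart_partitionable.sh: the function name and arguments are hypothetical, and it assumes the condition reads as "(GLIDEIN_CPUS is 1 or unset) and FORCE_PARTITIONABLE is not True". It models only the fallback step and says nothing about whether FORCE_PARTITIONABLE overrides a fixed layout.

```python
from typing import Optional


def layout_after_fallback(layout_before_fallback: str,
                          glidein_cpus: Optional[str],
                          force_partitionable: bool) -> str:
    """Hypothetical model of the current fallback to fixed slots."""
    if layout_before_fallback != "partitionable":
        # The fallback only applies to partitionable layouts.
        return layout_before_fallback
    # Assumed reading: GLIDEIN_CPUS unset (None or empty) or equal to 1.
    single_core = glidein_cpus is None or glidein_cpus.strip() in ("", "1")
    if single_core and not force_partitionable:
        return "fixed"
    return "partitionable"


if __name__ == "__main__":
    print(layout_after_fallback("partitionable", None, False))  # fixed
    print(layout_after_fallback("partitionable", "1", True))    # partitionable
    print(layout_after_fallback("partitionable", "8", False))   # partitionable
```

Under this reading, a partitionable request with GLIDEIN_CPUS unset silently becomes fixed unless FORCE_PARTITIONABLE is True, which is the interaction that needs documenting.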
The exact interaction of the parameters should be documented.
Maybe the behavior of creation/web_base/smart_partitionable.sh should be changed to be more partitionable-friendly (even if the condor team just affirmed that partitionable slots are not production ready).
This came up while working on #10092: when extra resources are added to the main slot, the startd may fail because the layout is impossible with fixed slots.
#5 Updated by Marco Mambelli over 3 years ago
GLIDEIN_CPUS is the number of cores available to the glidein, which affects the number of cores the startd makes available to jobs (NUM_CPUS). Nothing changes if you specify a number. Some keywords are also accepted: node (all cores in the node, detected), slot (all cores in the WFM slot of the glidein, detected), and auto (currently the same as node).
These are the proposed changes for GLIDEIN_CPUS defaults (that affect the number of cores - NUM_CPUS):
GLIDEIN_CPUS   Current   Proposed change
Default        1         slot
auto           node      slot
These changes are meant to make GWMS work better on multicore systems where the glidein does not get the whole node.
Glidein slots can be fixed (N slots of 1 core each) or partitionable (one slot with all cores, which can be allocated dynamically).
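The GLIDEIN_CPUS defaults and keywords above can be sketched as follows. This is only an illustration in Python: the function name is hypothetical, and node_cores and slot_cores stand in for the values the glidein would detect.

```python
from typing import Optional


def resolve_glidein_cpus(value: Optional[str],
                         node_cores: int,
                         slot_cores: int,
                         proposed: bool = False) -> int:
    """Hypothetical resolution of GLIDEIN_CPUS to a core count."""
    if value is None or value.strip() == "":
        if not proposed:
            return 1          # current default: 1 core
        value = "slot"        # proposed default: slot
    key = value.strip().lower()
    if key == "auto":
        # auto currently means node; the proposal changes it to slot.
        key = "slot" if proposed else "node"
    if key == "node":
        return node_cores
    if key == "slot":
        return slot_cores
    return int(key)           # an explicit number is used unchanged


if __name__ == "__main__":
    # Glidein holding a 4-core slot on a 16-core node.
    print(resolve_glidein_cpus(None, 16, 4))                   # 1
    print(resolve_glidein_cpus(None, 16, 4, proposed=True))    # 4
    print(resolve_glidein_cpus("auto", 16, 4))                 # 16
    print(resolve_glidein_cpus("auto", 16, 4, proposed=True))  # 4
```

The 4-core slot on a 16-core node shows the motivation: with the current auto behavior the glidein would advertise 16 cores although it was given only 4.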
These are the proposed changes:
- The default will be partitionable (it was fixed), i.e. if nothing is specified, an 8-core node will be kept as 1 partitionable slot with 8 cores instead of 8 single-core slots.
- If you define special resources (e.g. GPUs) and add them to the main slot, the main slot will be partitionable even if you selected fixed. This is done to avoid startd errors when the number of special resources does not match the number of cores.
- The forced conversion to fixed for partitionable slots with 1 core will be removed.
These changes move GWMS further towards partitionable slots, which have been in use for a while and allow better allocation of cores, memory, and other resources.
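A minimal Python sketch of the proposed slot layout rules, for comparison with the current behavior. The function and argument names are hypothetical, and how the requested layout is derived from the factory and frontend settings is left out.

```python
from typing import Optional


def proposed_layout(requested: Optional[str],
                    extra_resources_in_main_slot: bool) -> str:
    """Hypothetical model of the proposed slot layout decision."""
    if extra_resources_in_main_slot:
        # Special resources (e.g. GPUs) in the main slot force partitionable,
        # even if fixed was selected, to avoid startd errors.
        return "partitionable"
    if requested is None:
        # New default: partitionable (was fixed).
        return "partitionable"
    # No forced conversion to fixed for 1 core: the request is honored.
    return requested


if __name__ == "__main__":
    print(proposed_layout(None, False))     # partitionable
    print(proposed_layout("fixed", False))  # fixed
    print(proposed_layout("fixed", True))   # partitionable
```

GLIDEIN_CPUS does not appear as an argument because, once the forced conversion is removed, the core count no longer influences the layout in this model.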