Estimate the cores provided to glideins running on an entry
Factory Ops, CMS
A Factory admin can statically set the number of cores an entry is assumed to provide by using GLIDEIN_CPUS.
GLIDEIN_CPUS also accepts the keywords auto/slot/node to autodiscover the cores received.
This works relatively well once the glidein is on the node: HTCondor is configured to use the cores received.
In the Frontend, before a glidein starts on the entry, we don't know how many cores it will receive, so we assume 1 (in getGlideinCpusNum).
This causes some problems:
- Frontend will not start glideins for multicore job requests on auto/slot/node entries
- Frontend will overestimate the number of glideins to request (causing long adjustments and wasted resources; the more cores each glidein receives, the bigger the overestimate)
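The two problems above can be illustrated with a small sketch. It is not Frontend code: the function names (entry_matches, glideins_needed) and the numbers (64 idle cores, 8 cores per glidein) are hypothetical and only show the arithmetic behind assuming 1 core.

```python
import math


def entry_matches(request_cpus, assumed_cores):
    """Hypothetical match rule: a job fits only if the glidein is assumed big enough."""
    return request_cpus <= assumed_cores


def glideins_needed(idle_job_cores, cores_per_glidein):
    """Glideins required to cover the cores requested by idle jobs."""
    return math.ceil(idle_job_cores / cores_per_glidein)


# Problem 1: a 4-core job never matches an entry assumed to provide 1 core.
print(entry_matches(4, 1))  # False

# Problem 2: 64 idle cores, entry assumed to give 1 core but actually giving 8.
requested = glideins_needed(64, 1)  # what the Frontend would request
sufficient = glideins_needed(64, 8)  # what would have been enough
print(requested, sufficient)  # 64 8
```

In this made-up case the request is 8 times larger than needed, and the factor grows with the cores each glidein really receives.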
Possible solutions would estimate the cores that will be received, e.g. by:
- parsing expressions in the glidein submission
- letting admins add an estimated_glidein_cpus
- maintaining stats for the entry (max, min, avg # of cores received)
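A minimal sketch of how the last two options could fit together, assuming a hypothetical helper estimate_cores and a hypothetical EntryCoreStats class (neither exists in the code base as far as this ticket states). The order of precedence shown (static value, then admin estimate, then observed average, then 1) is an assumption made for illustration.

```python
AUTO_KEYWORDS = {"auto", "slot", "node"}


class EntryCoreStats:
    """Hypothetical running min/max/avg of cores reported by glideins on one entry."""

    def __init__(self):
        self.count = 0
        self.total = 0
        self.min_cores = None
        self.max_cores = None

    def record(self, cores):
        """Add one observation of the cores a glidein actually received."""
        self.count += 1
        self.total += cores
        self.min_cores = cores if self.min_cores is None else min(self.min_cores, cores)
        self.max_cores = cores if self.max_cores is None else max(self.max_cores, cores)

    @property
    def avg_cores(self):
        return self.total / self.count if self.count else None


def estimate_cores(glidein_cpus, admin_estimate=None, stats=None):
    """Guess the cores a glidein will receive before it starts (illustrative only)."""
    value = str(glidein_cpus).strip().lower()
    if value not in AUTO_KEYWORDS:
        return int(value)  # static value set by the Factory admin
    if admin_estimate is not None:
        return int(admin_estimate)  # admin-provided estimate
    if stats is not None and stats.count:
        return max(1, round(stats.avg_cores))  # fall back to observed average
    return 1  # current behavior: assume a single core


stats = EntryCoreStats()
for cores in (4, 8, 12):
    stats.record(cores)

print(estimate_cores("4"))                       # 4
print(estimate_cores("auto"))                    # 1
print(estimate_cores("node", admin_estimate=8))  # 8
print(estimate_cores("slot", stats=stats))       # 8
```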
#7 Updated by Marco Mambelli 11 months ago
There is no actual way to verify that multiple attributes in the attrs section conflict with each other, so:
- the documentation states: "It is a configuration error to set GLIDEIN_CPUS to a number>0 and set GLIDEIN_ESTIMATED_CPUS"
- but nothing will return an error if someone actually does it
- GLIDEIN_ESTIMATED_CPUS is just ignored
- TODO: add a consistency check, e.g. GLIDEIN_ESTIMATED_CPUS should not be set if GLIDEIN_CPUS is not set or GLIDEIN_CPUS > 0
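A rough sketch of what such a check might look like, assuming the entry attributes are available as a plain dict of name to string value. The function name check_cpus_attrs and the dict representation are hypothetical and do not reflect how the configuration is actually parsed.

```python
def check_cpus_attrs(attrs):
    """Return error messages for conflicting core attributes (illustrative only)."""
    errors = []
    if "GLIDEIN_ESTIMATED_CPUS" not in attrs:
        return errors  # nothing to check

    cpus = attrs.get("GLIDEIN_CPUS")
    if cpus is None:
        errors.append("GLIDEIN_ESTIMATED_CPUS is set but GLIDEIN_CPUS is not set")
        return errors

    try:
        is_static = int(cpus) > 0
    except ValueError:
        is_static = False  # keywords such as auto/slot/node are not numbers

    if is_static:
        errors.append("GLIDEIN_ESTIMATED_CPUS is set but GLIDEIN_CPUS is a number > 0")
    return errors


print(check_cpus_attrs({"GLIDEIN_CPUS": "auto", "GLIDEIN_ESTIMATED_CPUS": "8"}))  # []
print(len(check_cpus_attrs({"GLIDEIN_CPUS": "4", "GLIDEIN_ESTIMATED_CPUS": "8"})))  # 1
```

Returning a list of messages rather than raising lets a caller decide whether a conflict is a hard configuration error or only a warning.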