Feature #16161

Estimate the cores provided to glideins running on an entry

Added by Marco Mambelli over 2 years ago. Updated about 1 year ago.

Status: Closed
Priority: Normal
Category: -
Target version:
Start date: 04/11/2017
Due date:
% Done: 0%
Estimated time:
Stakeholders: Factory Ops, CMS
Duration:
Description

A Factory admin can statically set the number of cores assumed to be provided to an entry by using GLIDEIN_CPUS.
GLIDEIN_CPUS also accepts the keywords auto/slot/node to autodiscover the cores received.
This works relatively well once the glidein is on the node: HTCondor is configured to use the cores received.

In the Frontend, before a glidein starts on the entry, we don't know how many cores it will receive, so we assume 1 (in getGlideinCpusNum).

This causes some problems:
- Frontend will not start glideins for multicore job requests on auto/slot/node entries
- Frontend will overestimate the number of glideins to request, causing long adjustments and wasted resources (the more cores per glidein, the bigger the waste)
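
To make the two problems concrete, here is a small arithmetic sketch. The helper names (job_fits, glideins_to_request) and the numbers are made up for illustration; this is not the Frontend code, only the reasoning behind the 1-core assumption.

```python
import math


def job_fits(request_cpus, cores_per_glidein):
    """Hypothetical check: can a job run on a glidein with this many cores?"""
    return request_cpus <= cores_per_glidein


def glideins_to_request(cores_needed, cores_per_glidein):
    """Hypothetical helper: glideins needed to cover the requested cores."""
    return math.ceil(cores_needed / cores_per_glidein)


# Problem 1: a 4-core job never matches an entry assumed to provide 1 core,
# even if glideins on that entry would really receive 8 cores.
print(job_fits(4, 1), job_fits(4, 8))

# Problem 2: 64 idle single-core jobs, glideins really receive 8 cores each.
assumed = glideins_to_request(64, 1)  # with the 1-core assumption
needed = glideins_to_request(64, 8)   # with the real core count
print(assumed, needed)
```

With these illustrative numbers the request is 8 times larger than needed, and the ratio grows with the number of cores each glidein really receives.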

Possible solutions would estimate the cores that will be received, e.g. by:
- parsing expressions in the glidein submission
- letting admins add an estimated_glidein_cpus
- maintaining stats for the entry (max, min, avg # of cores received)
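
A rough sketch of how the last two options could be combined. Everything here is hypothetical (EntryCoreStats, estimate_cores, the fallback order, and treating an unset GLIDEIN_CPUS as 1 are assumptions for illustration, not existing glideinWMS code): a static GLIDEIN_CPUS wins, then an admin-provided estimate, then the average observed on the entry, and finally the current default of 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional

AUTO_KEYWORDS = {"auto", "slot", "node"}


@dataclass
class EntryCoreStats:
    """Hypothetical per-entry record of cores received by past glideins."""
    samples: List[int] = field(default_factory=list)

    def record(self, cores: int) -> None:
        self.samples.append(cores)

    def summary(self) -> Optional[dict]:
        if not self.samples:
            return None
        return {
            "min": min(self.samples),
            "max": max(self.samples),
            "avg": sum(self.samples) / len(self.samples),
        }


def estimate_cores(glidein_cpus: Optional[str],
                   estimated_cpus: Optional[int] = None,
                   stats: Optional[EntryCoreStats] = None) -> int:
    """Estimate the cores a glidein will receive before it starts."""
    if glidein_cpus is None:
        return 1  # assumption: unset behaves like the current default of 1
    if glidein_cpus not in AUTO_KEYWORDS:
        return int(glidein_cpus)  # static value set by the Factory admin
    if estimated_cpus is not None:
        return estimated_cpus  # admin-provided estimate
    summary = stats.summary() if stats is not None else None
    if summary is not None:
        return max(1, round(summary["avg"]))  # observed average
    return 1  # nothing known: same fallback as today


stats = EntryCoreStats()
stats.record(4)
stats.record(8)
print(estimate_cores("auto", stats=stats))
```

Using the average is only one choice; min would be more conservative for matching multicore jobs, max for limiting the number of glideins requested.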


Related issues

Blocks glideinWMS - Feature #19946: Factory Operations suggestions summary (New, 2018-05-14)

History

#1 Updated by Marco Mambelli over 1 year ago

  • Target version changed from v3_x to v3_2_23
  • Stakeholders updated (diff)

#2 Updated by Marco Mambelli over 1 year ago

  • Assignee set to Marco Mambelli

#3 Updated by Marco Mambelli about 1 year ago

  • Target version changed from v3_2_23 to v3_4_0

#4 Updated by Marco Mambelli about 1 year ago

  • Status changed from New to Work in progress

Changes are in v34/16161. They should be complete and ready for testing and feedback.

#5 Updated by Marco Mambelli about 1 year ago

#6 Updated by Lorena Lobato Pardavila about 1 year ago

  • Status changed from Work in progress to Feedback
  • Assignee changed from Marco Mambelli to Lorena Lobato Pardavila

#7 Updated by Marco Mambelli about 1 year ago

There is currently no way to verify whether multiple attributes in the attrs section conflict with each other, so:
- the documentation states: "It is a configuration error to set GLIDEIN_CPUS to a number>0 and set GLIDEIN_ESTIMATED_CPUS"
- but nothing returns an error if someone actually does it
- GLIDEIN_ESTIMATED_CPUS is simply ignored in that case

Added a TODO note in factoryXmlConfig.py:
  # TODO: e.g. GLIDEIN_ESTIMATED_CPUS should not be set if GLIDEIN_CPUS is not set or GLIDEIN_CPUS > 0
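
For reference, the check described in the TODO could look roughly like the sketch below. It is standalone and hypothetical: check_cpu_attrs and the plain dict of attribute values are assumptions for illustration and do not reflect how factoryXmlConfig.py represents the attrs section.

```python
def check_cpu_attrs(attrs):
    """Return a list of problems for conflicting CPU attributes.

    attrs is a hypothetical dict of attribute name -> string value.
    """
    problems = []
    if "GLIDEIN_ESTIMATED_CPUS" not in attrs:
        return problems  # nothing to cross-check

    cpus = attrs.get("GLIDEIN_CPUS")
    if cpus is None:
        problems.append(
            "GLIDEIN_ESTIMATED_CPUS is set but GLIDEIN_CPUS is not set")
        return problems

    try:
        n = int(cpus)
    except ValueError:
        n = None  # a keyword such as auto/slot/node
    if n is not None and n > 0:
        problems.append(
            "GLIDEIN_ESTIMATED_CPUS is set but GLIDEIN_CPUS is a number > 0")
    return problems


print(check_cpu_attrs({"GLIDEIN_CPUS": "4", "GLIDEIN_ESTIMATED_CPUS": "8"}))
```

Returning a list of messages instead of raising leaves it to the caller to decide whether a conflict is a warning or a configuration error.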

#8 Updated by Marco Mambelli about 1 year ago

  • Status changed from Feedback to Resolved
  • Assignee changed from Lorena Lobato Pardavila to Marco Mambelli

#9 Updated by Marco Mambelli about 1 year ago

  • Status changed from Resolved to Closed

