Feature #11398

CMS would like to keep minimum idle glideins on a highly available site at all times irrespective of jobs in the queue

Added by Parag Mhashilkar almost 4 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Category: -
Target version:
Start date: 01/11/2016
Due date:
% Done: 0%
Estimated time:
Stakeholders: CMS
Duration:

Description

For certain large sites, CMS would like to keep idle glideins available at all times so that new jobs can start running right away. Investigate whether this can be done using relative_to_queue; if not, find or implement a mechanism to achieve it.

History

#1 Updated by Parag Mhashilkar almost 4 years ago

  • Target version set to v3_2_13

#2 Updated by Parag Mhashilkar almost 4 years ago

  • Assignee changed from Parag Mhashilkar to Marco Mascheroni
  • Target version changed from v3_2_13 to v3_2_14

#3 Updated by Parag Mhashilkar over 3 years ago

  • Target version changed from v3_2_14 to v3_2_15

#4 Updated by Marco Mascheroni over 3 years ago

  • Assignee changed from Marco Mascheroni to Parag Mhashilkar

#5 Updated by Marco Mascheroni over 3 years ago

  • Status changed from New to Feedback

#6 Updated by Parag Mhashilkar over 3 years ago

  • Assignee changed from Parag Mhashilkar to Marco Mascheroni

You are correctly changing compute_glidein_min_idle() and compute_glidein_max_running(). However, the changes won't give you the results you want.

  • You are computing min_idle based on <idle_glideins_per_entry max="100" reserve="5" />, whereas what you really want is to consider only the reserve from <running_glideins_per_entry max="2000" relative_to_queue="1.15" reserve="10"/>.
  • If someone accidentally fat-fingers reserve in either of the above tags to be a really big number, you could suddenly have the frontend requesting a large number of glideins without any limits being applied. You may want to treat running-reserve as if that many idle jobs were in the queue.
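
To make the first point concrete, here is a minimal sketch of reading the reserve only from running_glideins_per_entry and capping it at that tag's max, so a mistyped value cannot run away. The <config> wrapper element and the running_reserve() helper are hypothetical names used for illustration; this is not the frontend's actual parsing code.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element; only the two inner tags come from the comment above.
CONFIG = """
<config>
  <idle_glideins_per_entry max="100" reserve="5" />
  <running_glideins_per_entry max="2000" relative_to_queue="1.15" reserve="10"/>
</config>
"""


def running_reserve(xml_text):
    """Return the reserve from running_glideins_per_entry, capped at its max."""
    root = ET.fromstring(xml_text)
    running = root.find("running_glideins_per_entry")
    reserve = int(running.get("reserve", "0"))
    max_running = int(running.get("max"))
    # Cap the value so a very large, mistyped reserve cannot exceed the maximum.
    return max(0, min(reserve, max_running))


print(running_reserve(CONFIG))  # 10
```

The reserve on idle_glideins_per_entry is deliberately ignored here, in line with the suggestion above.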

I am thinking along the following lines:

  • If count_status[Total] or count_status[idle] (or maybe effective_idle? This needs more thought) is less than running-reserve, adjust these counts and use the adjusted values from then on. This way you will always go through the process of applying thresholds.
  • self.reserve_idle has different semantics and should not be used. What you want is for glideins to be running at the site, not just sitting there and being counted toward the idle count.
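
One possible reading of the adjustment described above, as a rough sketch only. The dictionary keys ("Idle", "Total"), the adjust_counts_for_reserve() and capped_request() helpers, and the simplified threshold formula are all assumptions made for illustration; the real compute_glidein_max_running() logic is more involved.

```python
def adjust_counts_for_reserve(count_status, running_reserve):
    """Return a copy of count_status in which the queue looks as if it held
    at least running_reserve idle jobs. The input dict is not modified."""
    adjusted = dict(count_status)
    idle = adjusted.get("Idle", 0)
    if idle < running_reserve:
        shortfall = running_reserve - idle
        adjusted["Idle"] = running_reserve
        adjusted["Total"] = adjusted.get("Total", 0) + shortfall
    return adjusted


def capped_request(count_status, relative_to_queue, max_running):
    """Simplified stand-in for the threshold step: scale by relative_to_queue,
    then never exceed max_running."""
    return min(int(count_status["Total"] * relative_to_queue), max_running)


empty_queue = {"Idle": 0, "Total": 0}
adjusted = adjust_counts_for_reserve(empty_queue, 10)
print(adjusted)                              # {'Idle': 10, 'Total': 10}
print(capped_request(adjusted, 1.15, 2000))  # 11

# A fat-fingered reserve still passes through the max threshold.
huge = adjust_counts_for_reserve(empty_queue, 10**6)
print(capped_request(huge, 1.15, 2000))      # 2000
```

Because the reserve only changes the inputs, the usual limits are still applied afterwards, which is the point of adjusting the counts rather than bypassing the threshold logic.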

#7 Updated by Marco Mascheroni over 3 years ago

  • Assignee changed from Marco Mascheroni to Parag Mhashilkar

#8 Updated by Parag Mhashilkar over 3 years ago

  • Status changed from Feedback to Resolved
  • Assignee changed from Parag Mhashilkar to Marco Mascheroni

#9 Updated by Parag Mhashilkar over 3 years ago

  • Status changed from Resolved to Closed

