Idea #9884

Improve calculation of max requested running

Added by Brian Bockelman over 4 years ago. Updated about 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Parag Mhashilkar
Category:
-
Target version:
Start date:
08/25/2015
Due date:
% Done:

0%

Estimated time:
Stakeholders:

CMS

Duration:

Description

Hi,

CMS is seeing a lot of idle glideins when there are very small bursts of short-lived jobs.

Currently, the (approximate) algorithm for max requested running is:

max requested running = (# proportional job-cores idle / # cores in entry) + (# of running glideins)

I'm thinking along the lines of:

max requested running = max( (# proportional job-cores idle / # cores in entry) - # idle glideins, 0) + (# of running glideins)

If all is going well and there are more idle jobs than idle glideins, then in the next round the # of idle glideins will go to zero and the tweaked expression becomes equivalent to the original one.

However, if the matching expression is wrong (meaning the idle glideins never become running) or the idle jobs all prefer another site (perhaps due to a RANK expression), then this makes the max requested running more conservative.
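To make the difference between the two expressions concrete, here is a minimal Python sketch. It is not the glideinWMS frontend code: the function names, the parameter names, and the choice to round the job-cores ratio up are all assumptions made for illustration.

```python
import math


def current_max_running(idle_job_cores, cores_per_entry, running_glideins):
    """Approximate current rule: glideins needed for idle jobs plus running glideins."""
    # Rounding up is an assumption; the ticket only gives the ratio.
    needed = math.ceil(idle_job_cores / cores_per_entry)
    return needed + running_glideins


def proposed_max_running(idle_job_cores, cores_per_entry,
                         running_glideins, idle_glideins):
    """Proposed rule: discount glideins that are already idle, never below zero."""
    needed = math.ceil(idle_job_cores / cores_per_entry)
    return max(needed - idle_glideins, 0) + running_glideins


# Hypothetical numbers: 16 idle job-cores, 8-core entry, 10 running glideins.
# With 5 glideins already idle, the proposed rule asks for no extra glideins.
print(current_max_running(16, 8, 10))       # 2 + 10 = 12
print(proposed_max_running(16, 8, 10, 5))   # max(2 - 5, 0) + 10 = 10
# With no idle glideins, both rules agree.
print(proposed_max_running(16, 8, 10, 0))   # 2 + 10 = 12
```

With these made-up inputs, the proposed rule only lowers the request while idle glideins exist. Once they are matched or gone, it gives the same answer as the current rule.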

So, this mostly would help in non-steady-state phase spaces -- which, unfortunately, is where CMS lives.

Brian

History

#1 Updated by Parag Mhashilkar over 4 years ago

  • Assignee set to Parag Mhashilkar
  • Target version set to v3_2_12

#2 Updated by Parag Mhashilkar over 4 years ago

My only concern with this change is that it will slow down the ramp-up time for bigger bursts of non-small jobs.

#3 Updated by Parag Mhashilkar over 4 years ago

  • Status changed from New to Feedback
  • Assignee changed from Parag Mhashilkar to Marco Mambelli

Changes are in v3/9884

#4 Updated by Marco Mambelli about 4 years ago

  • Assignee changed from Marco Mambelli to Parag Mhashilkar

ready to be merged

#5 Updated by Parag Mhashilkar about 4 years ago

  • Status changed from Feedback to Resolved

Merged to branch_v3_2 and master

#6 Updated by Parag Mhashilkar about 4 years ago

  • Status changed from Resolved to Closed

