Feature #13069

Balancing glidein pressure to sites that are aliases or Meta-Sites

Added by Marco Mambelli almost 3 years ago. Updated over 1 year ago.

Status:
Closed
Priority:
High
Category:
Factory
Target version:
Start date:
06/29/2016
Due date:
% Done:

0%

Estimated time:
Stakeholders:

CMS

Duration:

Description

The pressure to a site is difficult to balance when the setup is not one-to-one, that is, one site with one gatekeeper (CE) in front of one cluster.

In actual sites there can be:
- multiple gatekeepers in front of a single cluster, each with its own entry in the factory configuration
- a single alias and a single entry in the factory configuration that refers to multiple gatekeepers (via DNS or load balancer) that give access to the same cluster
- a single alias and a single entry in the factory configuration that refers to multiple gatekeepers (via DNS or load balancer) that give access to distinct clusters

Here I use the term cluster to refer to a set of worker nodes in the queue of a batch system (aka job manager or local resource manager). This is a bit of a simplification since some worker nodes may actually be shared across multiple queues (that compete for them) or may be acquired dynamically, e.g. via cloud submission or other elastic systems.

Factory administrators prefer to have a reduced number of entries:
- to shorten configuration
- to avoid having to keep multiple entries consistent

It is important to maintain the desired pressure to the site regardless of the configuration.
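
As an illustration of the first case above (several gatekeepers in front of one cluster, each with its own entry), here is a minimal sketch of how a site-wide glidein target could be divided so that the total pressure stays the same however many entries exist. The function name `split_pressure`, the even-split policy, and the host names are assumptions made for this example; this is not the GlideinWMS implementation.

```python
from typing import Dict, List


def split_pressure(total_glideins: int, gatekeepers: List[str]) -> Dict[str, int]:
    """Divide a site-wide glidein target evenly among gatekeepers.

    Hypothetical helper, not GlideinWMS code. The remainder is handed out
    one glidein at a time, so the shares always sum to the target.
    """
    if total_glideins < 0:
        raise ValueError("total_glideins must be non-negative")
    if not gatekeepers:
        raise ValueError("at least one gatekeeper is required")
    base, extra = divmod(total_glideins, len(gatekeepers))
    return {
        name: base + (1 if index < extra else 0)
        for index, name in enumerate(gatekeepers)
    }


# Placeholder host names: three gatekeepers in front of the same cluster.
shares = split_pressure(
    100, ["ce1.example.org", "ce2.example.org", "ce3.example.org"]
)
print(shares)
```

With this policy the sum of the per-gatekeeper shares equals the site target, so adding or removing an entry for the same cluster does not change the overall pressure.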

History

#1 Updated by Parag Mhashilkar almost 3 years ago

  • Subject changed from Balancing glidein pressure to sites with non trivial architectures to Balancing glidein pressure to sites that are aliases or Meta-Sites

#2 Updated by Parag Mhashilkar almost 3 years ago

  • Assignee set to Parag Mhashilkar
  • Target version changed from v3_2_x to v3_2_16

#3 Updated by Parag Mhashilkar almost 3 years ago

  • Priority changed from Normal to High

#4 Updated by Parag Mhashilkar over 2 years ago

  • Target version changed from v3_2_16 to v3_2_17

#5 Updated by Parag Mhashilkar over 2 years ago

  • Assignee changed from Parag Mhashilkar to Marco Mambelli

#6 Updated by Parag Mhashilkar over 2 years ago

  • Target version changed from v3_2_17 to v3_2_18

#7 Updated by Marco Mambelli over 2 years ago

  • Target version changed from v3_2_18 to v3_2_19

#8 Updated by Parag Mhashilkar about 2 years ago

  • Assignee changed from Marco Mambelli to Marco Mascheroni

#9 Updated by Marco Mambelli about 2 years ago

  • Target version changed from v3_2_19 to v3_2_20

#10 Updated by Parag Mhashilkar almost 2 years ago

  • Target version changed from v3_2_20 to v3_2_21

#11 Updated by Marco Mascheroni over 1 year ago

I pushed a first version of this today.

Here is the feedback I got after a chat with Marco Mambelli:

  • Add the possibility of setting limits for each entry in the entryset (currently entry could only be empty)
  • Improve how load is divided between two different gatekeepers of the same entry set
  • Verify that configuration generation works with merged files as used by ops
  • My comment: move auth_method and trust_domain to the entry_set level

#12 Updated by Marco Mascheroni over 1 year ago

Today at the GWMS meeting I presented my first implementation (https://docs.google.com/presentation/d/140DOV4E_5VmJyxns_tfcPGYmzohy4pyKMFfI4ASt-j4/edit?usp=sharing)

There were no major comments besides the need for a monitoring breakdown per entry (currently it is per entry_set). I think this requires some work, though, so I would like to get a first version out ASAP, since Jeff kindly volunteered to try it out on a test factory at UCSD once we have something to try. Also, the possibility of setting limits for each entry in the entryset is something we can add later.

Since I just finished improving how load is divided between two different gatekeepers of the same entry set, I would like to do a full validation, and then check out the configuration merge. Then I think we can include this first version in the next release.

#13 Updated by Marco Mambelli over 1 year ago

  • Status changed from New to Resolved

I checked the changes; all suggestions were implemented. I merged with branch_v3_2.

#14 Updated by Parag Mhashilkar over 1 year ago

  • Status changed from Resolved to Closed

