Feature #3422

Add downtime management of the Frontend

Added by Igor Sfiligoi almost 7 years ago. Updated over 3 years ago.

Status: Closed
Priority: Low
Assignee:
Category: Frontend
Target version:
Start date: 02/04/2013
Due date:
% Done: 100%
Estimated time:
Stakeholders:
Duration:

Description

The Frontend does not have any downtime mechanism.

At the very least, we should add a downtime for the full FE, with the semantics of continuing to monitor the system and re-delegate proxies while no longer requesting new glideins.

This is needed, e.g., when one has to drain the glidein pool for maintenance.


Related issues

Related to GlideinWMS - Feature #3423: Frontend should be able to temporarily blacklist factory entries (Status: New, Start date: 02/04/2013)

History

#1 Updated by Burt Holzman almost 7 years ago

  • Assignee set to Burt Holzman

#2 Updated by Burt Holzman almost 7 years ago

  • Target version set to v3_1

#3 Updated by Burt Holzman almost 7 years ago

  • Target version changed from v3_1 to v3_x

#4 Updated by Parag Mhashilkar almost 5 years ago

  • Assignee changed from Burt Holzman to HyunWoo Kim

#5 Updated by Parag Mhashilkar over 4 years ago

  • Target version changed from v3_x to v3_2_12

#6 Updated by HyunWoo Kim over 4 years ago

  • % Done changed from 0 to 10

I am trying to understand the scripts that are used to start and stop the Frontend and the Factory.
I will soon start reading manageFactoryDowntimes.py in the Factory, and I will base the Frontend version of the downtime management code on it.

#7 Updated by Parag Mhashilkar over 4 years ago

  • Priority changed from Normal to Low

#8 Updated by HyunWoo Kim about 4 years ago

  • % Done changed from 10 to 90

I have added 2 new files:
- manageFrontendDowntimes.py: an executable called by /etc/init.d/gwms-frontend. It mainly handles the downtime text file.
- glideinFrontendDowntimeLib: a library used by the above file and also by glideinFrontendElement.py.

I have also modified 4 files:
- /etc/init.d/gwms-frontend
- cvWParams.py: to load a new variable for downtime
- cvWParamsDict.py: to define a new variable
- glideinFrontendElement.py: the iterate_one method in this file checks whether the Frontend is in downtime right before it advertises the glideclient classad to the Factory
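
For illustration only, the check described in the last item could look roughly like the sketch below. This is not the code committed to v3/3422. The file format (one "start end" pair of epoch seconds per line, with "None" for an open-ended period) and the names DowntimeFile, in_downtime and iterate_one_sketch are assumptions made for this example.

```python
import time


class DowntimeFile:
    """Hypothetical reader for a plain-text downtime file.

    Assumed format (not taken from this ticket): one period per line,
    "<start_epoch> <end_epoch>", where the end may be "None" for an
    open-ended downtime. Lines starting with '#' are comments.
    """

    def __init__(self, path):
        self.path = path

    def read_periods(self):
        """Return a list of (start, end) tuples; end is None if open-ended."""
        periods = []
        try:
            with open(self.path) as fd:
                for line in fd:
                    line = line.strip()
                    if not line or line.startswith("#"):
                        continue
                    fields = line.split()
                    if len(fields) < 2:
                        continue  # skip malformed lines
                    start = float(fields[0])
                    end = None if fields[1] == "None" else float(fields[1])
                    periods.append((start, end))
        except OSError:
            # Assumption: a missing or unreadable file means no downtime.
            pass
        return periods

    def in_downtime(self, now=None):
        """True if 'now' (epoch seconds) falls inside any listed period."""
        if now is None:
            now = time.time()
        for start, end in self.read_periods():
            if start <= now and (end is None or now < end):
                return True
        return False


def iterate_one_sketch(downtimes, advertise, now=None):
    """Skip only the glidein request while in downtime.

    'advertise' stands in for whatever publishes the glideclient classad;
    monitoring and proxy re-delegation would continue outside this check.
    Returns True if the request was advertised, False if it was skipped.
    """
    if downtimes.in_downtime(now):
        return False
    advertise()
    return True
```

Placing the check immediately before the advertise step means the rest of the iteration is unaffected, which matches the semantics requested in the description: keep monitoring, stop requesting new glideins.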

I have tested this new feature in my test instances of Frontend and Factory.

I have committed the new files and changes to v3/3422 branch

I will go through one final check before I put this ticket under peer review.

#9 Updated by Parag Mhashilkar about 4 years ago

  • Target version changed from v3_2_12 to v3_2_13

#10 Updated by Parag Mhashilkar almost 4 years ago

  • Target version changed from v3_2_13 to v3_2_14

#11 Updated by HyunWoo Kim almost 4 years ago

  • Status changed from New to Feedback
  • Assignee changed from HyunWoo Kim to Marco Mambelli

#12 Updated by Marco Mambelli over 3 years ago

  • Assignee changed from Marco Mambelli to HyunWoo Kim

Feedback emailed on 5/12

#13 Updated by HyunWoo Kim over 3 years ago

Today, I addressed Marco's review comments,
tested the code in my own Frontend again, and confirmed that everything is working.

I will push to the remote.

One thing to note is that I need to talk with Marco tomorrow morning
about how to update the packaging process to copy
/usr/lib/python2.6/site-packages/glideinwms/frontend/manageFrontendDowntimes.py
to /usr/sbin/.
The file manageFrontendDowntimes.py is new and is needed by the /etc/init.d/gwms-frontend script.

Once this issue is resolved tomorrow, I will merge the v3/3422 branch into branch_v3_2.

#14 Updated by HyunWoo Kim over 3 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 90 to 100

Talked with Marco this morning.
Marco has modified the RPM spec file to copy /usr/lib/python2.6/site-packages/glideinwms/frontend/manageFrontendDowntimes.py
to /usr/sbin/.
I updated the Frontend install document to cover this new downtime feature.

Finally, I merged v3/3422 into branch_v3_2
and pushed branch_v3_2 to the remote.

#15 Updated by Parag Mhashilkar over 3 years ago

  • Status changed from Resolved to Closed

