Feature #18148

Enable proxy extension for Glideins

Added by Marco Mambelli about 2 years ago. Updated almost 2 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
11/08/2017
Due date:
% Done:

0%

Estimated time:
Stakeholders:
Duration:

Description

Glideins currently evaluate their lifetime and timeouts once, at the beginning, taking into consideration only the lifetime of the proxy available at startup.
When a site reduces the lifetime of the proxy, this reduces the lifetime of the Glidein.

There is a desire to decouple the lifetime of the Glidein from the lifetime of the proxy.

Dear Parag, Marco,
Could you please check discussion on the latest updates in this ticket, starting from update #7?

https://ggus.eu/index.php?mode=ticket_info&ticket_id=130585#update#7

Pilots are getting max of 24h lifetime in some sites due to the local condor config which does not allow proxies longer than that. However, in this case they do allow credentials being refreshed by 6h periods (DELEGATE_JOB_GSI_CREDENTIALS_REFRESH).

However, this is of no use for our pilots, given that ToRetire and ToDie timestamps are only calculated once, at the pilot startup, AFAIU.

Site admin claims shortening proxies then renewing them is a sound procedure, long ago introduced to HTCondor, that glideinWMS should take into consideration (not using this capability is even regarded as "a bug" by this particular admin). 

What's your opinion on this?
Cheers, Antonio.

Here is a comment from Brian

Hi,
My opinion wasn't requested, but I'll happily provide it --

The current gWMS implementation reflects the fact that, several years back, the credential renewal / refresh infrastructure largely (a) did not exist and (b) what existed rarely worked.  There wasn't much benefit to doing this "properly", taking into account refreshes.

Now that credential refreshes mostly work*, it would be worthwhile to have this work throughout the gWMS implementation.

Brian

* In fact, sitting in the pre-GDB authorization discussion today, there's strong interest to transition to authz models where the credential lifetime is more O(1 hour) than O(72 hours) we see today.  This reflects how the rest of the world does things.
and
But what Antonio proposed should be fine regardless of whether credential renewal works.  We just want the to-die/retire dates to update *if* the credential is renewed, instead of being static at pilot startup.
Something like:

ToDie = min(credential expiration, startup + 48 hours)

Right now, that is evaluated and added as an integer. It should be an expression.
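
To illustrate the difference between the static integer and an expression, here is a minimal Python sketch of the formula above. It is not the gWMS implementation: the function name `to_die`, the epoch-seconds representation, and the sample timestamps are assumptions made for this example; only the min(...) rule and the 48-hour cap come from the comment above.

```python
# Hypothetical sketch: names and sample values are assumptions, not gWMS code.
MAX_LIFETIME_S = 48 * 3600  # the "startup + 48 hours" cap from the formula


def to_die(credential_expiration: float, startup: float,
           max_lifetime_s: int = MAX_LIFETIME_S) -> float:
    """ToDie = min(credential expiration, startup + max lifetime), in epoch seconds."""
    return min(credential_expiration, startup + max_lifetime_s)


startup = 1_510_000_000.0  # arbitrary example startup time

# Static behavior: evaluated once, with the 24 h proxy seen at startup.
static_to_die = to_die(startup + 24 * 3600, startup)

# Dynamic behavior: the same expression re-evaluated after a renewal
# moved the credential expiration to 30 h after startup.
renewed_to_die = to_die(startup + 30 * 3600, startup)

# A credential that outlives the cap is limited by startup + 48 h.
capped_to_die = to_die(startup + 60 * 3600, startup)
```

Re-evaluating the expression whenever the credential changes lets ToDie follow a renewal, while the 48-hour cap still bounds the Glidein lifetime.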

History

#1 Updated by Marco Mambelli about 2 years ago

From a discussion at the 11/8 GWMS meeting, additional points have been brought up.

HTCondor allows proxy renewals. We need to make sure that all supported gatekeeper/batch-system combinations support them as well (or have the Glidein behave differently depending on the system).

A new model of the Glidein behavior should be defined:
- when does a renewed proxy arrive too late, so that the Glidein continues with its shutdown?
- should the Glidein accept jobs asking for a time longer than its current lifetime, counting on the proxy being renewed?
- should VOs submit only shorter jobs?
- should there be jobs that adapt to the time available (a notification from the Glidein triggers a shutdown in the following N minutes)?

There were suggestions to make the lifetime calculation a dynamic expression that includes the proxy lifetime.
There were also suggestions to make the lifetime calculation independent of the proxy lifetime, depending only on the max walltime declared by the system.

#2 Updated by Marco Mambelli almost 2 years ago

Adding some comments from the GGUS ticket:
The operators asked a sysadmin to remove the limitation, either by setting "DELEGATE_JOB_GSI_CREDENTIALS = False" or by keeping it True and extending the lifetime via DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME, according to the site policy.
He suggests keeping the policy in place and modifying the framework behavior instead:

Hi,

we are using the default for the HTCondor settings DELEGATE_JOB_GSI_CREDENTIALS, DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME, and DELEGATE_JOB_GSI_CREDENTIALS_REFRESH.
While default DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME should indeed limit the proxy lifetime to 24h, the default DELEGATE_JOB_GSI_CREDENTIALS_REFRESH should have it renewed every 6h. See also the note on SHADOW_CHECKPROXY_INTERVAL regarding these parameters. Any delegated proxy created by HTCondor should be valid for at least 18h in our system.
Judging from this, the DELEGATE_JOB_GSI_CREDENTIALS* settings should *not* limit the proxy validity for practical purposes. Do you actually see any negative side-effects?

Cheers,
Max 
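
The 24h / 6h / 18h figures in the message above can be reproduced with a small calculation. This sketch assumes that DELEGATE_JOB_GSI_CREDENTIALS_REFRESH is expressed as a fraction of the delegated proxy lifetime (0.25 would give the 6h quoted above for a 24h lifetime); the function name and default arguments are hypothetical and should be checked against the HTCondor manual for the version in use.

```python
# Hypothetical helper: the defaults below are assumptions chosen to match
# the 24 h lifetime and 6 h refresh period quoted in the ticket.
def delegated_proxy_window(lifetime_s: int = 24 * 3600,
                           refresh_fraction: float = 0.25):
    """Return (refresh interval, minimum remaining validity) in seconds."""
    refresh_interval_s = lifetime_s * refresh_fraction
    # Just before a refresh, the delegated proxy has been in use for one
    # refresh interval, so this much validity is still left.
    min_remaining_s = lifetime_s - refresh_interval_s
    return refresh_interval_s, min_remaining_s


refresh_h, remaining_h = (v / 3600 for v in delegated_proxy_window())
```

With these assumed values the refresh interval is 6 hours and the delegated proxy never has less than 18 hours left, provided the refresh actually happens.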

HTCondor does not trigger the renewal automatically; it does so only if the credential on the submitter changes. E.g., a 72-hour proxy is not re-sent after 24 hours (HTCondor does not expect systems downstream to limit its lifetime), but if a new proxy arrives, it is forwarded downstream, so even touching the file will trigger the renewal and keep the system running.
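
As a sketch of the "touching the file" workaround mentioned above, the following Python helper updates the modification time of a proxy file without changing its content. The function name and the idea of running it periodically on the submitter are assumptions for illustration; whether a changed modification time alone is enough to trigger the forwarding is taken from the comment above and has not been verified here.

```python
import os
import time
from pathlib import Path


def touch_proxy(proxy_path: str) -> float:
    """Set the proxy file's access/modification time to now; return the new mtime.

    Hypothetical helper: it only updates timestamps, it does not renew
    or extend the credential itself.
    """
    path = Path(proxy_path)
    if not path.is_file():
        raise FileNotFoundError(f"proxy file not found: {proxy_path}")
    now = time.time()
    os.utime(path, (now, now))
    return path.stat().st_mtime
```

This would have to run more often than the downstream proxy lifetime (for example from cron) for the refresh to arrive before the delegated proxy expires.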


