Bug #4587

OSG: Automate / fix HTCondor-CE issues

Added by Brian Bockelman almost 6 years ago. Updated 12 months ago.

Status: Assigned
Priority: Normal
Assignee:
Category: -
Target version:
Start date: 08/26/2013
Due date:
% Done: 0%
Estimated time:
First Occurred:
Occurs In:
Stakeholders: OSG
Duration:

Description

There are two issues we discovered when validating 3_2_X for HTCondor-CE support.

First, we need to map currently-unmapped hostcerts by default for the factory. Here's what we do locally in the mapfile as a catch-all:

GSI "^\/DC\=org\/DC\=doegrids\/OU\=Services\/CN\=(host\/)?([A-Za-z0-9.\-]*)$" \
GSI "^\/DC\=com\/DC\=DigiCert-Grid\/O=Open Science Grid\/OU\=Services\/CN\=(host\/)?([A-Za-z0-9.\-]*)$" \
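To make concrete what these two catch-all patterns accept, here is a minimal Python sketch that runs them against a few certificate DNs. It only exercises the regex column of the mapfile lines; the DNs and the helper name `matched_hostname` are hypothetical, and it assumes Python's `re` module treats these particular patterns the same way the regex engine used by HTCondor does.

```python
import re

# Regex column of the two catch-all mapfile lines above, copied verbatim.
CATCH_ALL_PATTERNS = [
    r"^\/DC\=org\/DC\=doegrids\/OU\=Services\/CN\=(host\/)?([A-Za-z0-9.\-]*)$",
    r"^\/DC\=com\/DC\=DigiCert-Grid\/O=Open Science Grid\/OU\=Services\/CN\=(host\/)?([A-Za-z0-9.\-]*)$",
]


def matched_hostname(dn):
    """Return the hostname captured from a host-cert DN, or None if no pattern matches.

    Hypothetical helper for illustration; it is not part of HTCondor or the factory.
    """
    for pattern in CATCH_ALL_PATTERNS:
        m = re.match(pattern, dn)
        if m:
            return m.group(2)  # second capture group is the hostname part of the CN
    return None


# Made-up DNs: two service (host) certificates and one personal certificate.
host_dn_doegrids = "/DC=org/DC=doegrids/OU=Services/CN=host/ce.example.org"
host_dn_digicert = "/DC=com/DC=DigiCert-Grid/O=Open Science Grid/OU=Services/CN=ce.example.org"
user_dn = "/DC=org/DC=doegrids/OU=People/CN=Jane Doe 12345"

print(matched_hostname(host_dn_doegrids))
print(matched_hostname(host_dn_digicert))
print(matched_hostname(user_dn))
```

Both host-style DNs match, with or without the optional `host/` prefix, while the personal DN does not. The patterns accept any hostname under those two CA namespaces, which is why this is a catch-all.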

Alternately, we can remove:

DENY_CLIENT = anonymous@*

by setting:

C_GAHP_WORKER_THREAD.DENY_CLIENT =

Personally, I prefer the mapfile approach, perhaps combined with:

C_GAHP_WORKER_THREAD.ALLOW_CLIENT = @daemon.opensciencegrid.org/

The second issue is unsetting GSI_DAEMON_NAME for the c_gahp. If the C_GAHP<->C_GAHP_WORKER_THREAD communication is not based on GSI (e.g., it uses FS or something similar), we could do:

C_GAHP_WORKER_THREAD.GSI_DAEMON_NAME=

This is fairly high priority for OSG - we're releasing the HTCondor-CE on September 10 and would like this to work with the factory as soon as possible. If it's not possible to time with a factory release, it'd be OK if we start by reviewing this for manual inclusion in the configs.

History

#1 Updated by Igor Sfiligoi almost 6 years ago

Adding a catch-all rule is a very bad idea;
it will allow anyone with a host cert to perform a Man-In-The-Middle attack.

We need something that does what we need;
I guess the HOST_CHECK is good enough for most use cases (just as it was for GRAM),
but this must co-exist peacefully with the rest of the Condor security mechanisms.

The idea of having a C_GAHP-specific config seems to be going in the right direction,
even though it seems error-prone.
Still, it could be a good-enough stopgap measure until we settle on something better with the Condor team.

#2 Updated by Brian Bockelman almost 6 years ago

Adding the catch-all would also have to go along with

ALLOW_WRITE = @$(FULL_HOSTNAME)/

It seems that gWMS doesn't have very elaborate authorization -- any authenticated user is allowed to do anything?

Alternately, would this work:

C_GAHP_WORKER_THREAD.GSI_DAEMON_NAME=
C_GAHP_WORKER_THREAD.DENY_CLIENT=

? That should allow any remote server which passes the host check to work. Why is DENY_CLIENT set in the first place, does anyone recall?

My only suggestion for the HTCondor team is to make GSI_SKIP_HOST_CHECK_CERT_REGEX into a list of regexes -- this would make conversion much easier for current GSI_DAEMON_NAME users.

#3 Updated by Igor Sfiligoi almost 6 years ago

The DENY is there to prevent remote authentication to services that do not match the whitelist.
(we have no catch all)

#4 Updated by Igor Sfiligoi almost 6 years ago

As for "any authenticated user is allowed to do anything"...
yes....

However, so far, the only thing that an authenticated user could do was post a classad in the collector.
And we have an application level map/whitelist for that.

Condor-CE adds a whole new level of complexity, security wise.

#5 Updated by Brian Bockelman almost 6 years ago

So - all the more argument for isolating the changes to the C_GAHP_WORKER_THREAD subsystem, right?

#6 Updated by Igor Sfiligoi almost 6 years ago

Yep, as stated in my first post ;)

#7 Updated by Burt Holzman almost 6 years ago

  • Target version set to v3_x

#8 Updated by Burt Holzman almost 6 years ago

  • Status changed from New to Assigned
  • Assignee set to Burt Holzman
  • Target version changed from v3_x to v3_2_1

So I'd like to do a little more investigation here before we make this officially part of the install. It sounds like we've converged on blanking the settings for C_GAHP_WORKER_THREAD.{GSI_DAEMON_NAME,DENY_CLIENT}, but we should carefully check to ensure it doesn't weaken other parts of the security infrastructure.

#9 Updated by Burt Holzman almost 6 years ago

  • Target version changed from v3_2_1 to v3_2_2

#10 Updated by Burt Holzman almost 6 years ago

  • Target version changed from v3_2_2 to v3_2_3

#11 Updated by Burt Holzman over 5 years ago

  • Target version changed from v3_2_3 to v3_2_x

#12 Updated by Igor Sfiligoi over 5 years ago

For the record:
This is what is currently used on the OSG/CMS-run glidein factories:

GSI_SKIP_HOST_CHECK = False
C_GAHP_WORKER_THREAD.GSI_DAEMON_NAME=
C_GAHP_WORKER_THREAD.DENY_CLIENT=

BTW: In related news, we also had to add

DELEGATE_JOB_GSI_CREDENTIALS_LIFETIME = 345600

since by default it would have shortened it to 24h.
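As a quick sanity check on the numbers, here is a small Python sketch converting the configured lifetime into days and comparing it with the 24h default mentioned above. It assumes the value is expressed in seconds; the variable names are made up for illustration.

```python
from datetime import timedelta

# Value from the setting quoted above, assumed to be in seconds.
CONFIGURED_LIFETIME_S = 345600
# The default mentioned in the comment ("24h"), expressed in seconds.
DEFAULT_LIFETIME_S = 24 * 60 * 60

configured = timedelta(seconds=CONFIGURED_LIFETIME_S)
default = timedelta(seconds=DEFAULT_LIFETIME_S)

print(configured.days)        # whole days in the configured lifetime
print(configured // default)  # how many default lifetimes fit into it
```

Under that assumption, the configured value corresponds to 4 days, four times the default.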

#13 Updated by Parag Mhashilkar over 4 years ago

  • Stakeholders updated (diff)

#14 Updated by Marco Mambelli over 1 year ago

  • Target version changed from v3_2_x to v3_4_x

#15 Updated by Marco Mambelli 12 months ago

  • Target version changed from v3_4_x to v3_5_x

