Investigate starting HTCondor daemons on JobSub server machine as root
- JobSub server runs as user rexbatch.
- HTCondor schedd is started and runs as user rexbatch.
- User jobs are submitted using pilot proxies
- JobSub server, running as user rexbatch:
- does authentication and authorization for every request from the client
- creates voms proxy using robot certificate on behalf of the user on the JobSub server machine
- uses this voms proxy to perform authentication with the HTCondor daemons for job submission/removal/hold/release/lookup
- submitted jobs have +Owner attribute set to the NIS username in the JDF to make the job owner visible. Without this the
- Issues when HTCondor is started as root and the user accounts are available on the JobSub server machine
- Job submission and other actions succeed, but the shadow tries to perform tasks as the local user matching the job owner, even when x509 is used to submit the job. Is this a bug in HTCondor, and should it ignore the existence of the local user account?
- Since the proxy/out/err/log files are owned by rexbatch, certain tasks fail
- out/err/log are created by HTCondor as user submitting the job (rexbatch)
- The scheme works when the HTCondor daemons are started and run as rexbatch
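As an illustration of the +Owner step in the current scheme, the following is a minimal sketch of how a JDF could be given a +Owner line before submission. The helper name `add_owner_attribute` and the username pattern are assumptions made for this example; this is not the actual JobSub code.

```python
import re

# Assumed pattern for a NIS username; adjust to local policy.
_USERNAME_RE = re.compile(r"^[a-z_][a-z0-9_-]*$")


def add_owner_attribute(jdf_text, username):
    """Return jdf_text with a +Owner line placed before the first queue statement.

    Hypothetical helper: if no queue statement is found, the line is appended.
    """
    if not _USERNAME_RE.match(username):
        raise ValueError("unexpected username: %r" % username)
    owner_line = '+Owner = "%s"' % username
    lines = jdf_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower().startswith("queue"):
            lines.insert(i, owner_line)
            break
    else:
        lines.append(owner_line)
    return "\n".join(lines) + "\n"
```

Rejecting unexpected usernames up front avoids having to escape quotes inside the ClassAd string.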
New Scheme (Investigate & Implement)
- Apache/JobSub server will be started as root and run as rexbatch
- Start HTCondor as root
- HTCondor will run as user condor, which is the default
- User accounts will be available on the JobSub server machine without login access. As per one of the original requirements for the JobSub project from the Operations group, users should not be able to log in to the machine, and yet the system should function with appropriate security.
- JobSub server (with sudo privileges) interacts with HTCondor as the actual user for job submission/removal/hold/release/lookup
- JobSub server will do the authentication and authorization as it does today before giving out the job logs using jobsub_fetchlog.py
- X509 Authentication
- In the new scheme it is not clear that we will even need x509 to submit jobs. This needs to be investigated and properly thought through.
- JobSub server will be interacting with HTCondor after doing su to the user (needs sudo access).
- Sites need to accept user jobs without x509/voms proxies.
- If we can do away with x509 proxies, there will be no need for the cron job that keeps the proxies updated.
- It is not clear that this will work, given that ifdh/Storage requires a proxy for data movement.
- Can we use MyProxy to store the proxies? We still need a cron or periodic scheme to update credentials in the MyProxy server.
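The sudo-based interaction described in this scheme, where the JobSub server acts as the actual user for submission/removal/hold/release/lookup, might look roughly like the sketch below. The function name `build_condor_command`, the action names, and the username pattern are assumptions for illustration. The sketch only builds the command line and assumes rexbatch has been granted the matching sudo rights.

```python
import re

# Map of JobSub actions to standard HTCondor command-line tools.
CONDOR_TOOLS = {
    "submit": "condor_submit",
    "remove": "condor_rm",
    "hold": "condor_hold",
    "release": "condor_release",
    "lookup": "condor_q",
}

# Assumed pattern for a local username; adjust to local policy.
_USERNAME_RE = re.compile(r"^[a-z_][a-z0-9_-]*$")


def build_condor_command(username, action, args):
    """Build an argv list that runs an HTCondor tool as `username` via sudo.

    Hypothetical helper: call it only after the server has authenticated
    and authorized the request.
    """
    if not _USERNAME_RE.match(username):
        raise ValueError("unexpected username: %r" % username)
    if action not in CONDOR_TOOLS:
        raise ValueError("unsupported action: %r" % action)
    # -n: never prompt for a password; "--" ends sudo's own options.
    return ["sudo", "-n", "-u", username, "--", CONDOR_TOOLS[action]] + list(args)
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a shell string keeps user-supplied arguments out of shell interpretation.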
Implications of new scheme
The Operations group is responsible for:
- Setting up the JobSub server machine with user accounts available but with no login access
- Creating or coordinating account creation for external users who do not have a Fermi KCA account. This applies only if we decide to go with proxy-less job submission and there is a new or alternate way for the JobSub server to authenticate the client. The requirement of a proxy for storage access remains and needs to be addressed by someone outside the JobSub project.
- Providing sudo access to the user (rexbatch) running the JobSub server so that it can perform its tasks. If we decide to go with condor_root_switchboard, Operations will be responsible for creating and maintaining the privilege separation file.