Bug #5351

manage-glideins crashing when command invocation fails

Added by Marco Mambelli over 5 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Category: Ini-Installer
Target version:
Start date: 02/05/2014
Due date:
% Done: 0%
Estimated time:
First Occurred:
Occurs In:
Stakeholders:
Duration:

Description

In a running installation with the factory and wmscollector, or with the frontend and usersubmit, run the following as a non-root user:

manage-glideins --stop all --ini stupFile.ini

This will result in a stack trace when trying to stop one of the condor daemons, e.g.:

############################################################
## stop submit user(gcondor) node(fermicloud089.fnal.gov)
Commands: . /local/home/gcondor/master/usersubmit/condor.sh; /local/home/gcondor/master/usersubmit/condor stop
PROGRAM ERROR: uncaught execption. Contact support
Traceback (most recent call last):
  File "/local/home/factory/master/glideinwms/install/manage-glideins", line 458, in main
    process_request(service,gAction,inifile)
  File "/local/home/factory/master/glideinwms/install/manage-glideins", line 199, in process_request
    submit(service,action,inifile)
  File "/local/home/factory/master/glideinwms/install/manage-glideins", line 116, in submit
    run_command(service,obj.username(),obj.hostname(),cmd)
  File "/local/home/factory/master/glideinwms/install/manage-glideins", line 51, in run_command
    common.logit(stdout)
UnboundLocalError: local variable 'stdout' referenced before assignment
None    

The reason was that regular users cannot stop the condor daemons:
[factory@fermicloud089 ~]$ . /local/home/gcondor/master/usersubmit/condor.sh; /local/home/gcondor/master/usersubmit/condor stop
ERROR: GSI self authentication will fail.
Check these Condor attributes and verify ownership.
GSI_DAEMON_CERT GSI_DAEMON_KEY GSI_DAEMON_PROXY
Or you may be starting/stopping Condor as the wrong user.
You should be starting as user: root
You are trying to start as user: factory
[factory@fermicloud089 ~]$ $?
-bash: 1: command not found

My proposal:
1. Initialize stdout before the invocation to avoid the exception.
2. Possibly recover stdout from glideinwms.lib.subprocessSupport.iexe_cmd to explain the reason for the error to the user.
3. Add a check at the beginning of manage-glideins that it is being invoked as root.
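
Items 1 and 2 of the proposal can be sketched roughly as follows. This is only an illustration, not the code in manage-glideins: it uses the standard library subprocess module as a stand-in for glideinwms.lib.subprocessSupport.iexe_cmd, and the function name and return shape are assumptions made for the example.

```python
import subprocess


def run_command(cmd):
    """Run a shell command and return (exit_code, output) without raising.

    Hypothetical helper: a stand-in for the run_command in manage-glideins.
    """
    # Bind stdout before the call so it is always defined, even on failure.
    stdout = ""
    exit_code = 0
    try:
        result = subprocess.run(
            cmd,
            shell=True,
            check=True,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so error text is kept
            universal_newlines=True,
        )
        stdout = result.stdout
    except subprocess.CalledProcessError as err:
        # Recover what the command printed so the user sees the real reason.
        exit_code = err.returncode
        stdout = err.output or ""
    return exit_code, stdout
```

With this shape the caller can log the output in both the success and the failure case, so a message such as "You should be starting as user: root" reaches the user instead of an UnboundLocalError.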


Related issues

Related to GlideinWMS - Bug #4574: Possible non-critical likely problem in subprocessSupport.py (Closed, 08/23/2013)

Related to GlideinWMS - Bug #5107: Issues running reconfig_factory and reconfig_frontend directly (Closed, 12/17/2013)

History

#1 Updated by Parag Mhashilkar over 5 years ago

  • Category set to Ini-Installer
  • Target version set to v3_2_5

1 & 2 were observed by John while starting the services. This is clearly a bug in manage-glideins more than anything else. With the changes made in #5107, the reconfig commands already check whether they are run as root, so your 3. is partly taken care of. Maybe we want to extend that approach to starting the services as well. It may still be ok to stop the services as root.

#2 Updated by Marco Mambelli over 5 years ago

1. Has been fixed in 5071.

Opened branch master_5071_5351 (off master_5071)
2. Fixed by printing the data provided with the exception.

3. The correct user may depend on the installation (RPM/tarball), the service, and who started the service.
Controls have been added to the startup scripts, so it is sufficient to report the error (as in 2).

Ready for testing and review

#3 Updated by Marco Mambelli over 5 years ago

  • Status changed from New to Feedback
  • Assignee changed from Marco Mambelli to Parag Mhashilkar

Code committed and tested.
I noticed that manage-glideins was not finding the ini file if invoked from a different directory, e.g.:
$ glideinwms/install/manage-glideins --stop all --ini ini/glideinWMS-singlenode.ini.master_5071
ERROR: ini file does not exist: ini/glideinWMS-singlenode.ini.master_5071

I fixed that as well.
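
The ticket does not say what caused the lookup failure or how it was fixed. As a rough sketch only, assuming the relative path was being interpreted against a directory other than the one the command was invoked from, the ini argument could be made absolute up front. The function name and the start_dir parameter are hypothetical:

```python
import os


def resolve_ini_path(ini_arg, start_dir=None):
    """Return an absolute path for ini_arg, resolved against start_dir.

    Hypothetical helper; start_dir defaults to the current working
    directory at the time of the call.
    """
    base = start_dir if start_dir is not None else os.getcwd()
    # os.path.join keeps ini_arg unchanged if it is already absolute.
    path = os.path.abspath(os.path.join(base, os.path.expanduser(ini_arg)))
    if not os.path.isfile(path):
        raise FileNotFoundError("ini file does not exist: %s" % path)
    return path
```

Resolving the path once, before any change of working directory, means later code can open the file regardless of where it runs from.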

#4 Updated by Marco Mambelli over 5 years ago

Back-ported in v3/5071_v2.
These changes will be included with those of ticket 5071.

#5 Updated by Marco Mambelli over 5 years ago

  • Status changed from Feedback to Closed
  • Assignee changed from Parag Mhashilkar to Marco Mambelli
  • Target version changed from v3_2_5 to v3_2_4

Feedback received
See 5071 for details. Merged in branch_v3_2, ready for v3_2_4rc1 release


