
Bug #17074

JobSub ignores --user and --groups flags in favor of --constraint

Added by Bruno Coimbra almost 2 years ago. Updated over 1 year ago.

Status: Closed
Priority: Urgent
Assignee: Dennis Box
Target version: v1.2.4
Start date: 06/29/2017
% Done: 0%

Description

While testing jobsub_rm, we noticed that the --user and --group flags do not generate the proper condor constraints.

With the user sbhat being a Minerva superuser and not a Nova superuser, running "jobsub_rm --user sbhat --group nova --constraint 'JobStatus=?=5' --jobsub-server=fifebatch-dev.fnal.gov" removed all held jobs that were either under Minerva or owned by sbhat, ignoring both the --user and --group flags. It did not remove jobs that were neither owned by sbhat nor Minerva jobs.

Logs make the issue more evident:

[29/Jun/2017:10:03:25] [140215453976320:util.py:doJobAction] [user: sbhat su rexbatch] REMOVE jobs owned by sbhat with constraint(JobStatus=?=5)
[29/Jun/2017:10:03:25] [140215453976320:util.py:doJobAction] [user: rexbatch] REMOVE jobs with constraint (JobStatus=?=5&&(Jobsub_Group =?= "minerva"))
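
The two log lines show the shape of the problem: the server issues two separate remove actions rather than one, and neither action carries all three filters. The first applies the owner filter without the group, and the second applies a group filter (minerva, from sbhat's superuser role, rather than the requested nova) without the owner. A minimal sketch of folding all three filters into a single HTCondor constraint follows; build_constraint is a hypothetical helper, not the actual jobsub code, and the attribute names Owner and Jobsub_Group are taken from the log output above.

# Minimal sketch, not the actual jobsub implementation.
def build_constraint(user=None, group=None, extra=None):
    """AND together the owner, group, and user-supplied clauses."""
    clauses = []
    if user:
        clauses.append('Owner =?= "%s"' % user)
    if group:
        clauses.append('Jobsub_Group =?= "%s"' % group)
    if extra:
        clauses.append('(%s)' % extra)
    return ' && '.join(clauses) or 'true'

# For the command above, this should yield a single remove action with:
#   Owner =?= "sbhat" && Jobsub_Group =?= "nova" && (JobStatus=?=5)
print(build_constraint(user='sbhat', group='nova', extra='JobStatus=?=5'))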

History

#1 Updated by Joe Boyd almost 2 years ago

  • Priority changed from Normal to Urgent

Changing this to urgent because of the following use case, which isn't completely clear from the description above.

If you are a superuser of one experiment and specify a --user flag, jobsub ignores the --user flag and removes all jobs for that experiment.

#2 Updated by Joe Boyd almost 2 years ago

This is a problem because the whole point of the superuser functionality was to let a superuser remove or hold someone else's jobs when those jobs were causing a problem. Superusers will assume that --user limits the action to that user's jobs, but it still removes EVERYONE'S jobs.

#3 Updated by Shreyas Bhat almost 2 years ago

Here's the output from our latest tests:

-bash-4.1$ jobsub_q -G minerva --user sbhat --constraint 'JobStatus=?=5' --jobsub-server=fifebatch-dev.fnal.gov
JOBSUBJOBID                           OWNER           SUBMITTED     RUN_TIME   ST PRI SIZE CMD
12802.0@fife-jobsub-dev01.fnal.gov    sbhat           07/06 15:44   0+00:00:00 H   0   0.0 basicsleep_10m.sh_20170706_154417_3781460_0_1_wrap.sh 
12804.0@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.1@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.2@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.3@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.4@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.5@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.6@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.7@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.8@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.9@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12805.0@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.1@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.2@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.3@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.4@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.5@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.6@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.7@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.8@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 
12805.9@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:50   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_155013_3782900_0_1_wrap.sh 

21 jobs; 0 completed, 0 removed, 0 idle, 0 running, 21 held, 0 suspended

Note that we got boyd's jobs as well as sbhat's.

Doing an rm:

-bash-4.1$ jobsub_rm -G minerva --user sbhat --constraint 'JobStatus=?=5' --jobsub-server=fifebatch-dev.fnal.gov
removing jobs owned by sbhat
removing jobs with constraint=JobStatus=?=5
Schedd: fife-jobsub-dev01.fnal.gov
10 Succeeded, 0 Failed, 0 Not Found, 0 Bad Status, 0 Already Done, 0 Permission Denied

Schedd: fife-jobsub-dev02.fnal.gov
0 Succeeded, 0 Failed, 0 Not Found, 0 Bad Status, 0 Already Done, 0 Permission Denied

ERROR:
condor_rm:0:There are no jobs in the queue

-bash-4.1$ jobsub_q -G minerva --user sbhat --constraint 'JobStatus=?=5' --jobsub-server=fifebatch-dev.fnal.gov
JOBSUBJOBID                           OWNER           SUBMITTED     RUN_TIME   ST PRI SIZE CMD
12802.0@fife-jobsub-dev01.fnal.gov    sbhat           07/06 15:44   0+00:00:00 H   0   0.0 basicsleep_10m.sh_20170706_154417_3781460_0_1_wrap.sh 
12804.0@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.1@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.2@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.3@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.4@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.5@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.6@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.7@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.8@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 
12804.9@fife-jobsub-dev01.fnal.gov    boyd            07/06 15:44   0+00:00:00 H   0   0.0 sleep_job.sh_20170706_154429_3781984_0_1_wrap.sh 

11 jobs; 0 completed, 0 removed, 0 idle, 0 running, 11 held, 0 suspended

This removed boyd's minerva jobs as well as sbhat's because sbhat is a minerva group_superuser.
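
In other words, for a group superuser the remove is performed with only the group clause (the second log line in the description), and the Owner clause from --user is dropped. For comparison, a hedged sketch of the remove these flags should have produced, written against the HTCondor Python bindings; this is illustrative only, not the jobsub server code.

# Illustrative sketch; assumes the htcondor Python bindings are installed
# and a local schedd is reachable.
import htcondor

schedd = htcondor.Schedd()

# Held Minerva jobs owned by sbhat, and nothing else.
constraint = 'Owner =?= "sbhat" && Jobsub_Group =?= "minerva" && JobStatus =?= 5'

# JobAction.Remove with a constraint string removes only the matching jobs.
print(schedd.act(htcondor.JobAction.Remove, constraint))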

#4 Updated by Dennis Box almost 2 years ago

  • Assignee set to Dennis Box
  • Target version set to v1.2.4

#5 Updated by Dennis Box almost 2 years ago

  • Status changed from New to Resolved

#6 Updated by Dennis Box almost 2 years ago

Hi Bruno, I updated the SNOW request that generated this ticket but forgot to update this one. The fix is installed on fifebatch-dev. Can you test it?
Thanks,
Dennis

#7 Updated by Dennis Box over 1 year ago

  • Status changed from Resolved to Closed

