Project

General

Profile

Bug #21031

jobsub_submit allows more than one --group argument

Added by Shreyas Bhat about 1 year ago. Updated 10 months ago.

Status:
Work in progress
Priority:
Normal
Assignee:
Category:
JobSub Client
Target version:
Start date:
11/29/2018
Due date:
% Done:

100%

Estimated time:
(Total: 1.00 h)
First Occurred:
Occurs In:
Stakeholders:
Duration:

Description

In RITM0734792, the user gave jobsub_submit two --group arguments, mu2e and highpro respectively. He meant to use --group=mu2e and --subgroup=highpro. What happened, looking at the debug log (attached), was that the jobsub group passed to the jobsub server was mu2e, and --group=highpro was passed in as a server arg.

Note that you can see the user's jobsub_submit command here:

https://fifemon.fnal.gov/monitor/d/000000115/job-cluster-summary?refresh=5m&orgId=1&var-cluster=11951465&var-schedd=jobsub02.fnal.gov

On the schedd, what ended up happening was that the AccountingGroup classad was set to group_highpro.mu2epro for these jobs. Thus, any future attempts at manipulating the job (jobsub_hold, jobsub_rm, jobsub_q) that used the --group=mu2e flag failed to find this job, since (rightly), --group=mu2e passed to jobsub client commands results in -constraint 'regexp("group_mu2e.*",AccountingGroup)' getting added to the resultant condor command.

Please ensure that users cannot submit jobs with a --group server argument (i.e. we can only accept --group as a client argument). We'll need to be careful, too, to make sure that --group is allowed as an executable argument.


Subtasks

Bug #21460: Review request [commit:be986690c039b7b124d881bca92650f04d46449c: Forgot to switch split_client_server_args to look at JobsubClientOption's typed actions]ClosedShreyas Bhat

Bug #21779: Take the unique_store stuff out of release 1.2.9's codeClosedShreyas Bhat

Bug #21783: Review request [commit:d52f4555f3ed53b89e3bb53857260ca779cb599d: Took out JobsubClientOption and unique_store - will move to next release]ClosedDennis Box


Related issues

Blocked by JobSub - Necessary Maintenance #22164: Upgrade jobsub code to use argparse instead of optparseClosed03/19/2019

History

#1 Updated by Shreyas Bhat about 1 year ago

  • Assignee set to Shreyas Bhat

#2 Updated by Dennis Box about 1 year ago

  • Target version set to v1.2.9

#3 Updated by Shreyas Bhat about 1 year ago

  • Category changed from JobSub Server RPM to JobSub Client

Sketched this out. I should be able to put this in and test tomorrow.

Relevant bit here:

https://gist.github.com/shreyb/d4639b2b88103d5ca3559d95d6db94b7

Add that to relevant areas in jobsub_submit.py (parse_opts, main, etc), and it should work.

#4 Updated by Shreyas Bhat about 1 year ago

sbhat@mac-129007:/tmp 16:10:21$ python double_optparse.py -t blah -u mu2e
blah
mu2e
sbhat@mac-129007:/tmp 16:13:00$ python double_optparse.py -t blah -u mu2e -u prod
Usage: double_optparse.py [options]

double_optparse.py: error: -u appears more than once.  Please try again with only one -u option given

Note desired behavior.

#5 Updated by Shreyas Bhat about 1 year ago

  • Status changed from New to Work in progress

#6 Updated by Shreyas Bhat about 1 year ago

  • Status changed from Work in progress to Feedback

I believe this is done. As laid out in the gist above, I subclassed optparse.Option to create a special action, "unique_store", that first checks if an argument passed on the command line is duplicated. If so, it raises an error. If not, it stores the argument.

The changes are saved in a branch named for this ticket (21031), and pushed to redmine.

Testing:

# Control test
[sbhat@fermicloud362 ~]$  ./jobsub/client/jobsub_submit.py -G nova  --debug --jobsub-server=fer
micloud074.fnal.gov  file:///home/sbhat/list_args.sh
SERVER_ARGS:  ['file:///home/sbhat/list_args.sh']
CLIENT_ARGS:  {'dag': False, 'help': False, 'acctRole': None, 'json_config': None, 'jobid_outpu
t_only': False, 'debug': True, 'dropboxServer': None, 'acctGroup': 'nova', 'jobsubServer': 'fer
micloud074.fnal.gov', 'tarball_reject_list': None}
...
JobsubJobId of first job: 240.0@fermicloud074.fnal.gov

Use job id 240.0@fermicloud074.fnal.gov to retrieve output
...
JOBSUB SERVER RESPONSE CODE : 200 (Success)

# Control test 2
[sbhat@fermicloud362 ~]$  ./jobsub/client/jobsub_submit.py -G nova  --debug --jobsub-server=fer
micloud074.fnal.gov  file:///home/sbhat/list_args.sh 1 2 3
SERVER_ARGS:  ['file:///home/sbhat/list_args.sh', '1', '2', '3']
CLIENT_ARGS:  {'dag': False, 'help': False, 'acctRole': None, 'json_config': None, 'jobid_outpu
t_only': False, 'debug': True, 'dropboxServer': None, 'acctGroup': 'nova', 'jobsubServer': 'fer
micloud074.fnal.gov', 'tarball_reject_list': None}
...
JobsubJobId of first job: 241.0@fermicloud074.fnal.gov
...
JOBSUB SERVER RESPONSE CODE : 200 (Success)

# Double --group argument to client (before file://) (should fail)
[sbhat@fermicloud362 ~]$  ./jobsub/client/jobsub_submit.py -G nova  --debug --jobsub-server=fer
micloud074.fnal.gov  --group highpro file:///home/sbhat/list_args.sh 1 2 3
Usage: jobsub_submit.py [Client Options] [Server Options] file://user_script [user_script_args]

Provide --group and --jobsub-server to see full help

jobsub_submit.py: error: --group appears more than once.  Please try submitting again with only
 one --group argument given

# --group argument to client and server (should be OK)
[sbhat@fermicloud362 ~]$  ./jobsub/client/jobsub_submit.py -G nova  --debug --jobsub-server=fer
micloud074.fnal.gov   file:///home/sbhat/list_args.sh 1 2 3 -G highpro
SERVER_ARGS:  ['file:///home/sbhat/list_args.sh', '1', '2', '3', '-G', 'highpro']
CLIENT_ARGS:  {'dag': False, 'help': False, 'acctRole': None, 'json_config': None, 'jobid_outpu
t_only': False, 'debug': True, 'dropboxServer': None, 'acctGroup': 'nova', 'jobsubServer': 'fer
micloud074.fnal.gov', 'tarball_reject_list': None}
...
JobsubJobId of first job: 242.0@fermicloud074.fnal.gov
...
JOBSUB SERVER RESPONSE CODE : 200 (Success)

# Double --group argument to client AND provide it as a server arg (should fail)
[sbhat@fermicloud362 ~]$  ./jobsub/client/jobsub_submit.py -G nova  --debug --jobsub-server=fer
micloud074.fnal.gov  --group highpro file:///home/sbhat/list_args.sh 1 2 3 -G highpro
Usage: jobsub_submit.py [Client Options] [Server Options] file://user_script [user_script_args]

Provide --group and --jobsub-server to see full help

jobsub_submit.py: error: --group appears more than once.  Please try submitting again with only
 one --group argument given

Passing to Dennis for code review.

#7 Updated by Shreyas Bhat about 1 year ago

  • Assignee changed from Shreyas Bhat to Dennis Box
  • % Done changed from 0 to 90

#8 Updated by Dennis Box about 1 year ago

  • Assignee changed from Dennis Box to Shreyas Bhat

feedback sent, will mark resolved after changes made.

#9 Updated by Shreyas Bhat about 1 year ago

Changes made. Now need to test.

#10 Updated by Shreyas Bhat about 1 year ago

Testing done except for jobsub_submit_dag. For some reason, changing the group option to use unique_store breaks it (it ignores the argument being given to it). I have no idea why.

#11 Updated by Shreyas Bhat about 1 year ago

  • Assignee changed from Shreyas Bhat to Dennis Box

Alright - fixed the jobsub_sumbit_dag issue. Sending this back to Dennis for final feedback.

#12 Updated by Shreyas Bhat about 1 year ago

  • Assignee changed from Dennis Box to Shreyas Bhat

#13 Updated by Shreyas Bhat about 1 year ago

  • Due date set to 11/29/2018
  • Start date changed from 10/04/2018 to 11/29/2018

due to changes in a related task: #21460

#14 Updated by Shreyas Bhat about 1 year ago

  • Status changed from Feedback to Accepted

Merged into 1.2.9 rc0 release branch. Marking this accepted until release.

#15 Updated by Dennis Box about 1 year ago

  • Status changed from Accepted to Resolved

#16 Updated by Shreyas Bhat about 1 year ago

  • Status changed from Resolved to Work in progress

User testing showed that if $JOBSUB_GROUP was already defined, unique_store would pick up the default value and throw an error when trying to use the FIRST -G argument.

Have to work on a solution.

#17 Updated by Shreyas Bhat about 1 year ago

Dennis and I discussed this, and we think this might be a good one to combine with the move from optparse to argparse (to make jobsub Python 3 compliant). That would push this out to jobsub 1.3

We want the defaults to be respected, but if we're overriding them, make sure that those overridings agree.

The logic we want is something like this:

Have default dictionary

Option has "changed_from_default" bit, default False

Set all defaults.
For unique_store
If changed_from_default=False:
    If value is being set and it's not set, do it, set changed_from_default=True
    If it is in the dict, and value != previous value, change value, set changed_from_default=True
         else, change value, keep changed_from_default=False
If changed_from_default=True:
    If value is different from previous value --> ERROR
    else: OK

I've sketched this out in a gist at https://gist.github.com/shreyb/350addec3f7d383729e6430a65400411

#18 Updated by Shreyas Bhat about 1 year ago

https://gist.github.com/shreyb/350addec3f7d383729e6430a65400411

Had to redo logic because I wasn't handling cases where arguments could be given in different order. It should be right now.

I'm going to do a very quick rewrite of the test section to make it cleaner (pass list of lists through same testing function). Then Dennis and I can talk about working this into the next release.

#19 Updated by Shreyas Bhat about 1 year ago

Alright - this is now sketched out, with the correct logic (I think). I set up everything as a pytest module so we can run tests on it (pytest -s prints out all the print statements):

https://gist.github.com/shreyb/350addec3f7d383729e6430a65400411

When we're ready to put it in the code, the two classes (JobsubClientNamespace and UniqueStore) will have to be moved to jobsubClient.py, along with the jobsub_defaults and num_change_limit constants.

#20 Updated by Shreyas Bhat 11 months ago

  • Due date set to 01/29/2019

due to changes in a related task: #21779

#21 Updated by Dennis Box 10 months ago

  • Target version changed from v1.2.9 to v1.3

#22 Updated by Shreyas Bhat 9 months ago



Also available in: Atom PDF