Priority accounting groups
With jobsub_tools, it was possible to specify accounting groups, e.g. nova_high_prio and nova_low_prio. This feature was essential for controlling workflow. It would be helpful to have a similar feature built into jobsub_client.
#5 Updated by Dennis Box over 5 years ago
Adding Joe and Gerard as watchers on this ticket.
The consensus is that the way this was implemented on gpsn01 (creating high priority accounting groups, having jobsub assign jobs to those groups) is not the best way. Joe, Gerard, are we still planning on using hierarchical quotas to do this? If so, is it far enough along that you know what jobsub needs to do to work with it?
#6 Updated by Gerard Bernabeu Altayo over 5 years ago
Yes, we are planning on using AccountingGroups in the GPGrid condor (instead of the GWMS condor pool) to control/separate/group different job priorities.
I am still trying to collect all users' requirements but I expect the actual AccountingGroups on GPGRID to look like the following example:
The real experts for this are Tony and Krista (both added as watchers).
On the job that actually runs on GPGRID (as of today this would be a pilot job that comes from the jobsub/fifebatch ecosystem), I expect $VO to be introduced by the CE (HTCondorCE) by extracting it from the job's certificate FQAN (the VO from the 1st VOMS proxy attribute). The "Analysis" and "Production" strings could probably be introduced by the CE the same way. This means that jobsub needs to somehow provide enough information to fill in the non-variable parts below:
batch.$VO.$role #This represents the lower priority queue
highprio.$VO.$role #This should be used only by a restricted number of experiments and with very reduced slots (less than 1%)
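As a rough illustration of the FQAN handling described above, here is a minimal Python sketch. It assumes the usual VOMS layout of an FQAN (slash-separated components, with Role= and Capability= at the end) and takes the first component as the VO. The function name parse_fqan and the sample FQAN string are hypothetical; the real extraction would happen in the CE, not in this code.

```python
def parse_fqan(fqan):
    """Return (vo, role) from a VOMS FQAN string.

    role is None when the FQAN carries no Role or Role=NULL.
    """
    # Drop the empty string produced by the leading "/".
    components = [c for c in fqan.split("/") if c]
    if not components or "=" in components[0]:
        raise ValueError("FQAN has no VO component: {!r}".format(fqan))

    vo = components[0]  # first component is taken as the VO
    role = None
    for component in components[1:]:
        key, sep, value = component.partition("=")
        if sep and key == "Role" and value != "NULL":
            role = value
    return vo, role


# Illustrative FQAN, not taken from a real proxy.
print(parse_fqan("/nova/Role=Production/Capability=NULL"))
```

The returned pair would then stand in for $VO and $role in the group names above.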
At the end of the day what we are implementing here are 4 priority categories:
- low (batch.$VO.$role)
- regular (batch.$VO.$role.regular)
- high (batch.$VO.$role.high)
- realtime (highprio.$VO.$role)
Note that this only makes sense when submitting to GPGRID.
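The four categories above can be written down as a small lookup table. This is only a sketch in Python: the names PRIORITY_TEMPLATES and accounting_group_for are hypothetical and do not exist in jobsub or HTCondor, and the group names simply follow the patterns listed above.

```python
# Hypothetical mapping from priority category to accounting group pattern,
# copied from the four categories listed in this comment.
PRIORITY_TEMPLATES = {
    "low": "batch.{vo}.{role}",
    "regular": "batch.{vo}.{role}.regular",
    "high": "batch.{vo}.{role}.high",
    "realtime": "highprio.{vo}.{role}",
}


def accounting_group_for(priority, vo, role):
    """Fill in $VO and $role for the given priority category."""
    if priority not in PRIORITY_TEMPLATES:
        raise ValueError("unknown priority category: {!r}".format(priority))
    return PRIORITY_TEMPLATES[priority].format(vo=vo, role=role)


print(accounting_group_for("high", "nova", "Production"))
# prints: batch.nova.Production.high
```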
#7 Updated by Dominick Rocco over 5 years ago
The selection of priority categories described by Gerard certainly seems sufficient to handle any use case I can imagine. What are the odds this can come together in the next couple of weeks? NOvA production will have a lot of balls in the air in January and the ability to set priorities would be very helpful. As more users are moving to jobsub_client, production jobs are not receiving the necessary priority.
#8 Updated by Craig Group over 5 years ago
The NOvA production group has begun to migrate to the new jobsub_client. The lack of tools for accounting groups and setting priorities is becoming a major limiting factor for us. I'd like to:
-- understand the timescale for adding this functionality.
-- make sure that this is a high-priority requirement (as opposed to a feature request).
#13 Updated by Craig Group over 5 years ago
We are in a position now on NOvA where we need control over job priority within our production group account (novapro). This is affecting the efficiency of our production effort. I would like to request a report on this in an upcoming NOvA production meeting. Can someone come on Monday and explain the status and plan for this important feature request?
#14 Updated by Parag Mhashilkar over 5 years ago
Unless the quotas are implemented at the cluster level, there is very little we can do in jobsub. Depending on how it is implemented, there is a very good possibility that no changes are required in jobsub itself. I wish I had a better answer for NOvA, but most of the work is outside the scope of jobsub, and jobsub is on the receiving side of these changes. As soon as we have a better picture, this will be the highest priority in jobsub if any changes are required.
#15 Updated by Parag Mhashilkar over 5 years ago
After today's meeting, this is what jobsub needs to support:
- Right now, --group=nova results in the following in the JDF:
+AccountingGroup = "group_nova.novapro"
- New requirement in addition to the above: if --subgroup=<subgroup> is also specified, it should result in the following in the JDF instead:
+AccountingGroup = "group_nova.<subgroup>.novapro"
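The rule above could be sketched in Python roughly as follows. This is not the jobsub implementation: the function name accounting_group_line and its parameters are hypothetical, and the subgroup value "high_prio" in the usage lines is made up for illustration.

```python
def accounting_group_line(group, user, subgroup=None):
    """Build the +AccountingGroup line for the JDF.

    Without a subgroup:  group_<group>.<user>
    With a subgroup:     group_<group>.<subgroup>.<user>
    """
    parts = ["group_" + group]
    if subgroup:
        parts.append(subgroup)
    parts.append(user)
    return '+AccountingGroup = "{}"'.format(".".join(parts))


# --group=nova only
print(accounting_group_line("nova", "novapro"))
# prints: +AccountingGroup = "group_nova.novapro"

# --group=nova --subgroup=high_prio (subgroup name is illustrative)
print(accounting_group_line("nova", "novapro", subgroup="high_prio"))
# prints: +AccountingGroup = "group_nova.high_prio.novapro"
```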