- Table of contents
- jobsub_q
- Explanation of Job Statuses
- Example: jobsub_q (no options)
- Example: jobsub_q --summary
- Example: narrow the query down with --user:
- Example: Narrow the query down with --group:
- Example: Querying the state of a previously submitted job:
- Example: Determining who submitted using a group account (and from where)
- Example: using --better-analyze to get more information on why a job hasn't started
jobsub_q¶
jobsub_q is the JobSub analog of condor_q: it queries the job queue on the jobsub server.
    $ jobsub_q --help
    Usage: jobsub_q [Client Options]

    Options:
      --version             show program's version number and exit

      Client Options:
        -G <Group/Experiment/Subgroup>, --group=<Group/Experiment/Subgroup>
                            Group/Experiment/Subgroup for priorities and accounting
        --jobsub-server=<JobSub Server>
                            Alternate location of JobSub server to use
        --role=<VOMS Role>  VOMS Role for priorities and accounting
        --user=<user ID>    User Id to query
        --jobid=<Job ID>    Job Ids to query. Job Ids have format of
                            (cluster).(process)@(schedd_name). If Job id has a decimal
                            point but no process (example) 1234.@fifebatch.fnal.gov
                            then ALL job ids with that cluster and schedd name will
                            be returned
        --long              show long listing (like condor_q -l)
        --dag               show dags (like condor_q -dag)
        --hold              show held jobs (like condor_q -hold)
        --run               show running jobs (like condor_q -run)
        --idle              show idle jobs
        --constraint=<constraint>
                            like condor_q -constraint <constraint>
        --summary           provide a summary (like ifront_q)
        --debug             Print debug messages to including server contacted, http
                            response, response time
        --better-analyze    do condor_q -better-analyze on job (must use with --jobid)
        -h, --help          Show this help message and exit

    Please direct questions, comments, or problems to the service desk.
    For help on --jobid or --constraint see
    https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki/Frequently_Asked_Questions
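The job ID format described under --jobid, (cluster).(process)@(schedd_name), can be split mechanically. The following is a minimal Python sketch, not part of jobsub itself; the names `JobId` and `parse_jobid` are hypothetical and exist only for illustration. A missing process number is represented as `None`, matching the "ALL job ids with that cluster" case from the help text.

```python
import re
from typing import NamedTuple, Optional


class JobId(NamedTuple):
    cluster: int
    process: Optional[int]  # None means "every process in this cluster"
    schedd: str


# (cluster).(process)@(schedd_name); the process part may be empty.
_JOBID_RE = re.compile(r"^(\d+)\.(\d*)@(\S+)$")


def parse_jobid(text: str) -> JobId:
    """Split a jobsub job ID into cluster, process, and schedd name."""
    match = _JOBID_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a jobsub job id: {text!r}")
    cluster, process, schedd = match.groups()
    return JobId(int(cluster), int(process) if process else None, schedd)
```

For example, `parse_jobid("1234.@fifebatch.fnal.gov")` yields a `JobId` whose `process` is `None`, while `parse_jobid("269.0@fifebatch2.fnal.gov")` yields process `0`.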
Explanation of Job Statuses¶
The job statuses displayed by jobsub_q are pulled directly from the underlying batch system, HTCondor. The codes are:
Code | Status | Letter Code (shows up in jobsub_q) |
---|---|---|
0 | Unexpanded | U |
1 | Idle | I |
2 | Running | R |
3 | Removed | X |
4 | Completed | C |
5 | Held | H |
6 | Transferring_Output | > |
7 | Suspended | S |
These were pulled from the HTCondor Wiki.
The Unexpanded state means the job is being inserted into the system; it should generally be brief. Jobs start in the Idle state and move to Running, but may return to Idle if the job is interrupted. A job that reverts to Idle will eventually go back to Running, and no action is required from the user. Transferring_Output refers to transferring the condor-controlled files, not the user transfers done with ifdh. Suspended is a rare state that fifebatch will not recover from, so the user should consider it a terminal failure.
The only status that requires a user to take action is Held (H). In this case, please consult the Fifemon "Why are my jobs held?" dashboard to determine the hold reason, and take appropriate action (most of the time, jobsub_release to release the job or jobsub_rm to remove it).
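For scripts that post-process jobsub_q output, the status table and the guidance above can be captured in a small lookup. This is an illustrative Python sketch; `JOB_STATUS`, `STATUS_BY_LETTER`, and `advice` are hypothetical names, and the advice strings only paraphrase the text of this page.

```python
# Status codes as listed in the table: code -> (status name, letter code).
JOB_STATUS = {
    0: ("Unexpanded", "U"),
    1: ("Idle", "I"),
    2: ("Running", "R"),
    3: ("Removed", "X"),
    4: ("Completed", "C"),
    5: ("Held", "H"),
    6: ("Transferring_Output", ">"),
    7: ("Suspended", "S"),
}

# Reverse lookup from the letter shown in the ST column to the status name.
STATUS_BY_LETTER = {letter: name for name, letter in JOB_STATUS.values()}


def advice(letter: str) -> str:
    """Return a one-line hint for the ST letter printed by jobsub_q."""
    name = STATUS_BY_LETTER[letter]
    if letter == "H":
        return f"{name}: find the hold reason, then release or remove the job"
    if letter == "S":
        return f"{name}: treat as a terminal failure"
    return f"{name}: no action required"
```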
Example: jobsub_q (no options)¶
Running jobsub_q without the --group option lists every queued and running job on both servers. When many jobs are queued, the query can take a long time, so avoid running it this way.
Instead, look at http://fifemon.fnal.gov/monitor/pool/fifebatchgpvmhead1
    $ jobsub_q
    JOBSUBJOBID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
    390257.0@fifebatch2.fnal.gov jbrodsky 12/19 16:18 0+00:00:00 I 0 0.0 gridrun_g4ds.sh_20141219_161854_11359_0_1_wrap.sh
    390258.0@fifebatch2.fnal.gov jbrodsky 12/19 16:20 0+00:00:00 I 0 0.0 gridrun_g4ds.sh_20141219_162003_11599_0_1_wrap.sh
    2016928.0@fifebatch2.fnal.gov kherner 06/09 05:39 0+00:00:00 I 0 0.0 submission_test.sh_20150609_053935_18329_0_1_wrap.sh
    2016928.1@fifebatch2.fnal.gov kherner 06/09 05:39 0+00:00:00 I 0 0.0 submission_test.sh_20150609_053935_18329_0_1_wrap.sh
    2016928.2@fifebatch2.fnal.gov kherner 06/09 05:39 0+00:00:00 I 0 0.0 submission_test.sh_20150609_053935_18329_0_1_wrap.sh
    2016928.3@fifebatch2.fnal.gov kherner 06/09 05:39 0+00:00:00 I 0 0.0 submission_test.sh_20150609_053935_18329_0_1_wrap.sh
    2016928.4@fifebatch2.fnal.gov kherner 06/09 05:39 0+00:00:00 I 0 0.0 submission_test.sh_20150609_053935_18329_0_1_wrap.sh
    2077003.0@fifebatch2.fnal.gov kherner 06/14 20:26 0+00:00:00 I 0 0.0 submission_test.sh_20150614_202628_6331_0_1_wrap.sh
    2248016.0@fifebatch2.fnal.gov aluca 06/19 09:10 98+05:15:05 R 0 0.3 condor_dagman
    [ ~133000 rows deleted for brevity ]
    3035805.9993@fifebatch1.fnal.gov mu2epro 09/25 14:14 0+00:00:00 I 0 0.0 mu2eprodsys.sh_20150925_141422_15265_0_1_wrap.sh
    3035805.9994@fifebatch1.fnal.gov mu2epro 09/25 14:14 0+00:00:00 I 0 0.0 mu2eprodsys.sh_20150925_141422_15265_0_1_wrap.sh
    3035805.9995@fifebatch1.fnal.gov mu2epro 09/25 14:14 0+00:00:00 I 0 0.0 mu2eprodsys.sh_20150925_141422_15265_0_1_wrap.sh
    3035805.9996@fifebatch1.fnal.gov mu2epro 09/25 14:14 0+00:00:00 I 0 0.0 mu2eprodsys.sh_20150925_141422_15265_0_1_wrap.sh
    3035805.9997@fifebatch1.fnal.gov mu2epro 09/25 14:14 0+00:00:00 I 0 0.0 mu2eprodsys.sh_20150925_141422_15265_0_1_wrap.sh
    3035805.9998@fifebatch1.fnal.gov mu2epro 09/25 14:14 0+00:00:00 I 0 0.0 mu2eprodsys.sh_20150925_141422_15265_0_1_wrap.sh
    3035805.9999@fifebatch1.fnal.gov mu2epro 09/25 14:14 0+00:00:00 I 0 0.0 mu2eprodsys.sh_20150925_141422_15265_0_1_wrap.sh
    3035806.0@fifebatch1.fnal.gov skemboi 09/25 14:14 0+00:00:00 I 0 0.0 meh_001.sh_20150925_141425_15728_0_1_wrap.sh

    133043 jobs; 0 completed, 1 removed, 101336 idle, 28641 running, 3062 held, 0 suspended
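The last line of the listing is a totals line ("N jobs; N completed, N removed, ..."). When only the counts matter, that line can be parsed on its own. The sketch below is an illustration in Python under the assumption that the totals line keeps the shape shown in the examples on this page; `parse_totals` is a hypothetical helper, not a jobsub function.

```python
import re


def parse_totals(line: str) -> dict:
    """Parse 'N jobs; N completed, N removed, ...' into a dict of counts."""
    head = re.match(r"\s*(\d+) jobs;", line)
    if head is None:
        raise ValueError(f"not a jobsub_q totals line: {line!r}")
    counts = {"jobs": int(head.group(1))}
    # Everything after 'N jobs;' is a comma-separated list of '<count> <state>'.
    for number, state in re.findall(r"(\d+) (\w+)", line[head.end():]):
        counts[state] = int(number)
    return counts
```

Applied to the totals line of a five-job listing, this returns a dict with `"jobs": 5` and `"idle": 5`, with the remaining states at `0`.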
Example: jobsub_q --summary¶
If you want to know the state of the servers, jobsub_q --summary is a better option.
It still produces a lot of output, but it returns much faster and is better organized, by user and group.
Again, look at http://fifemon.fnal.gov/monitor/pool/fifebatchgpvmhead1 instead
$ jobsub_q --summary Name Machine RunningJobs IdleJobs HeldJobs group_annie.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_argoneut.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_cdf.adriutti@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_cdf.aluca@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 1 group_cdf.ctosciri@fifebatch1.fnal.gov fifebatch1.fnal.gov 100 0 0 group_cdf.dbox@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_cdf.dfitz11@fifebatch1.fnal.gov fifebatch1.fnal.gov 298 0 15 group_cdf.kotwal@fifebatch1.fnal.gov fifebatch1.fnal.gov 163 0 26 group_cdf.ptl@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 1 group_cdf.seiya@fifebatch1.fnal.gov fifebatch1.fnal.gov 50 0 0 group_cdf.sganguly@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 146 group_darkside.aldenf@fifebatch1.fnal.gov fifebatch1.fnal.gov 232 1824 0 group_darkside.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_darkside.shawest@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 3 group_des.desgw@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_des.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_dzero.snow@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 1 group_fermilab.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_gm2.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_gm2.rfatemi@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_lar1nd.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_lariat.acciarri@fifebatch1.fnal.gov fifebatch1.fnal.gov 2 0 0 group_lariat.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_lbne.achatter@fifebatch1.fnal.gov fifebatch1.fnal.gov 18 0 4 group_lbne.dani83@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_lbne.dbrailsf@fifebatch1.fnal.gov fifebatch1.fnal.gov 1 0 0 group_lbne.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_lbne.skemboi@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 6 0 group_lbne.tianxc@fifebatch1.fnal.gov fifebatch1.fnal.gov 5 0 0 
group_lbne.tjyang@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_lbne.trj@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_minerva.Production.minervapro@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_minerva.goran@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 1 group_minerva.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_minerva.laliaga@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_minerva.minervapro@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_minerva.norrick@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_minos.ashley90@fifebatch1.fnal.gov fifebatch1.fnal.gov 197 3 0 group_minos.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_minos.rchen2@fifebatch1.fnal.gov fifebatch1.fnal.gov 34 0 0 group_mu2e.ehrlich@fifebatch1.fnal.gov fifebatch1.fnal.gov 606 495 0 group_mu2e.gandr@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_mu2e.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_mu2e.mu2epro@fifebatch1.fnal.gov fifebatch1.fnal.gov 12822 15354 928 group_mu2e.oksuzian@fifebatch1.fnal.gov fifebatch1.fnal.gov 10 784 0 group_mu2e.rhbob@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 299 group_mu2e.rlc@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 5 0 group_nova.arrieta1@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_nova.biaow@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1112 0 group_nova.brebel@fifebatch1.fnal.gov fifebatch1.fnal.gov 27 0 0 group_nova.dbox@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_nova.jvasel@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_nova.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_nova.novapro@fifebatch1.fnal.gov fifebatch1.fnal.gov 2851 31050 0 group_nova.prabhjot@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_nova.xbbu@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_numix.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_seaquest.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 0 0 group_uboone.jyoti@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 
0 0 group_uboone.kherner@fifebatch1.fnal.gov fifebatch1.fnal.gov 0 1 0 group_annie.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_argoneut.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_cdf.adriutti@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_cdf.aluca@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 3 group_cdf.cjclarke@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_cdf.ctosciri@fifebatch2.fnal.gov fifebatch2.fnal.gov 95 0 0 group_cdf.dbox@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_cdf.dfitz11@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 1 group_cdf.kotwal@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 12 group_cdf.marchese@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_cdf.sganguly@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 6 group_cdf.vaikunth@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_cdf.vellidis@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 0 group_cdf.whiteran@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 1 group_cdf.willis@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 0 group_coupp.orin@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 9 group_darkside.aldenf@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_darkside.hqian36@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_darkside.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 5 0 group_darkside.lmarini@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 1 group_darkside.masayuki@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 4 group_darkside.pagnes28@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 0 group_des.desgw@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 1 group_des.dtucker@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 1 group_des.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_des.yanny@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_dune.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 0 group_dune.php13tkw@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_dune.tjyang@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 0 
group_dune.trj@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 0 group_fermilab.arossi@fifebatch2.fnal.gov fifebatch2.fnal.gov 209 0 0 group_fermilab.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 3 0 group_fermilab.mengel@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_gm2.fienberg@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_gm2.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_gm2.liangli@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_gm2.lwelty@fifebatch2.fnal.gov fifebatch2.fnal.gov 9 0 0 group_gm2.nfroemm@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_gm2.schlesr2@fifebatch2.fnal.gov fifebatch2.fnal.gov 6 1 0 group_gm2.sweigart@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_gm2.twalton@fifebatch2.fnal.gov fifebatch2.fnal.gov 16 0 0 group_lar1.dbox@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lar1nd.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_lariat.acciarri@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lariat.gpulliam@fifebatch2.fnal.gov fifebatch2.fnal.gov 3 0 0 group_lariat.johnnyho@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lariat.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_lariat.pkryczyn@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lariat.rbouabid@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lariat.soubasis@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lariat.tjyang@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lariat.wforeman@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lbne.achatter@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 4 group_lbne.dani83@fifebatch2.fnal.gov fifebatch2.fnal.gov 2498 9082 1 group_lbne.dbrailsf@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lbne.gvsinev@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lbne.jpdavies@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 2 group_lbne.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lbne.knguyen@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 
group_lbne.lebrun@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lbne.ljf26@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_lbne.php13tkw@fifebatch2.fnal.gov fifebatch2.fnal.gov 8 0 0 group_lbne.seturner@fifebatch2.fnal.gov fifebatch2.fnal.gov 141 0 2 group_lbne.skemboi@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 50 group_lbne.tjyang@fifebatch2.fnal.gov fifebatch2.fnal.gov 2 0 0 group_marsgm2.strigano@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_marslbne.reitzner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_marslbne.strigano@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_marsmu2e.vspron@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.Keepup.minervadat@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.Production.minervapro@fifebatch2.fnal.gov fifebatch2.fnal.gov 1095 44 1237 group_minerva.betan009@fifebatch2.fnal.gov fifebatch2.fnal.gov 297 40 0 group_minerva.drimal@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.drut1186@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.evalen@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.goran@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 4 group_minerva.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_minerva.kwiley@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.marshalc@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.mateusc@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 13 group_minerva.minervapro@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.norrick@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.nsteinbe@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.oaltinok@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.renlu23@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minerva.rgalindo@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.ashley90@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.aurisano@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 
group_minos.hyepesra@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_minos.kreymer@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.minospro@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.mkiveni@fifebatch2.fnal.gov fifebatch2.fnal.gov 2 0 0 group_minos.nav@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.psail@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.simon217@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_minos.wingmc@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_mu2e.boyd@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 5 0 group_mu2e.brownd@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_mu2e.coleman@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_mu2e.ehrlich@fifebatch2.fnal.gov fifebatch2.fnal.gov 245 3261 0 group_mu2e.gandr@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_mu2e.gianipez@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_mu2e.hedin@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 3 group_mu2e.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 17 0 group_mu2e.mu2epro@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_mu2e.murat@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_mu2e.oksuzian@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 1 group_mu2e.palladin@fifebatch2.fnal.gov fifebatch2.fnal.gov 3 8098 2 group_mu2e.rhbob@fifebatch2.fnal.gov fifebatch2.fnal.gov 1262 27234 234 group_mu2e.rlc@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 4 0 group_mu2e.roehrken@fifebatch2.fnal.gov fifebatch2.fnal.gov 140 75 0 group_mu2e.zchen@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.arrieta1@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.aurisano@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.biaow@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1112 0 group_nova.bzamoran@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.crisprin@fifebatch2.fnal.gov fifebatch2.fnal.gov 1 0 0 
group_nova.dbox@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.dmendez@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.jvasel@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.kalra@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 102 0 group_nova.kmatera@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.kotelnik@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.kuldeepm@fifebatch2.fnal.gov fifebatch2.fnal.gov 147 0 0 group_nova.mylab@fifebatch2.fnal.gov fifebatch2.fnal.gov 2 0 0 group_nova.novapro@fifebatch2.fnal.gov fifebatch2.fnal.gov 90 69 0 group_nova.pavan219@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 1 group_nova.philipm1@fifebatch2.fnal.gov fifebatch2.fnal.gov 1505 1255 3 group_nova.sedayath@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.tamsett@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 22 group_nova.test.dbox@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_nova.tjyoti@fifebatch2.fnal.gov fifebatch2.fnal.gov 878 0 0 group_nova.xbbu@fifebatch2.fnal.gov fifebatch2.fnal.gov 124 0 0 group_numix.cpgrant@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_numix.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 1 0 group_seaquest.bkerns@fifebatch2.fnal.gov fifebatch2.fnal.gov 1 0 0 group_seaquest.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_seaquest.liuk@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_seaquest.production.dannowi1@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_testjobs@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_uboone.aschu@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_uboone.bcarls@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 2 group_uboone.bkirby@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_uboone.cjen@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_uboone.greenlee@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_uboone.jyoti@fifebatch2.fnal.gov fifebatch2.fnal.gov 6 4 0 
group_uboone.kalousis@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_uboone.kherner@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 2 0 group_uboone.kterao@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_uboone.ran@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 group_uboone.uboonepro@fifebatch2.fnal.gov fifebatch2.fnal.gov 0 0 0 OWNER RUN IDLE HELD OLDEST_JOB acciarri 2 0 0 9/25 14:00 0+00:25:08 slicer-Acciarri_Li achatter 18 0 8 8/21 16:37 7+05:40:47 Nominal_test4_010. aldenf 232 1824 0 9/25 02:17 0+05:44:45 run_darkart_011144 aluca 3 0 4 6/19 09:10 98+01:24:46 - arossi 209 0 0 9/22 12:20 2+00:54:48 dcache-write-only. ashley90 197 3 0 9/25 13:06 0+01:10:04 submit_FeldmanIFDH bcarls 1 0 2 9/14 13:35 11+00:51:39 0 betan009 294 40 0 9/24 14:15 0+15:38:26 GaudiAnaStage_0001 biaow 0 2224 0 7/21 16:57 0+00:00:00 biaow-fn1_721-2015 bkerns 1 0 0 9/21 11:32 3+07:27:20 mc_drellyan_C_M011 boyd 0 5 0 9/16 10:04 0+00:00:00 sleep_job.sh_20150 brebel 27 0 0 9/23 13:33 2+00:32:11 brebel-AnalysisSki crisprin 1 0 0 9/25 10:03 0+04:21:31 crisprin-crisprin_ ctosciri 196 0 0 9/22 19:14 2+19:11:08 0 dani83 2499 9080 1 9/24 14:23 0+17:33:25 Run_FastMC_job.sh_ dbrailsf 1 0 0 9/16 06:07 9+07:45:36 process_request.sh desgw 1 0 1 9/24 09:29 1+04:55:24 0 dfitz11 301 0 16 9/17 22:29 7+15:57:10 0 dtucker 0 0 1 9/23 15:32 0+00:00:21 job_Nite20150722_E ehrlich 849 3755 0 9/25 11:25 0+02:23:44 copyback.sh_201509 goran 0 0 5 8/11 16:48 0+01:59:18 GaudiAnaStage_0000 gpulliam 3 0 0 9/25 11:42 0+02:40:39 reco-Test-v01_09_0 hedin 0 0 3 8/25 11:48 0+17:34:01 copyback.sh_201508 jbrodsky 0 2 0 12/19 16:18 0+00:00:00 gridrun_g4ds.sh_20 jobs; 0 0 0 0 completed, 1 suspended jpdavies 0 0 2 9/4 09:37 5+12:59:35 FullSimulation.sh_ jyoti 884 4 0 9/24 12:16 1+02:08:59 run_flugg_grid.sh_ kherner 0 153 0 6/9 05:39 0+00:00:00 submission_test.sh kotwal 180 0 38 7/6 14:48 80+20:32:31 - kuldeepm 147 0 0 9/23 17:02 1+20:42:03 g4numi_job.sh_2015 lmarini 0 0 1 8/11 13:43 0+00:46:33 run_darkart_010981 lwelty 9 0 0 9/21 
22:11 3+16:13:23 submit-localreleas masayuki 0 0 4 8/29 15:48 0+06:00:03 run_od_11102_1.sh_ mateusc 0 0 13 7/20 15:54 0+01:00:05 GaudiAnaStage_0001 minervapro 3471 50 1234 9/23 18:27 1+19:59:23 0 mkiveni 2 0 0 9/24 16:31 0+21:43:00 catchup-s2s.sh_201 mu2epro 12875 15226 928 7/27 08:11 57+16:28:24 mu2eprodsys.sh_201 mylab 2 0 0 9/25 11:46 0+02:39:20 reduce_job_numu.sh novapro 2941 31119 0 6/29 11:47 0+00:00:00 novapro-offsite-pc oksuzian 10 784 1 9/24 19:55 0+06:59:31 copyback.sh_201509 orin 0 0 9 7/12 16:35 0+00:10:42 process_single_run pagnes28 0 1 0 7/8 12:53 0+00:00:00 test.sh_20150708_1 palladin 3 8098 2 9/22 16:10 0+15:20:22 copyback.sh_201509 pavan219 0 0 1 9/11 06:36 0+00:00:44 pavan219-pavan_Cos philipm1 1511 1249 3 6/24 12:00 0+11:56:28 philipm1-cosmicfor php13tkw 8 0 0 9/24 15:00 0+23:26:37 gen-MUSUN_gen-v04_ ptl 1 0 1 9/3 05:20 22+06:35:19 - rchen2 34 0 0 9/25 10:52 0+03:31:23 submit_systematics rfatemi 0 1 0 6/8 14:40 0+00:00:00 submit-localreleas rhbob 1259 27234 533 8/21 11:16 1+14:19:50 copyback.sh_201508 rlc 0 9 0 8/27 06:01 0+00:00:00 osgMonCheck.sh_201 roehrken 140 75 0 9/14 11:37 10+09:48:27 copyback.sh_201509 schlesr2 5 1 0 9/21 15:27 3+04:37:55 submit-localreleas seiya 51 0 0 9/17 10:18 8+03:57:57 0 seturner 143 0 2 9/19 14:40 5+23:46:50 0 sganguly 11 0 152 9/7 14:02 18+00:24:26 0 shawest 0 0 3 8/11 01:26 0+00:09:39 run_od_8303_47.sh_ skemboi 0 7 50 9/24 15:50 0+00:00:21 kimoi_001.sh_20150 snow 0 0 1 7/6 11:34 0+00:37:06 test02.sh_20150706 tamsett 0 0 22 7/21 05:37 0+18:41:27 caf_ana_job.sh_201 tianxc 5 0 0 9/25 12:16 0+02:08:22 Run_GEVGEN_job.sh_ tjyang 2 1 0 9/24 11:04 1+03:00:01 reco-prodgenie_nue tjyoti 878 0 0 9/24 12:16 1+02:08:59 run_flugg_grid.sh_ trj 0 1 0 9/25 12:31 0+00:00:00 bprobe2.sh_2015092 twalton 16 0 0 9/15 16:15 9+21:33:26 submit-release.sh_ vellidis 1 1 0 9/24 11:04 1+03:22:58 0 whiteran 1 0 1 7/28 16:42 58+20:21:22 - willis 1 1 0 9/25 13:51 0+00:31:43 0 xbbu 124 0 0 9/23 17:07 1+20:30:58 g4numi_job.sh_2015 TOTALS 28672 
100948 3042 glidein count: 24044 currently servicing jobs, 576 unclaimed
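The first part of the --summary output has one row per accounting name and schedd (Name, Machine, RunningJobs, IdleJobs, HeldJobs). To see totals per experiment, those rows can be summed by group. The following Python sketch assumes the row layout shown above, with names of the form `group_<group>[.<subgroup>].<user>@<schedd>`; `totals_by_group` is a hypothetical helper for illustration only.

```python
from collections import defaultdict


def totals_by_group(lines):
    """Sum (running, idle, held) per group from jobsub_q --summary rows."""
    totals = defaultdict(lambda: [0, 0, 0])
    for line in lines:
        fields = line.split()
        # Keep only 'Name Machine Running Idle Held' rows; skip everything else.
        if len(fields) != 5 or not fields[0].startswith("group_"):
            continue
        try:
            counts = [int(value) for value in fields[2:]]
        except ValueError:
            continue
        accounting_name = fields[0].partition("@")[0]
        # 'group_minerva.Production.minervapro' -> 'minerva'
        group = accounting_name[len("group_"):].split(".", 1)[0]
        for index, count in enumerate(counts):
            totals[group][index] += count
    return {group: tuple(values) for group, values in totals.items()}
```

Rows from both schedds are added together, so a group that appears on fifebatch1 and fifebatch2 gets a single combined entry.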
Example: narrow the query down with --user:¶
    $ jobsub_q --user boyd
    JOBSUBJOBID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
    4157596.0@fifebatch2.fnal.gov boyd 09/16 10:04 0+00:00:00 I 0 0.0 sleep_job.sh_20150916_100421_18859_0_1_wrap.sh
    4157596.1@fifebatch2.fnal.gov boyd 09/16 10:04 0+00:00:00 I 0 0.0 sleep_job.sh_20150916_100421_18859_0_1_wrap.sh
    4157596.2@fifebatch2.fnal.gov boyd 09/16 10:04 0+00:00:00 I 0 0.0 sleep_job.sh_20150916_100421_18859_0_1_wrap.sh
    4157596.3@fifebatch2.fnal.gov boyd 09/16 10:04 0+00:00:00 I 0 0.0 sleep_job.sh_20150916_100421_18859_0_1_wrap.sh
    4157596.4@fifebatch2.fnal.gov boyd 09/16 10:04 0+00:00:00 I 0 0.0 sleep_job.sh_20150916_100421_18859_0_1_wrap.sh

    5 jobs; 0 completed, 0 removed, 5 idle, 0 running, 0 held, 0 suspended
Example: Narrow the query down with --group:¶
    $ jobsub_q --group seaquest
    JOBSUBJOBID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
    4255478.0@fifebatch2.fnal.gov bkerns 09/21 11:32 3+07:22:31 R 0 293.0 mc_drellyan_C_M011_S012_0203.sh_20150921_113250_30170_0_1_wrap.sh

    1 jobs; 0 completed, 0 removed, 0 idle, 1 running, 0 held, 0 suspended
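When jobsub_q is called from a script, it helps to always pass a narrowing option such as --user or --group. The Python sketch below only assembles an argument list from options that appear in the help text; `build_jobsub_q_args` is a hypothetical helper, and whether a given combination of options is accepted by the server is an assumption that should be checked against a real installation.

```python
def build_jobsub_q_args(group=None, user=None, jobid=None, long_listing=False):
    """Build a jobsub_q argument list, refusing a completely unfiltered query."""
    if not (group or user or jobid):
        raise ValueError("pass group, user, or jobid to avoid querying every job")
    args = ["jobsub_q"]
    if group:
        args += ["--group", group]
    if user:
        args += ["--user", user]
    if jobid:
        args += ["--jobid", jobid]
    if long_listing:
        args.append("--long")
    return args
```

The resulting list can be handed to `subprocess.run(args, capture_output=True, text=True)` on a machine where the jobsub client is installed.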
Example: Querying the state of a previously submitted job:¶
On the jobsub_submit page, there is an example of the user submitting a job with jobid 269.0@fifebatch2.fnal.gov. Is it running?
    $ jobsub_q -G nova --jobid 269.0@fifebatch2.fnal.gov --debug
    JOBSUBJOBID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
    269.0@fifebatch2.fnal.gov dbox 06/09 14:47 0+00:00:00 I 0 0.0 nova.sh_20140609_144728_12037_0_1_wrap.sh
    JOBSUB SERVER CONTACTED : https://fifebatch.fnal.gov:8443
    JOBSUB SERVER RESPONDED : https://fifebatch2.fnal.gov:8443
    JOBSUB SERVER RESPONSE CODE : 200 (Success)
    JOBSUB SERVER SERVICED IN : 1.4852669239 sec
    JOBSUB CLIENT FQDN : fermicloud326.fnal.gov
    JOBSUB CLIENT SERVICED TIME : 21/Apr/2015 14:39:04
It's still idle: the ST column shows I.
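The extra lines printed by --debug are "KEY : value" pairs, which makes them easy to collect, for example to log which server answered and how long it took. This is a minimal Python sketch assuming the line layout shown in the example above; `parse_debug_lines` is a hypothetical name.

```python
import re


def parse_debug_lines(lines):
    """Collect the 'JOBSUB ... : value' lines printed by jobsub_q --debug."""
    info = {}
    for line in lines:
        # The table header starts with 'JOBSUBJOBID' (no space), so it is skipped.
        if not line.startswith("JOBSUB "):
            continue
        # Split on the first colon surrounded by whitespace; colons inside
        # URLs and timestamps have no whitespace around them.
        parts = re.split(r"\s+:\s+", line.strip(), maxsplit=1)
        if len(parts) == 2:
            info[parts[0]] = parts[1]
    return info
```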
Example: Determining who submitted using a group account (and from where)¶
Sometimes a job is submitted using a group account, for example the 'novapro' account for Nova Production jobs. Users with access to such accounts may be curious as to who submitted the jobs.
Let's submit a test job from this account for illustration purposes:
    [dbox@fermicloud326 client]$ jobsub_submit -G nova --role Production --jobsub-server https://fifebatch.fnal.gov:8443 file://simple_worker_script.sh 10
    /fife/local/scratch/uploads/nova/novapro/2015-04-21_143903.633613_8139
    /fife/local/scratch/uploads/nova/novapro/2015-04-21_143903.633613_8139/@simple_worker_script.sh_20150421_143904_7820_0_1.cmd
    submitting....
    Submitting job(s).
    1 job(s) submitted to cluster 1508538.
    JobsubJobId of first job: 1508538.0@fifebatch1.fnal.gov
    Use job id 1508538.0@fifebatch1.fnal.gov to retrieve output
Verify the job is in the queue under user 'novapro' using jobsub_q:
    [dbox@fermicloud326 client]$ jobsub_q -G nova --jobsub-server https://fifebatch.fnal.gov:8443 --jobid 1508538.0@fifebatch1.fnal.gov --debug
    JOBSUBJOBID USER SUBMITTED RUN_TIME ST PRI SIZE CMD
    1508538.0@fifebatch1.fnal.gov novapro 04/21 14:39 0+00:00:00 I 0 0.0 @simple_worker_script.sh_20150421_143904_7820_0_1_wrap.sh
    JOBSUB SERVER CONTACTED : https://fifebatch.fnal.gov:8443
    JOBSUB SERVER RESPONDED : https://fifebatch1.fnal.gov:8443
    JOBSUB SERVER RESPONSE CODE : 200 (Success)
    JOBSUB SERVER SERVICED IN : 2.27557110786 sec
    JOBSUB CLIENT FQDN : fermicloud326.fnal.gov
    JOBSUB CLIENT SERVICED TIME : 21/Apr/2015 14:40:37
    [dbox@fermicloud326 client]$
Use the --long option with grep to see the DN of the user who submitted the job and the IP address of the machine it was submitted from:
    $ jobsub_q -G nova --jobsub-server https://fifebatch.fnal.gov:8443 --jobid 1508538.0@fifebatch1.fnal.gov --long | grep -i '^JOBSUB'
    JobsubClientDN = "/DC=gov/DC=fnal/O=Fermilab/OU=People/CN=Dennis D. Box/CN=UID:dbox"
    JobsubServerVersion = "1.1.1.1"
    JobsubClientIpAddress = "131.225.154.204"
    $
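The --long listing prints one `Name = value` attribute per line, so the same filtering that grep does above can be done in a script that also needs the values. The Python sketch below is an illustration only; `jobsub_attributes` is a hypothetical helper, and it treats every value as a string, stripping the surrounding double quotes when present.

```python
def jobsub_attributes(lines, prefix="Jobsub"):
    """Return the 'Name = value' attributes whose name starts with prefix."""
    attributes = {}
    for line in lines:
        # Split on the first ' = ' only: a DN value itself contains '=' signs.
        name, separator, value = line.partition(" = ")
        name = name.strip()
        if not separator or not name.lower().startswith(prefix.lower()):
            continue
        value = value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        attributes[name] = value
    return attributes
```

The prefix comparison is case-insensitive, mirroring the `grep -i '^JOBSUB'` used in the example.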
Example: using --better-analyze to get more information on why a job hasn't started¶
    $ jobsub_q --user trj
    JOBSUBJOBID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD
    4352957.0@fifebatch2.fnal.gov trj 09/25 12:31 0+00:00:00 I 0 0.0 bprobe2.sh_20150925_123158_30683_0_1_wrap.sh

    1 jobs; 0 completed, 0 removed, 1 idle, 0 running, 0 held, 0 suspended
    $ jobsub_q --better-analyze --jobid 4352957.0@fifebatch2.fnal.gov

    -- Schedd: fifebatch2.fnal.gov : <131.225.67.139:9615?addrs=131.225.67.139-9615&noUDP&sock=13611_1434>
    User priority for trj@fifebatch2.fnal.gov is not available, attempting to analyze without it.
    ---
    4352957.000:  Run analysis summary.  Of 25104 machines,
      20445 are rejected by your job's requirements
       4659 reject your job because of their own requirements
          0 match and are already running your jobs
          0 match but are serving other users
          0 are available to run your job
        No successful match recorded.
        Last failed match: Fri Sep 25 14:49:01 2015

        Reason for last match failure: no match found

    The Requirements expression for your job is:

        ( ( ( Arch == "X86_64" ) || ( Arch == "INTEL" ) ) &&
        ( target.IS_Glidein == true ) &&
        ( DesiredOS is NULL || stringlistimember(Target.IFOS_installed,DesiredOS) ) &&
        ( stringListsIntersect(toUpper(target.HAS_usage_model),toUpper(my.DESIRED_usage_model)) ) ) &&
        ( TARGET.OpSys == "LINUX" ) && ( TARGET.Disk >= RequestDisk ) &&
        ( TARGET.Memory >= RequestMemory ) && ( TARGET.Cpus >= RequestCpus ) &&
        ( TARGET.HasFileTransfer )

    Your job defines the following attributes:

        DESIRED_usage_model = "OPPORTUNISTIC"
        DesiredOS = "SL6"
        JobUniverse = 5
        RequestCpus = 1
        RequestDisk = 35000000
        RequestMemory = 2000

    The Requirements expression for your job reduces to these conditions:

             Slots
    Step    Matched  Condition
    -----  --------  ---------
    [0]       25104  Arch == "X86_64"
    [3]       25104  target.IS_Glidein == true
    [6]       25104  stringlistimember(Target.IFOS_installed,DesiredOS)
    [9]        4930  stringListsIntersect(toUpper(target.HAS_usage_model),toUpper(my.DESIRED_usage_model))
    [13]      24552  TARGET.Disk >= RequestDisk
    [14]       4659  [9] && [13]

    Suggestions:

        Condition                         Machines Matched    Suggestion
        ---------                         ----------------    ----------
    1   ( ( Arch == "X86_64" ) || ( Arch == "INTEL" ) )   0                   REMOVE
    2   ( stringListsIntersect(toUpper(target.HAS_usage_model),"OPPORTUNISTIC") )   4930
    3   ( TARGET.Disk >= 35000000 )       24552
    4   ( TARGET.Memory >= 2000 )         24987
    5   ( target.IS_Glidein == true )     25104
    6   ( "SL6" is NULL || stringlistimember(Target.IFOS_installed,"SL6") )   25104
    7   ( TARGET.OpSys == "LINUX" )       25104
    8   ( TARGET.Cpus >= 1 )              25104
    9   ( TARGET.HasFileTransfer )        25104

    -- Schedd: fifebatch1.fnal.gov : <131.225.67.102:9615?addrs=131.225.67.102-9615&noUDP&sock=27353_cc3f>
    $
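The quickest thing to read in this output is the run analysis summary: in the example, 20445 of 25104 machines are rejected by the job's own requirements, 4659 reject the job, and 0 are available. A script can pull those numbers out as sketched below in Python. The phrases matched are copied from the example output above and are assumed to be stable across HTCondor versions; `parse_run_analysis` and its dictionary keys are hypothetical names.

```python
import re

# Dictionary key -> phrase that follows the count in the summary.
_PHRASES = {
    "rejected_by_job": "are rejected by your job's requirements",
    "rejected_by_machine": "reject your job because of their own requirements",
    "running_your_jobs": "match and are already running your jobs",
    "serving_others": "match but are serving other users",
    "available": "are available to run your job",
}


def parse_run_analysis(text):
    """Extract the machine counts from a -better-analyze run analysis summary."""
    total = re.search(r"Of (\d+) machines,", text)
    if total is None:
        raise ValueError("no run analysis summary found")
    counts = {"machines": int(total.group(1))}
    for key, phrase in _PHRASES.items():
        found = re.search(r"(\d+)\s+" + re.escape(phrase), text)
        counts[key] = int(found.group(1)) if found else None
    return counts
```

A result with `available == 0` and a large `rejected_by_job` count, as in the example, suggests looking at the job's requirements first; the conditions table shows which one matches the fewest slots.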