Pubs and

The pubs script that interacts with is called, and is located in subdirectory dstream_prod of the pubs product.

Pubs status codes

A large number of status codes are used to manage multistage projects, which are executed on offline computing resources.
Pubs status codes are two-digit numbers, xy, where the first digit x represents the stage and the second digit y represents the processing status within that stage. This convention implies a maximum of ten stages and ten states per stage. So far, these limits have not been exceeded.
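
As a minimal sketch of the two-digit convention, a status can be split into its stage digit and its state digit with integer division. The function name split_status is hypothetical and is not part of the pubs code.

```python
def split_status(code):
    """Split a two-digit pubs status xy into (stage digit x, state digit y).

    Hypothetical helper for illustration only; it covers the regular
    two-digit statuses and rejects anything outside 0..99.
    """
    if not 0 <= code <= 99:
        raise ValueError(f"not a two-digit pubs status: {code}")
    return divmod(code, 10)


print(split_status(34))  # (3, 4): stage digit 3, state digit 4
```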

Stage status codes

Value  Symbolic name   Description
0      kDONE           Stage is complete
1      kINITIATED      Stage is started (ready to submit batch job)
3      kSUBMITTED      Batch job has been submitted
4      kRUNNING        Batch job is running
5      kFINISHED       Batch job completed (ready to check output)
6      kTOBERECOVERED  Check failed (ready to submit recovery job)
7      kREADYFORSAM    Check succeeded (ready to declare to sam)
8      kDECLARED       Declared to sam (ready to upload to enstore)
9      kSTORED         Successfully copied to FTS dropbox (location not verified)

The above values and symbolic names are hard coded in

In normal production, stage statuses advance in sequence: 1, 3, 4, 5, 7, 8, 10, 11 (next stage), ....
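
The stage status table can be mirrored in Python as an enumeration, which makes a numeric status readable. This is only an illustrative sketch: the class name StageStatus and the function describe are hypothetical, and the values are copied from the table above rather than from the pubs source.

```python
from enum import IntEnum


class StageStatus(IntEnum):
    """Per-stage states, copied from the table above (value 2 is not listed)."""
    kDONE = 0
    kINITIATED = 1
    kSUBMITTED = 3
    kRUNNING = 4
    kFINISHED = 5
    kTOBERECOVERED = 6
    kREADYFORSAM = 7
    kDECLARED = 8
    kSTORED = 9


def describe(code):
    """Return (stage digit, symbolic state name) for a two-digit status."""
    stage_digit, state = divmod(code, 10)
    # StageStatus(state) raises ValueError for a state not in the table.
    return stage_digit, StageStatus(state).name


print(describe(13))  # (1, 'kSUBMITTED')
```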

MC chain stages.

Value  Name      Description
0      gen       Generator
10     g4        Geant4
20     detsim    Detector simulation
30     reco1     Stage 1 reconstruction
40     reco2     Stage 2 reconstruction
50     mergeana  Merge + analysis

Stage names and values are defined as resources in the project configuration.
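
Assuming a full pubs status is the stage value plus the state within the stage (which is what the two-digit convention suggests), the MC chain table can be used to build and decode statuses. The dictionary MC_STAGES and both function names are hypothetical; in practice the stage values come from the project configuration, not from a hard-coded dictionary.

```python
# Stage values copied from the MC chain table above (illustration only).
MC_STAGES = {
    "gen": 0,
    "g4": 10,
    "detsim": 20,
    "reco1": 30,
    "reco2": 40,
    "mergeana": 50,
}


def full_status(stage_name, state):
    """Combine a stage name and a state digit (0-9) into a pubs status."""
    if not 0 <= state <= 9:
        raise ValueError(f"state must be a single digit: {state}")
    return MC_STAGES[stage_name] + state


def stage_of(code):
    """Return the stage name for a two-digit pubs status."""
    stage_value = (code // 10) * 10
    for name, value in MC_STAGES.items():
        if value == stage_value:
            return name
    raise KeyError(f"no stage with value {stage_value}")


print(full_status("reco1", 5))  # 35
print(stage_of(35))             # reco1
```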

Special pubs statuses

There are some special pubs statuses that do not conform to the above two-digit convention.

Value  Name      Description
100    Dead end  No further processing needed for this (run, subrun, seq, version). Not an error.
>1000  Error     Too many resubmissions for this (run, subrun, seq, version).

logic

On each invocation of for a given project, the main function (process) has four nested loops over the following indices:

  1. Status within stage (high to low).
  2. Stage (high to low).
  3. Run.
  4. Subrun.

According to this logic, the special statuses (dead end and error) are never reached, so no further processing happens for them. For each regular status, queries which (run, subrun) pairs are at this status through the pubs database API, loops over the resulting runs and subruns, and performs the appropriate action for that status. The appropriate action often involves an invocation of via the Python interface (import project). The invocation of is done in such a way that, were it done through the command line interface, it would be equivalent to invoking with command line option "--pubs <run> <subrun> <version>".
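
The loop structure described above can be sketched as a generator that yields (status, run, subrun) in visiting order. This is a rough illustration under several assumptions: the list order is taken as outermost to innermost, runs and subruns are visited in ascending order, and the dictionary pairs_at_status stands in for the pubs database query. None of these names come from the pubs code.

```python
def visit_order(states, stage_values, pairs_at_status):
    """Yield (status, run, subrun) in the order four nested loops would visit.

    states          -- state digits within a stage, e.g. [1, 3, 4, 5, 7, 8]
    stage_values    -- stage values, e.g. [0, 10, 20, 30]
    pairs_at_status -- hypothetical stand-in for the database query:
                       maps status -> list of (run, subrun) pairs
    """
    for state in sorted(states, reverse=True):            # 1. state, high to low
        for stage in sorted(stage_values, reverse=True):  # 2. stage, high to low
            status = stage + state
            by_run = {}
            for run, subrun in pairs_at_status.get(status, []):
                by_run.setdefault(run, []).append(subrun)
            for run in sorted(by_run):                    # 3. run
                for subrun in sorted(by_run[run]):        # 4. subrun
                    yield status, run, subrun


# Statuses 100 and 1001 are special, so the loops never generate them.
db = {35: [(1, 2), (1, 1)], 13: [(2, 0)], 100: [(3, 0)], 1001: [(4, 0)]}
for item in visit_order([1, 3, 4, 5, 7, 8], [0, 10, 20, 30], db):
    print(item)
```

Because every status is built as a stage value plus a state digit, the special values never appear in the iteration, which matches the statement that they are never reached.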

A recently added capability of and is the ability to process multiple subruns in a single "action" (e.g. a single invocation of). Currently, the only action that has a working multiple-subrun interface is job submission. The multiple-subrun job submission interface works for both file list and sam dataset input.

pubs mode.

When is invoked with option "--pubs <run> <subrun> <version>" (or equivalent via the Python interface), one says that is being invoked in pubs mode. Pubs mode modifies the behavior of, as compared to a standalone project, both with respect to where gets input files, and where stores output files.

Multiple subruns can be specified on the command line by giving multiple subruns or subrun ranges, separated by commas and hyphens, with no embedded spaces.
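
To illustrate the comma-and-hyphen syntax, here is a minimal parser sketch. It assumes that "a-b" means the inclusive range from a to b, so a specification such as 1,3-5,8 expands to subruns 1, 3, 4, 5 and 8. The function parse_subruns is hypothetical and is not the parser used by the actual command line interface.

```python
def parse_subruns(spec):
    """Expand a subrun specification like "1,3-5,8" into a list of ints.

    Assumes "a-b" is an inclusive range; embedded spaces are rejected.
    """
    if " " in spec:
        raise ValueError("subrun specification must not contain spaces")
    subruns = []
    for part in spec.split(","):
        if "-" in part:
            first, last = part.split("-")
            subruns.extend(range(int(first), int(last) + 1))
        else:
            subruns.append(int(part))
    return subruns


print(parse_subruns("1,3-5,8"))  # [1, 3, 4, 5, 8]
```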

Effect of pubs mode on output files.

Pubs mode affects where stores output files by adding subdirectories <version>/<run>/<subrun> to the output directory, log directory, and work directory specified in the project xml file.
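
The directory layout can be sketched as a simple path join. The base directory below is a made-up example, and the sketch assumes that run, subrun and version appear as plain numbers without padding; the real formatting may differ.

```python
from pathlib import PurePosixPath


def pubs_dir(base, version, run, subrun):
    """Append <version>/<run>/<subrun> to a directory from the project xml.

    Hypothetical helper; number formatting (no zero padding) is an assumption.
    """
    return PurePosixPath(base) / str(version) / str(run) / str(subrun)


# Example with an invented output directory.
print(pubs_dir("/data/out", 1, 5000, 12))  # /data/out/1/5000/12
```

The same construction would apply to the log directory and the work directory.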

Effect of pubs mode on input files.

Pubs mode affects where expects to find input files, depending on how input is specified in the xml file.

  • If a stage has no input (e.g. generator stage), pubs mode has no effect on input.
  • In case input is being daisy-chained from disk from a previous stage, pubs mode causes to construct a reduced input file list, assuming the previous stage has a pubs directory structure.
  • In case input is from sam, pubs mode will cause to generate a new sam dataset definition with additional run and subrun constraints.
  • Pubs input mode is not supported, and does not make sense, for single-file input.
  • Pubs input mode can be defeated by specifying "<pubsinput>0</pubsinput>" in the xml file for a stage. Then will get input files from wherever it would have got them if it were not using pubs mode, e.g. a file list or a sam dataset.
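
As an illustration of the last point, the sketch below reads the pubsinput element from a stage fragment and decides whether pubs input mode is active. The enclosing stage element and its name attribute are assumptions about the xml layout, the function name is hypothetical, and the default (enabled when the element is absent) is inferred from the statement that a value of 0 defeats it.

```python
import xml.etree.ElementTree as ET


def pubs_input_enabled(stage_xml):
    """Return False if the stage fragment contains <pubsinput>0</pubsinput>.

    Assumes pubs input mode is on by default when the element is absent.
    """
    stage = ET.fromstring(stage_xml)
    text = stage.findtext("pubsinput")
    if text is None:
        return True
    return int(text.strip()) != 0


# Hypothetical stage fragments for illustration.
disabled = '<stage name="reco1"><pubsinput>0</pubsinput></stage>'
default = '<stage name="reco1"></stage>'
print(pubs_input_enabled(disabled))  # False
print(pubs_input_enabled(default))   # True
```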