Draft list of modules needed in job submission scripts


All of these need to handle MC production, data production and analysis

Need to work both for users and for official production

All of these need better error handling

Create a new CVS package for the new submission scripts

Support multiple inputs -> single output

Use sam metadata to track parentage and to create the output metadata
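
A minimal sketch of what "multiple inputs -> single output" could look like on the metadata side: the output record lists every input file as a parent. The field names (file_name, data_tier, parents, event_count) are assumptions loosely modelled on sam-style JSON, and build_output_metadata is a hypothetical helper, not an existing function.

```python
import json


def build_output_metadata(output_name, output_tier, input_metadata):
    """Combine several input-file metadata dicts into one output record.

    Field names are assumptions for illustration, not a confirmed schema.
    """
    # Every input file becomes a parent of the single output file.
    parents = [{"file_name": m["file_name"]} for m in input_metadata]
    return {
        "file_name": output_name,
        "data_tier": output_tier,
        "parents": parents,
        # Sum the event counts; inputs without a count contribute zero.
        "event_count": sum(m.get("event_count", 0) for m in input_metadata),
    }


# Hypothetical input records standing in for metadata fetched from sam.
inputs = [
    {"file_name": "in_0001.root", "event_count": 100},
    {"file_name": "in_0002.root", "event_count": 250},
]
meta = build_output_metadata("merged_0001.root", "reco", inputs)
print(json.dumps(meta, indent=2))
```

Keeping this in one function would let the json metadata creator module reuse it for both merged and one-to-one outputs.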


Collect the dag logfiles, condor logfiles and gaudi logfiles in one place

Connect to a jobs database that maps sam projects, condor job numbers and our submissions. This makes it much easier to find logs and track outcomes.
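
As a rough illustration of the jobs database idea, here is a sketch using Python's built-in sqlite3 module. The table layout, column names and the helper functions (open_jobs_db, record_job, find_logs) are all assumptions made for the example; the real database technology and schema are still to be decided.

```python
import sqlite3


def open_jobs_db(path=":memory:"):
    """Open (or create) the jobs database. The schema is a hypothetical example."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS jobs (
               submission_id TEXT NOT NULL,
               sam_project   TEXT,
               condor_job_id TEXT,
               log_dir       TEXT,
               status        TEXT DEFAULT 'submitted'
           )"""
    )
    return conn


def record_job(conn, submission_id, sam_project, condor_job_id, log_dir):
    """Store the link between our submission, the sam project and the condor job."""
    conn.execute(
        "INSERT INTO jobs (submission_id, sam_project, condor_job_id, log_dir) "
        "VALUES (?, ?, ?, ?)",
        (submission_id, sam_project, condor_job_id, log_dir),
    )
    conn.commit()


def find_logs(conn, condor_job_id):
    """Return the log directories recorded for a given condor job number."""
    rows = conn.execute(
        "SELECT log_dir FROM jobs WHERE condor_job_id = ?", (condor_job_id,)
    ).fetchall()
    return [row[0] for row in rows]


# Made-up identifiers, for illustration only.
conn = open_jobs_db()
record_job(conn, "sub-001", "example_project", "12345.0", "/logs/sub-001")
print(find_logs(conn, "12345.0"))
```

A single table like this is enough to answer "where are the logs for condor job N?"; tracking outcomes would only need updates to the status column.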

Modules that we can identify so far

  1. top level script that user calls (replaces ProcessXX)
  2. generic jobs that run on the remote machine - for example, are preamble/appendix hooks needed for dags?
  3. Object which defines a processing step - example would be reco on a file
      <input files definition>  <input-tier> <output-tier> <release> <opts-template> <calibration options> <io options> <kludge options>
  4. Implementation of processing chain using dag - example would be MC --cal --minos --reco --ana
    - is the single processing step a trivial dag?
    - can we define standard chains and allow adding new ones?
  5. Job options - reads the command line and creates a dictionary that our scripts know how to use - ana_scripts is an example - migrate to jobsub-style arguments where possible
  6. single filename parser/creator script to create and understand filenames/paths.
    Use the < >< > format that fts uses? Borrow the parser?
  7. Gaudi options file modifier script that applies the modifications generically, instead of hard-coding them in each script
  8. json metadata creator - MergeMeta as example?
  9. logfile parser/stripper - count the important things, remove identified junk messages, "we're looking at you GEANT"
  10. playlist definitions - must interface to samweb - design document: runperioddefiner - Heidi and Jeremy
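
Items 3 and 4 above could be sketched roughly as follows. The ProcessingStep fields follow the list given under item 3; the class name, the chain_edges helper, the tier names and the placeholder values are all assumptions for illustration. In this sketch a single processing step is simply a chain with no edges, which is one possible answer to the "trivial dag" question.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ProcessingStep:
    """One processing step, e.g. reco on a file (hypothetical layout)."""
    name: str
    input_tier: str
    output_tier: str
    release: str
    opts_template: str
    input_files: List[str] = field(default_factory=list)
    calibration_options: Dict[str, str] = field(default_factory=dict)
    io_options: Dict[str, str] = field(default_factory=dict)
    kludge_options: Dict[str, str] = field(default_factory=dict)


def chain_edges(steps: List[ProcessingStep]) -> List[Tuple[str, str]]:
    """Return (parent, child) dag edges for a linear chain of steps.

    Raises ValueError if one step's output tier does not feed the next step.
    """
    edges = []
    for parent, child in zip(steps, steps[1:]):
        if parent.output_tier != child.input_tier:
            raise ValueError(
                f"{parent.name} writes {parent.output_tier!r} "
                f"but {child.name} reads {child.input_tier!r}"
            )
        edges.append((parent.name, child.name))
    return edges


# Placeholder release and template names, for illustration only.
cal = ProcessingStep("cal", "raw", "cal", "release-x", "cal_opts.template")
reco = ProcessingStep("reco", "cal", "reco", "release-x", "reco_opts.template")
ana = ProcessingStep("ana", "reco", "ana", "release-x", "ana_opts.template")

print(chain_edges([cal, reco, ana]))  # edges of a standard chain
print(chain_edges([reco]))            # single step: no edges
```

Standard chains could then be plain lists of steps kept in one place, with new chains added by defining another list.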