Initial Workflow Setup

So let's say you're starting up the hypothetical new "hypot" experiment, and you want to
configure POMS to start doing your Monte Carlo production.

You want to generate Monte Carlo events, simulate them going through the detector, and then
run your reconstruction software. In the hypot experiment software, you can do this
interactively in a few simple steps:

fake_eventgen.exe xxx.fcl      # generates event root files into gen_xxx.root
fake_sim.exe  gen_xxx.root     # simulates the generated events into sim_xxx.root
fake_reco.exe sim_xxx.root     # reconstructs the simulated data into reco_xxx.root

However, as each of these phases takes significant time, and we want to generate lots of
Monte Carlo data, we need to run multiple batches, with each stage in its own jobs.
Additionally, we have multiple different event mixes we want to run (a different fcl file
for the eventgen phase). In particular, we need to do some with hypot_initial_set1.fcl,
some with hypot_initial_set2.fcl, and some with hypot_initial_set3.fcl.

We'll assume that the executables above also generate metadata for SAM, which we can extract and declare.
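Later on we'll set metadata_extractor = json in our launch configs, which assumes each output file has JSON metadata alongside it that can be extracted and declared to SAM. The exact schema is experiment-specific; here is a hedged sketch of reading and sanity-checking such a sidecar file, with purely illustrative field names (not the real SAM schema):

```python
import json

# Hypothetical sidecar metadata, as fake_eventgen.exe might emit it
# alongside gen_xxx.root.  Field names here are illustrative only.
sidecar = '''
{
  "file_name": "gen_xxx.root",
  "file_type": "mc",
  "data_tier": "generated",
  "application": {"name": "fake_eventgen", "version": "v0_1"}
}
'''

def extract_metadata(text):
    """Parse sidecar JSON and verify the fields a declaration would need."""
    md = json.loads(text)
    for field in ("file_name", "file_type", "data_tier"):
        if field not in md:
            raise ValueError("missing metadata field: " + field)
    return md

md = extract_metadata(sidecar)
print(md["file_name"])   # gen_xxx.root
```

As long as the executables write something like this next to each output file, the job wrapper can extract it and declare the file without knowing anything about the physics content.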

Things we'll need to configure:

We're going to build in POMS a "Campaign" of three "Campaign Stages" in a chain, with the first stage working through the list of set1, set2, and set3 tags to put into our generation name.
We're also going to configure two JobTypes: one that handles our initial generation phase and declares our output files to SAM, and a second that processes an input dataset and generates output files which are also declared to SAM. Finally, we need a piece that tells POMS where we log in and start jobs. In the current version of POMS we build these from the bottom up.

Job launch configs

We're going to use the fife_launch/fife_wrap utility from the fife_utils package to do our job launches.
We basically need two cases: one that runs without an input dataset but declares our output files for POMS,
and another that uses the consuming-a-dataset method.

gen.cfg

[global]
group      = hypot
experiment = hypot
release    = v0_1
pass       = part1
wrapper = file:///${FIFE_UTILS_DIR}/libexec/fife_wrap

[env_pass]
IFDH_DEBUG = 1
VERSION=%(release)s

[submit]
G          = %(group)s
N          = 5
resource-provides      = usage_model=OPPORTUNISTIC,DEDICATED,OFFSITE
generate-email-summary = True
expected-lifetime      = 12h
memory                 = 2000MB

[job_setup]
find_setups = True
source_1    = /cvmfs/hypot.opensciencegrid.org/software/setup_hypot.sh
setup_1     = hypot %(release)s

[executable]
name        = fake_eventgen.exe
arg_1       = conf_initial_%(pass)s.fcl

[job_output]
addoutput   = gen*.root*
rename      = unique
dest        = /pnfs/hypot/scratch/workarea
# declare metadata and locations so next phase can run
declare_metadata   = True
metadata_extractor = json
add_to_dataset     = _poms_task
add_locations      = True

process.cfg

[global]
group      = hypot
experiment = hypot
release    = v0_1
sam_dataset = test_reco
wrapper = file:///${FIFE_UTILS_DIR}/libexec/fife_wrap

[env_pass]
IFDH_DEBUG = 1
VERSION=%(release)s

[submit]
G          = %(group)s
dataset    = %(sam_dataset)s
N          = 5
resource-provides      = usage_model=OPPORTUNISTIC,DEDICATED,OFFSITE
generate-email-summary = True
expected-lifetime      = 12h
memory                 = 4000MB

[job_setup]
find_setups = True
setup_1     = ifdhc
multifile   = True

[sam_consumer]
limit       = 1
appvers        = %(release)s

[executable]
name        = fake_reco.exe 

[job_output]
addoutput   = *.root*
rename      = unique
dest        = /pnfs/hypot/scratch/workarea
declare_metadata   = True
metadata_extractor = json
add_locations      = True

Login and setup

We first login to POMS, and choose from the left menu Configure Launch Templates.

We fill in:

  • hostname (hypotgpvm01.fnal.gov)
  • user (hypotpro) and
  • setup info:
    • setting X509_USER_PROXY to our production managed proxy
    • setting up fife_utils to get access to the fife_launch script.

We'll be able to use this for all of our job launching needs.

Generation job type

We pick Configure Job Type from the left menu, and hit Add.

This will be a job type using our gen.cfg file above.

  • a name: Generation
  • what our output files look like (with database % wildcards): gen%.root
  • our launch script fife_launch
  • Then we add some parameters for our launch script
    • -c /path/to/our/gen.cfg
    • -Oglobal.release=%(version)s

This last option lets us pass in the version set in our Campaign Stages to the job.
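The output-file pattern uses SQL-style wildcards (% matches any run of characters, _ a single character) because POMS matches file names against it in its database. A hedged sketch of the equivalent matching in Python, with the LIKE-to-regex translation as our own assumption:

```python
import re

def sql_like_to_regex(pattern):
    """Translate a SQL LIKE pattern ('%' and '_') into an anchored regex."""
    out = ""
    for ch in pattern:
        if ch == "%":
            out += ".*"          # '%' matches any run of characters
        elif ch == "_":
            out += "."           # '_' matches exactly one character
        else:
            out += re.escape(ch)
    return re.compile("^" + out + "$")

matcher = sql_like_to_regex("gen%.root")
print(bool(matcher.match("gen_xxx.root")))   # True
print(bool(matcher.match("sim_xxx.root")))   # False
```

So gen%.root picks up the generation outputs, and leaves the sim and reco files to the later stages.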

Process job type

We pick Configure Job Type from the left menu, and hit Add.

This will be our general file processing job type, using our process.cfg file

  • a name: Processing
  • what our output files look like (with database % wildcards): %.root
  • our launch script fife_launch
  • Then we add some parameters for our launch script
    • -c /path/to/our/process.cfg
    • -Oglobal.release=%(version)s

This last option lets us pass in the version set in our Campaign Stages to the job.
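The -O parameters follow a section.key=value naming scheme that maps directly onto the config file sections above. A sketch of how such an override could be applied to a parsed config, purely illustrative of the naming scheme (not fife_launch's actual implementation):

```python
from configparser import ConfigParser

def apply_override(cfg, override):
    """Apply one '-O' style override of the form section.key=value."""
    spec, value = override.split("=", 1)
    section, key = spec.split(".", 1)
    if not cfg.has_section(section):
        cfg.add_section(section)
    cfg[section][key] = value

cfg = ConfigParser()
cfg.read_string("[global]\nrelease = v0_1\n")
apply_override(cfg, "global.release=v0_2")
print(cfg["global"]["release"])   # v0_2
```

This is what lets one cfg file serve several Campaign Stages: each stage just overrides the handful of keys that differ.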

Generation Campaign Stage

This will be the first stage

  • Name MC_Generation
  • VO Role (for jobsub commands)
  • State : Active
  • Version
  • Dataset: paramset1,paramset2,paramset3
  • Dataset split type: list
  • Completion Type: Completed
  • Completion % 95
  • Job Type: Generation
  • Launch Template: our only one
  • parameter overrides:
    • -Oglobal.pass=%(dataset)s

This last one takes a bit of explanation: basically, we are going to use POMS's dataset splitter to work through our list of parameter sets instead of an actual SAM dataset.
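The "list" split type hands the stage one comma-separated entry per launch, substituted in as %(dataset)s; here we're repurposing it to deliver parameter-set names rather than SAM dataset names. A hedged sketch of the round-robin bookkeeping involved (POMS's real implementation persists the index between launches; here it is just passed in by hand):

```python
def list_splitter(dataset_field, last_index=None):
    """Return (next item, new index) from a comma-separated 'dataset' field.

    Mimics a 'list' split type: each call yields the next entry,
    wrapping around at the end of the list.
    """
    items = dataset_field.split(",")
    idx = 0 if last_index is None else (last_index + 1) % len(items)
    return items[idx], idx

item, idx = list_splitter("paramset1,paramset2,paramset3")
print(item)                                              # paramset1
item, idx = list_splitter("paramset1,paramset2,paramset3", idx)
print(item)                                              # paramset2
```

Each launch of the stage therefore runs with the next parameter set substituted into the -Oglobal.pass override.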

Now we're going to save this stage, and re-edit it and set

  • Dependencies: add ourselves, so we loop through the list of "datasets" we're using for parameter sets.

Simulation stage

We can start by cloning (clone icon) our Generation stage and making a few changes:

  • Name MC_Simulation
  • JobType -- Processing
  • Split type None
  • dataset: FromParent
  • parameter overrides
    • -Oexecutable.name=fake_sim.exe
  • dependencies
    • depends on MC_Generation

Reconstruction stage

We can start by cloning (clone icon) our Generation stage and making a few changes:

  • Name MC_Reconstruction
  • JobType -- Processing
  • Split type None
  • dataset: FromParent
  • parameter overrides
    • -Oexecutable.name=fake_reco.exe
  • dependencies
    • depends on MC_Simulation
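Putting it all together, the campaign is a three-stage chain plus a self-dependency on the first stage. A sketch of the wiring as plain data, using the stage names from above (the third stage's name, MC_Reconstruction, is our assumption; the original page does not spell it out), with a check that every dependency refers to a defined stage:

```python
# Stage -> list of stages it depends on, as configured above.
# MC_Generation depends on itself so it cycles through its parameter list;
# the other two stages each consume their parent's declared output.
stages = {
    "MC_Generation":     ["MC_Generation"],
    "MC_Simulation":     ["MC_Generation"],
    "MC_Reconstruction": ["MC_Simulation"],
}

for stage, deps in stages.items():
    for dep in deps:
        assert dep in stages, "unknown dependency: " + dep

print(" -> ".join(["MC_Generation", "MC_Simulation", "MC_Reconstruction"]))
```

Once the chain is saved, each completed generation launch triggers simulation on its output dataset, which in turn triggers reconstruction, while the self-dependency keeps generation stepping through paramset1, paramset2, and paramset3.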