fife_launch / fife_wrap

Overview

fife_launch is a config-file-based job launch script. It calls jobsub_submit and works with fife_wrap to provide an experiment-independent job launching facility: scripts and executables become jobs that work with POMS and SAM just by adding a few lines to the config file. fife_wrap is a replacement for the art_sam_wrap.sh script, and is the part that runs inside the job.

For more details, please see the fife_launch Reference.

Why fife_wrap?

If you have an experiment framework executable, like an Art or Larsoft based executable, it doesn't have everything you need to run an efficient job on the Grid. It can, with the right plugins, participate in a SAM getNextFile loop, but it needs some setup first. And once it completes, you need to get your output files declared to SAM and dropped off somewhere. So fife_wrap provides all the parts you need to turn your experiment executable into a useful grid job:

  • specifying what to source and what to setup to find your software
  • running pre- and post- scripts to set up for or clean up after your executable
  • specifying SAM consumer parameters like application version, number of files, schema, etc.
  • specifying how you'll be running your executable; either
    • the executable runs its own SAM getNextFile loop internally, or
    • fife_wrap runs the getNextFile loop and either
      • runs the executable once per file with the file as input, or
      • runs the executable once per file with the file as a config (.fcl) file
  • specifying how to handle output files
    • renaming files to be unique
    • copying them to a destination directory
    • declaring the files to SAM, with an optional metadata extractor
    • whether it should declare the destination location for the file as well
    • whether it should add the file to a tag-based dataset

Simple example

Let's say we want to run a simple batch job that will do a "printenv" so we can see what the environment looks like when we're running in a batch job. We can configure such a job with a config file for fife_launch like:

[global]
group   = mygroup
wrapper = file:///${FIFE_UTILS_DIR}/libexec/fife_wrap

[env_pass]
MYVAR = 1

[submit]
G                      = %(group)s
resource-provides      = usage_model=OFFSITE,OPPORTUNISTIC,DEDICATED
generate-email-summary = True
expected-lifetime      = 2h
disk                   = 100MB
memory                 = 500MB

[job_setup]

[sam_consumer]

[executable]
name        = /usr/bin/printenv

This is a pretty straightforward Python-ish .ini file. In the [global] section, we specify a group name which we can use anywhere in the rest of the config file, and the wrapper script we want to use, which should almost always be our own fife_wrap wrapper. Next, in the [env_pass] section we specify environment variables to pass into the job, and then in the [submit] section we give the jobsub_submit flags we would like to use (note the use of the group variable we defined in the [global] section). The [job_setup] and [sam_consumer] sections are left empty for this job. Finally, in the [executable] section we specify the command we actually want to run.
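
The way a %(group)s reference resolves can be sketched with Python's standard configparser. This is only an illustration, assuming that fife_launch's interpolation behaves like configparser's and that [global] values are made visible to the other sections; the trimmed config text and the variable names are ours, not part of fife_launch.

```python
import configparser

# Trimmed copy of the config above; only what is needed to show interpolation.
CONFIG_TEXT = """
[global]
group = mygroup

[submit]
G = %(group)s
expected-lifetime = 2h
"""

parser = configparser.ConfigParser()
parser.optionxform = str  # keep option names case-sensitive ("G", not "g")
parser.read_string(CONFIG_TEXT)

# Assumption: every [global] option is usable as %(name)s in other sections.
global_vars = dict(parser.items("global"))
submit = {
    name: parser.get("submit", name, vars=global_vars)
    for name in parser.options("submit")
}
print(submit)
```

Here the G option in [submit] comes out as "mygroup", the value defined once in [global].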

If we put this in a file called 'my.cfg' we can then run:

$ setup fife_utils
$ fife_launch -c my.cfg

And it will roll up a jobsub_submit command line, print it, and run it... and it fails.
Why? "mygroup" is not a valid group for jobsub_submit. So we could edit the file
and change that, or we can just override it on the command line:

$ setup fife_utils
$ fife_launch -c my.cfg -O global.group=$EXPERIMENT

... and now it works. We can override any config file option on the command line, and in particular
if we override an item in the "global" section, the new value is picked up everywhere that item
is referenced as %(name)s in the config file.
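
The effect of a section.option=value override can be sketched as follows. This is a hedged illustration of the semantics only, not fife_launch's own code: apply_overrides is a hypothetical helper, "myexperiment" stands in for the value of $EXPERIMENT, and we again assume configparser-style interpolation with [global] values visible to other sections.

```python
import configparser


def apply_overrides(parser, overrides):
    """Apply 'section.option=value' strings to a ConfigParser in place."""
    for item in overrides:
        target, value = item.split("=", 1)
        section, option = target.split(".", 1)
        if not parser.has_section(section):
            parser.add_section(section)
        parser.set(section, option, value)


parser = configparser.ConfigParser()
parser.optionxform = str  # keep option names case-sensitive
parser.read_string("[global]\ngroup = mygroup\n\n[submit]\nG = %(group)s\n")

# Equivalent in spirit to: -O global.group=myexperiment
apply_overrides(parser, ["global.group=myexperiment"])

# Overrides are applied before interpolation, so every %(group)s sees the new value.
global_vars = dict(parser.items("global"))
resolved_group = parser.get("submit", "G", vars=global_vars)
print(resolved_group)
```

Because the override lands in [global] before any %(group)s is expanded, a single -O is enough to change the value everywhere it is used.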

Now, we could have made a shell script to run jobsub_submit pretty much just as easily, as we used to document for art_sam_wrap.sh. However, such scripts have historically been difficult to maintain: as they get more complicated, they invariably end up needing intricate shell quoting.

A More Useful Example

Now let's suppose we have a set of config files for our experiment executable which will
generate events for future simulation and reconstruction. Using our end-user-SAM scripts,
we can create a dataset for them as described in UserGuide, and then
build a config file to run our executable over those config files and track it in POMS.

[global]
group       = mygroup
experiment  = %(group)s
release     = v0_1
wrapper     = file:///${FIFE_UTILS_DIR}/libexec/fife_wrap
sam_dataset = my_config_dataset

[env_pass]
IFDH_DEBUG = 1

[submit]
G                      = %(group)s
N                      = 5
dataset                = %(sam_dataset)s
resource-provides      = usage_model=OPPORTUNISTIC,DEDICATED
generate-email-summary = True
expected-lifetime      = 10h
disk                   = 8000MB
memory                 = 2000MB

[job_setup]
find_setups = True
setup_1     = ifdhc
source_1    = /cvmfs/%(experiment)s.opensciencegrid.org/externals/setup
setup_2     = %(experiment)s %(release)s
getconfig   = True

[sam_consumer]
limit       = 1
appvers     = %(release)s

[executable]
name        = %(experiment)s.exe

[job_output]
addoutput   = *.root*
dest        = gsiftp://fermicloud045.fnal.gov/home/samdevpro/FTS/dropbox

Now this config file is a bit longer, but has only a few new bits over the last one.

The most interesting bit here is the getconfig = True under [job_setup], together with the
[sam_consumer] section: these have our script run a SAM getNextFile loop and run the
executable once for each config file it is delivered from the dataset. Flags you can specify under
[job_setup] to get SAM integration are:

  • getconfig = True -- specify that the executable should be run in a getNextFile loop with -c filename for each file delivered
  • multifile = True -- specify that the executable should be run in a getNextFile loop with each delivered file as its last parameter
  • ifdh_art = True -- specify that the script will use the ifdh_art file catalog modules to loop through the SAM delivered files internally in one invocation of the executable.
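
The difference between the three flags can be sketched as the command lines each one leads to. This is an illustrative model only: build_commands is a hypothetical helper, the file names are made up, and the list of delivered files stands in for what a real SAM getNextFile loop would hand out one at a time.

```python
def build_commands(mode, executable, delivered_files):
    """Return the argv lists a wrapper would run for the given mode."""
    if mode == "getconfig":
        # One run per delivered file, passed as "-c filename".
        return [[executable, "-c", name] for name in delivered_files]
    if mode == "multifile":
        # One run per delivered file, passed as the last parameter.
        return [[executable, name] for name in delivered_files]
    if mode == "ifdh_art":
        # A single run; the executable loops over SAM files internally.
        return [[executable]]
    raise ValueError(f"unknown mode: {mode}")


files = ["gen_001.fcl", "gen_002.fcl"]  # hypothetical delivered files
for mode in ("getconfig", "multifile", "ifdh_art"):
    print(mode, build_commands(mode, "myexperiment.exe", files))
```

With getconfig and multifile the wrapper owns the loop and starts the executable once per file; with ifdh_art there is a single invocation and the loop happens inside the executable.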

The [job_output] section has the wrapper use ifdh addOutputFiles and ifdh copyBackOutput to
send our output files back at the end of the job.
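
To see which files a pattern like addoutput = *.root* would pick up, here is a small sketch using Python's fnmatch. It assumes the pattern is applied as an ordinary shell-style glob to the files in the job's working directory; the file names are invented for the example.

```python
import fnmatch

# Hypothetical contents of the job's working directory at the end of the run.
job_files = ["events_001.root", "hist.root.tmp", "job.log", "gen_001.fcl"]

# Shell-style glob: anything with ".root" somewhere in the name.
matched = fnmatch.filter(job_files, "*.root*")
print(matched)
```

Note that the trailing * also matches names such as hist.root.tmp, so a tighter pattern may be preferable if the job leaves temporary files behind.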

For more details, please see the fife_launch Reference.