Fife launch Reference

fife_launch is a job launcher; a config-file based front-end to jobsub_submit.

It is invoked with:

  • a -c config-file option (required),
  • zero or more -Osection.var=value override flags, which can override any setting in the config file, and
  • optionally a --stage something flag to pick an overrides stanza in the config.
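
For example, a typical launch might look like the following (the config file name, override, and stage name are hypothetical):

fife_launch -c myexp_reco.cfg -Osubmit.N=10 --stage reco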

The config options are grouped into sections, whose options will be discussed below. A minimal config must have at least
the following sections:

  • global
  • executable
  • submit

You may want to see hints on specific topics like:

variable expansion

If you use environment variables in your config file, they can be expanded in one of four places:
  • at launch time, on the jobsub_submit commandline -- $VARIABLE
  • in the job when fife_wrap is invoked -- \$VARIABLE
  • within the script or dataset declaration -- \\\$VARIABLE
  • in a file copy-out etc. -- \\\\\\\$VARIABLE (yes that is 7 backslashes)

You need the latter form if you want to use, for example, an environment variable that comes from a setup action, prescript, or postscript.

Note that if you are specifying this in an override on the command line, this moves up another layer of escaping,
and to get the last layer, you need (shudder) 15 backslashes; i.e.

-Oexecutable.arg_6=output_of_\\\\\\\\\\\\\\\${fname}

Note that (as of v3_3 and later) the environment variables that fife_wrap sets in the multifile loop -- fname for the filename, furi for the file uri, and nthfile for the number of input files -- have %(_fname)s, %(_furi)s, and %(_nthfile)s shorthands in the config, which expand to \\\\\\\${fname} etc.
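
For example, assuming a hypothetical processing script run in multifile mode, the shorthand can be used to build a per-file output name without counting backslashes:

[executable]
name = process.sh
arg_1 = -o
arg_2 = output_of_%(_fname)s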

global section

  • includes = file1 file2 ... -- other config files to include and merge with this one
  • group = jobsub-group-name -- group name for jobsub_submit, etc.
  • xxx = yyy -- macro for replacement anywhere in file
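
A minimal sketch of a [global] stanza, assuming a hypothetical group name and a release macro; as with the %(_fname)s shorthand above, macros can be referenced elsewhere in the file with %(name)s substitution:

[global]
group = myexp
release = v1_2_3

[job_setup]
setup = myexpsoft %(release)s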

env_pass section

  • xxx = yyy -- specific environment variable and value to pass in with -e option to jobsub.

[You can actually put the -e arguments in the submit section, but this is more readable.]
There are too many environment variables you might want to send to list them all here, but particularly when debugging new jobs we recommend:

IFDH_DEBUG=1
IFDH_CP_MAXRETRIES=3
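
In config-file form that would look like:

[env_pass]
IFDH_DEBUG = 1
IFDH_CP_MAXRETRIES = 3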

executable section

Primary command your job is going to run.

  • name = executable-name -- primary executable to run
  • arg_1 = first-argument
  • arg_2 = second-argument
  • ...

executable_1, executable_2, ... sections

Allows you to specify further executables to run -- e.g. if you want to run a simulation and reco in the same job

  • name = executable-name -- nth executable to run
  • arg_1 = first-argument
  • arg_2 = second-argument
  • ...
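
For example, a hypothetical pair of stanzas (the executable names are made up) that runs a simulation step followed by a reconstruction step in the same job:

[executable]
name = run_sim.sh
arg_1 = -c
arg_2 = sim.cfg

[executable_1]
name = run_reco.sh
arg_1 = -c
arg_2 = reco.cfg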

job_output section

  • outputlist = filename -- "ifdh cp -f " file for copying back results
  • addoutput = pattern -- pattern to give to "ifdh addOutputFile"
  • dest = path -- destination directory for copyBackOutput
  • hash = n -- number of filename-hash subdirectories to make in dest
  • rename = how -- argument for "ifdh renameOutput"
  • declare_metadata = True -- declare metadata for output files to SAM
  • metadata_extractor = command -- add a metadata extractor command
  • filter_metadata = x,y,z -- comma-separated list of metadata fields to drop before declaring (i.e. to remove fields the metadata extractor found)
  • add_metadata = x=y -- add additional metadata variables to declared metadata
  • add_to_dataset = name -- add a metadata dataset.tag value to files, and declare as a dataset. A special shorthand "_poms_task" can be given for the name, which will be replaced with poms_depends_${POMS_TASK_ID}_1 for the input dataset for a dependent stage.
  • dataset_exclude = glob -- exclude files matching glob from dataset
  • add_location = True -- add a location for file after copying it to dest

You can specify add_metadata_1, add_metadata_2, etc. to give multiple metadata values.
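
For example, a hypothetical [job_output] stanza that copies ROOT files back and declares their metadata (the destination path and metadata value are made up):

[job_output]
addoutput = *.root
dest = /pnfs/myexp/scratch/outputs
declare_metadata = True
add_metadata = data_tier=reconstructed
add_location = True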

job_output_1, job_output_2,... sections

Same as above, allowing different file types to be handled separately.

job_setup section

These are all flags passed to the fife_wrap wrapper script in your job.

  • inputfile = path-to-file -- file to copy into job
  • inputtar = path-to-tarfile -- tarfile to copy into job and unpack
  • export = name=value -- environment variable to set
  • setup = setup-args -- ups setup command to run
  • setup_local = true -- setup packages in the INPUT_TAR area (see --tar_file_name in jobsub_submit to get it there)
  • source = path-to-file -- (bash) script to source
  • prescript = path-to-file -- script to run before main executable (runs in a separate subshell)
  • postscript = path-to-file -- script to run after main executable (runs in a separate subshell)
  • mix = path-to-file -- script to run on input files (using ${fname} in the environment) before running executable
  • with_gdb = True -- run executable(s) under gdb and print a stack trace

In addition you can specify one of the following flags, which control multiple-file/SAM consumer behavior:

  • ifdh_art = True -- use an ifdh_art mainline -- start SAM consumer, executable will do getNextFile loop internally
  • multifile = True -- use the multifile mainline -- start SAM consumer, run the getNextFile loop, and run the executable once per file with the filename appended to the end of the executable line
  • getconfig = true -- like multifile but the file is given with -c filename as a config file
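
For example, a minimal sketch of a [job_setup] stanza that sets up one UPS product (ifdhc here, purely as an illustration) and runs the main executable once per input file:

[job_setup]
setup = ifdhc
multifile = True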

You can append a config item with _0, _1, _2.. to specify multiple options of --setup or --source
e.g.

[job_setup]
setup_1 = package2 -q quals
setup_2 = package3

sam_consumer section

Specify options for the establishConsumer call if using ifdh_art/multifile/getconfig options above

  • limit = n -- limit number of files to send this consumer
  • schema = x -- file delivery schema(s) preferred {root (streaming),gsiftp,..}
  • appname = name -- application name to associate with consumer
  • appfamily = name -- application family to associate with consumer
  • appvers = version -- application version to associate with consumer
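
A hypothetical [sam_consumer] stanza preferring streaming (root) delivery; the application values are made up:

[sam_consumer]
limit = 1
schema = root
appname = demo_reco
appfamily = demo
appvers = v1_0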

submit section

Any arguments to jobsub_submit (minus the leading dashes)

  • group = groupname -- for --group=groupname
  • resource_provides = -- usually: usage_model=OFFSITE,OPPORTUNISTIC,DEDICATED
  • site = sitelist -- list sites to restrict to
  • expected-lifetime = -- expected run time of job
  • disk = -- disk required
  • memory = -- available memory required
  • role = -- VOMS Role for priorities/accounting
  • N = -- number of jobs
  • lines = -- line to append to condor submit file
  • dataset = -- Submit SAM start/end project DAG on this dataset

Run "jobsub_submit --group=group --help" to get a full list.

But also:

  • n_files_per_job = -- divide count of files in dataset by this to get -N value (new in v3_2_3)
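
For example, a hypothetical [submit] stanza (the values are illustrative, not recommendations):

[submit]
group = myexp
resource_provides = usage_model=OFFSITE,OPPORTUNISTIC,DEDICATED
expected-lifetime = 8h
memory = 2000MB
dataset = my_input_dataset
n_files_per_job = 5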

You can append a flag-name with _0, _1, _2.. to specify multiple options of --lines or -f, etc.
e.g.

[submit]
f_0 = file1
f_1 = file2
f_2 = file3

stage_xxx sections

In the stage_xxx sections you can specify overrides to the other parameters that will take effect if --stage xxx is passed on the command line, i.e.

  • section.var = value

will override the value for var in section [section]. For example, if you wanted to make a stage "middle" that differed only in passing a different config file, you could have:

[executable]
name = fred
arg_1 = -c
arg_2 = usual.cfg

[stage_middle]
executable.arg_2 = different.cfg
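
You would then select that stage on the command line with something like (the config file name is hypothetical):

fife_launch -c mycfg.cfg --stage middle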

poms_campaign_layer section

Deprecated. This is used if you're launching jobs yourself but reporting them to POMS, and is buggy in several releases.

Params to poms_client.register_poms_campaign:

  • campaign_name = name -- name of the campaign in POMS (will be created if needed)
  • experiment = name -- experiment name
  • version = xxx -- software version, should correspond to application version in SAM consumers
  • dataset = overall_dataset -- dataset campaign will process
  • user = -- username to associate with campaign
  • campaign_definition = -- job type for POMS
  • test = -- True for submitting to POMS test instance instead.

poms_get_task_id section

Deprecated. This is used if you're launching jobs yourself but reporting them to POMS, and is buggy in several releases.

Params to poms_client.get_task_id_for(). The campaign will default to the one returned
by register_poms_campaign() if you specified a [poms_campaign_layer] section; otherwise
you must explicitly set a campaign.

  • campaign = name_or_id -- name or id number of campaign
  • experiment = name -- experiment name of task -- limits search for campaign, job-type
  • user = username -- user to associate with submission
  • command_executed = command_line -- defaults to jobsub_submit line built by fife_launch
  • input_dataset = name -- dataset to list as input dataset
  • parent_task_id = id -- task id for parent, defaults to $POMS_PARENT_TASK_ID
  • test = True -- set to True to register with test instance, not production