User guide » History » Version 31

Version 30 (Herbert Greenlee, 01/12/2015 03:55 PM) → Version 31/86 (Herbert Greenlee, 01/12/2015 04:00 PM)

{{toc}}

h1. Overview

Larsoft common batch and workflow tools are contained in ups product @larbatch@ (this redmine), which is built and distributed as part of @larsoft@. Larbatch tools are built on top of Fermilab @jobsub_client@ batch submission tools. For general information about jobsub_client and the Fermilab batch system, refer to articles on the "jobsub wiki":https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki and the "fife wiki":https://cdcvs.fnal.gov/redmine/projects/fife/wiki/Getting_Started_on_GPCF.

No other part of larsoft is dependent on @larbatch@, and @larbatch@ is not set up as a dependent of the @larsoft@ umbrella ups product. Rather, @larbatch@ is intended to be a dependent of experiment-specific ups products (see [[admin_guide|this article]] for instructions on configuring @larbatch@ for a specific experiment).

After setting up ups product @larbatch@, several executable scripts and python modules are available on the execution path and python path. Here is a list of the more important ones.

* "project.py":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/project.py
An executable python script that is the main entry point for user interaction. More information can be found below.

* "project_utilities.py":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/python/project_utilities.py
A python module, imported by @project.py@, that implements some of the workflow functionality. End users would not normally interact directly with this module. However, a significant aspect of @project_utilities.py@ is that it supplies hooks for providing experiment-specific implementations of some functionality, as described in an [[admin_guide#Experiment-specific hooks|accompanying article]] on this wiki.

* "condor_lar.sh":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/condor_lar.sh
The main batch script. @Condor_lar.sh@ is a general purpose script that manages a single invocation of an art framework program (@lar@ executable). @Condor_lar.sh@ sets up the run-time environment, fetches input data, interacts with sam, and copies output data. It is not intended that end users will directly invoke @condor_lar.sh@. However, one can get a general idea of the features and capabilities of @condor_lar.sh@ by viewing the built-in documentation (type @condor_lar.sh -h@) or by reading the file header.

* "condor_start_project.sh":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/condor_start_project.sh
Batch script for starting a sam project.

* "condor_stop_project.sh":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/condor_stop_project.sh
Batch script for stopping a sam project.

h1. Using @project.py@

@Project.py@ is used in conjunction with an XML-format project definition file (see [[user_guide#Project File Structure|below]]). The concept of a project, as understood by @project.py@, and as defined by the project definition file, is a multistage linear processing chain involving a specified number of batch workers at each stage.

h2. Use cases

In a typical invocation of @project.py@, one specifies the project file (via option @--xml@), the stage name (via option @--stage@), and one or more action options. Here are some use cases for invoking @project.py@.

* @project.py -h@ or @project.py --help@
Print built-in help (lists all available command line options).

* @project.py -xh@ or @project.py --xmlhelp@
Print built-in xml help (lists all available elements that can be included in project definition file).

* @project.py --xml xml-name --status@
Print global summary status of the project.

* @project.py --xml xml-name --stage stage-name --submit@
Submit batch jobs for specified stage.

* @project.py --xml xml-name --stage stage-name --check@
Check results from specified stage (identifies failed jobs). This action assumes that the art program produces an artroot output file.

* @project.py --xml xml-name --stage stage-name --checkana@
Check results from specified stage (identifies failed jobs). This version of the check action skips some checks done by @--check@ that only make sense if the art program produces an artroot output file. Use this action to check results from an analyzer-only art program.

* @project.py --xml xml-name --stage stage-name --makeup@
Submit makeup jobs for failed jobs, as identified by a previous @--check@ or @--checkana@ action.

* @project.py --xml xml-name --stage stage-name --clean@
Delete output for the specified stage and later stages. This option can be combined with @--submit@.

* @project.py --xml xml-name --stage stage-name --declare@
Declare successful artroot files to sam.

* @project.py --xml xml-name --stage stage-name --upload@
Upload successful artroot files to enstore.

* @project.py --xml xml-name --stage stage-name --define@
Create sam dataset definition.

* @project.py --xml xml-name --stage stage-name --audit@
Check the completeness and correctness of a processing stage using sam parentage information. For this action to work, input and output files must be declared to sam.
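
The use cases above can be chained into a per-stage workflow. The following is a minimal Python sketch that only builds and prints the command lines; it does not run them. The file name @my_project.xml@, the stage name @reco@, the helper @project_command@, and the particular ordering of actions are assumptions made for illustration. Only the option names are taken from the list above.

```python
import shlex


def project_command(xml_path, action, stage=None):
    """Build the argument list for one project.py invocation.

    'action' is an action option from the list above, without the
    leading dashes (e.g. "submit", "check", "makeup").
    """
    args = ["project.py", "--xml", xml_path]
    if stage is not None:
        args += ["--stage", stage]
    args.append("--" + action)
    return args


# Hypothetical project file and stage name, for illustration only.
XML = "my_project.xml"
STAGE = "reco"

# One possible sequence for a single stage: submit, check the results,
# resubmit failures, then declare good files to sam.
workflow = [project_command(XML, a, STAGE)
            for a in ("submit", "check", "makeup", "declare")]
# The status action takes no stage argument.
workflow.append(project_command(XML, "status"))

for cmd in workflow:
    print(shlex.join(cmd))
```

In practice each step would be run by hand once the batch jobs from the previous step have finished, and @--makeup@ is only needed if @--check@ reports failed jobs.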

h1. Project File Structure

The general structure of the project file is that it contains global options, a single project element (enclosed in "@<project name=project-name>...</project>@"), and one or more processing stage elements (enclosed in "@<stage name=stage-name>...</stage>@") inside the project element.
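
As a rough illustration of this nesting, the sketch below parses a skeletal project file with Python's standard @xml.etree.ElementTree@ module and lists its stages. The project name @example_project@ and the stage names @gen@ and @reco@ are hypothetical, and global options and the contents of each stage are omitted. Note that in a well-formed XML file the attribute values must be quoted.

```python
import xml.etree.ElementTree as ET

# Skeletal project file: one project element containing two stage elements.
# All names here are hypothetical; real stages contain further elements.
SKELETON = """<?xml version="1.0"?>
<project name="example_project">
  <stage name="gen"></stage>
  <stage name="reco"></stage>
</project>
"""

root = ET.fromstring(SKELETON)

# The root element is the single project element.
assert root.tag == "project"

# Stage elements are direct children of the project element.
stage_names = [stage.get("name") for stage in root.findall("stage")]
print(root.get("name"), stage_names)
```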

h2. Examples (Microboone)

"Example project files in ubutil":https://cdcvs.fnal.gov/redmine/projects/ubutil/repository/revisions/master/show/xml/mcc5.0

h2. Internal documentation

Refer to the header of "project.py":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/project.py or type @project.py --xmlhelp@.

h2. XML header section

The initial lines of an XML project file should follow a standard pattern. Here is a typical example header.

<pre>
<?xml version="1.0"?>
<!DOCTYPE project [
<!ENTITY release "v02_05_01">
<!ENTITY file_type "mc">
<!ENTITY run_type "physics">
<!ENTITY name "prod_eminus_0.1-2.0GeV_isotropic_uboone">
<!ENTITY tag "mcc5.0">
]>
</pre>

The significance of the header elements is as follows.

* The XML version
The XML version is always the first line. Copy the above version line exactly (namely, "@<?xml version="1.0"?>@").

* The document type (DOCTYPE keyword).
The argument following the DOCTYPE keyword specifies the "root element" of the XML file, and should always be "project".

* Entity definitions
Entity definitions, which occur inside the DOCTYPE block, are XML's version of aliases. Any string that occurs repeatedly inside an XML file can be defined as an entity, and referred to in the remainder of the XML file by enclosing the entity name inside @&...;@ (e.g. @&release;@).
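
To see entity substitution in action, the following sketch parses a small document with Python's standard @xml.etree.ElementTree@ module, which expands entities declared in the DOCTYPE block. The entity definitions are taken from the example header above; the @<note>@ element is hypothetical and exists only to show an entity reference in element text.

```python
import xml.etree.ElementTree as ET

# Header entities copied from the example above. The <note> element is
# purely illustrative and is not part of the project file format.
PROJECT_XML = """<?xml version="1.0"?>
<!DOCTYPE project [
<!ENTITY release "v02_05_01">
<!ENTITY name "prod_eminus_0.1-2.0GeV_isotropic_uboone">
<!ENTITY tag "mcc5.0">
]>
<project name="&name;">
  <note>&release; / &tag;</note>
</project>
"""

root = ET.fromstring(PROJECT_XML)

# Entity references are replaced by their definitions during parsing,
# both in attribute values and in element text.
project_name = root.get("name")
note_text = root.find("note").text

print(project_name)  # prod_eminus_0.1-2.0GeV_isotropic_uboone
print(note_text)     # v02_05_01 / mcc5.0
```

Changing a value such as the release then only requires editing the single entity definition in the header.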

The first line should always

h2. Global options

h2. Project Block

h2. Stage Blocks