User guide » History » Version 41

Herbert Greenlee, 01/12/2015 04:38 PM

{{toc}}

h1. Overview

Larsoft common batch and workflow tools are contained in ups product @larbatch@ (this redmine), which is built and distributed as part of @larsoft@.  Larbatch tools are built on top of Fermilab @jobsub_client@ batch submission tools.  For general information about jobsub_client and the Fermilab batch system, refer to articles on the "jobsub wiki":https://cdcvs.fnal.gov/redmine/projects/jobsub/wiki and the "fife wiki":https://cdcvs.fnal.gov/redmine/projects/fife/wiki/Getting_Started_on_GPCF.

No other part of larsoft depends on @larbatch@, and @larbatch@ is not set up as a dependent of the @larsoft@ umbrella ups product.  Rather, @larbatch@ is intended to be a dependent of experiment-specific ups products (see [[admin_guide|this article]] for instructions on configuring @larbatch@ for a specific experiment).

After setting up ups product @larbatch@, several executable scripts and python modules are available on the execution path and python path.  Here is a list of the more important ones.

* "project.py":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/project.py
An executable python script that is the main entry point for user interaction.  More information can be found below.

* "project_utilities.py":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/python/project_utilities.py
A python module, imported by @project.py@, that implements some of the workflow functionality.  End users would not normally interact directly with this module.  However, a significant aspect of @project_utilities.py@ is that it supplies hooks for providing experiment-specific implementations of some functionality, as described in an [[admin_guide#Experiment-specific hooks|accompanying article]] on this wiki.

* "condor_lar.sh":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/condor_lar.sh
The main batch script.  @condor_lar.sh@ is a general purpose script that manages a single invocation of an art framework program (@lar@ executable).  It sets up the run-time environment, fetches input data, interacts with sam, and copies output data.  End users are not expected to invoke @condor_lar.sh@ directly.  However, one can get a general idea of its features and capabilities by viewing the built-in documentation (type @condor_lar.sh -h@) or by reading the file header.

* "condor_start_project.sh":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/condor_start_project.sh
Batch script for starting a sam project.

* "condor_stop_project.sh":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/condor_stop_project.sh
Batch script for stopping a sam project.

h1. Using @project.py@

@project.py@ is used in conjunction with an XML-format project definition file (see [[user_guide#Project File Structure|below]]).  The concept of a project, as understood by @project.py@, and as defined by the project definition file, is a multistage linear processing chain involving a specified number of batch workers at each stage.

h2. Use cases

In a typical invocation of @project.py@, one specifies the project file (via option @--xml@), the stage name (via option @--stage@), and one or more action options.  Here are some use cases for invoking @project.py@.

* @project.py -h@ or @project.py --help@
Print built-in help (lists all available command line options).

* @project.py -xh@ or @project.py --xmlhelp@
Print built-in xml help (lists all available elements that can be included in project definition file).

* @project.py --xml xml-name --status@
Print global summary status of the project.

* @project.py --xml xml-name --stage stage-name --submit@
Submit batch jobs for specified stage.

* @project.py --xml xml-name --stage stage-name --check@
Check results from specified stage (identifies failed jobs).  This action assumes that the art program produces an artroot output file.

* @project.py --xml xml-name --stage stage-name --checkana@
Check results from specified stage (identifies failed jobs).  This version of the check action skips some checks done by @--check@ that only make sense if the art program produces an artroot output file.  Use this action to check results from an analyzer-only art program.

* @project.py --xml xml-name --stage stage-name --makeup@
Submit makeup jobs for failed jobs, as identified by a previous @--check@ or @--checkana@ action.

* @project.py --xml xml-name --stage stage-name --clean@
Delete output for the specified stage and later stages.  This option can be combined with @--submit@.

* @project.py --xml xml-name --stage stage-name --declare@
Declare successful artroot files to sam.

* @project.py --xml xml-name --stage stage-name --upload@
Upload successful artroot files to enstore.

* @project.py --xml xml-name --stage stage-name --define@
Create sam dataset definition.

* @project.py --xml xml-name --stage stage-name --audit@
Check the completeness and correctness of a processing stage using sam parentage information.  For this action to work, input and output files must be declared to sam.
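
The use cases above all follow one pattern: the project file, an optional stage name, and an action option. The following Python sketch assembles such command lines for a submit, check, makeup cycle. It is only an illustration: the helper @project_command@, the file name @myproject.xml@, and the stage name @gen@ are hypothetical and are not part of @larbatch@. The sketch prints the commands rather than running them.

```python
import shlex

# Action options taken from the use cases listed above.
ACTIONS = {"status", "submit", "check", "checkana", "makeup",
           "clean", "declare", "upload", "define", "audit"}

def project_command(xml_file, action, stage=None):
    """Return the argument list for one project.py invocation (hypothetical helper)."""
    if action not in ACTIONS:
        raise ValueError("unknown action: " + action)
    cmd = ["project.py", "--xml", xml_file]
    if stage is not None:
        cmd += ["--stage", stage]
    cmd.append("--" + action)
    return cmd

# Hypothetical project file and stage name, for illustration only.
xml_file, stage = "myproject.xml", "gen"
for action in ("submit", "check", "makeup"):
    print(shlex.join(project_command(xml_file, action, stage)))
print(shlex.join(project_command(xml_file, "status")))
```

In practice these steps are not run back to back: @--check@ is only meaningful once the submitted batch jobs have finished, and @--makeup@ relies on the results of a previous @--check@ or @--checkana@. To actually execute a command, the argument list could be passed to @subprocess.run@ in a shell where @larbatch@ has been set up.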

h1. Project File Structure

The project file contains a single root element of type "@project@" (enclosed in "@<project name=project-name>...</project>@").  Inside the project element are additional subelements, including one or more stage subelements (enclosed in "@<stage name=stage-name>...</stage>@").  Each stage element defines a group of batch jobs that are submitted together by a single invocation of @jobsub_submit@.
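
As a rough illustration of this layout, the sketch below parses a stripped-down project file with Python's standard @xml.etree.ElementTree@ module and lists its stages. The project name and the stage names (@gen@, @g4@, @reco@) are made up, the stage contents are left empty, and this is not necessarily how @project.py@ itself reads the file. Note that in a real file the attribute values must be quoted, as in @name="gen"@.

```python
import xml.etree.ElementTree as ET

# Hypothetical, stripped-down project file: one root <project> element
# containing several <stage> subelements (stage contents omitted).
PROJECT_XML = """<?xml version="1.0"?>
<project name="example_project">
  <stage name="gen"></stage>
  <stage name="g4"></stage>
  <stage name="reco"></stage>
</project>
"""

root = ET.fromstring(PROJECT_XML)

# Each direct <stage> child corresponds to one group of batch jobs.
stage_names = [stage.get("name") for stage in root.findall("stage")]

print("root element:", root.tag)
print("project name:", root.get("name"))
print("stages in file order:", stage_names)
```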

h2. Examples

Example XML project files used by microboone from ubutil product can be found "here.":https://cdcvs.fnal.gov/redmine/projects/ubutil/repository/revisions/master/show/xml/mcc5.0

h2. Internal documentation

Refer to the header of "project.py":https://cdcvs.fnal.gov/redmine/projects/larbatch/repository/revisions/develop/entry/scripts/project.py or type @"project.py --xmlhelp"@.

h2. XML header section

The initial lines of an XML project file should follow a standard pattern.  Here is a typical example header.

<pre>
<?xml version="1.0"?>
<!DOCTYPE project [
<!ENTITY release "v02_05_01">
<!ENTITY file_type "mc">
<!ENTITY run_type "physics">
<!ENTITY name "prod_eminus_0.1-2.0GeV_isotropic_uboone">
<!ENTITY tag "mcc5.0">
]>
</pre>

The significance of the header elements is as follows.

* The XML version
Copy the above version line exactly, namely,
<pre>
<?xml version="1.0"?>
</pre>

* The document type (DOCTYPE keyword).
The argument following the DOCTYPE keyword specifies the "root element" of the XML file, and should always be "@project@."

* Entity definitions
Entity definitions, which occur inside the DOCTYPE section, are XML aliases.  Any string that occurs repeatedly inside an XML file is a candidate for being defined as an entity.  Entities can be substituted inside the body of the XML file by enclosing the entity name inside @&...;@ (e.g. @&release;@).
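
To see entity substitution in action, the following sketch feeds a shortened version of the header above to Python's standard @xml.etree.ElementTree@ parser, which expands entities declared in the DOCTYPE section. The body of the file is an assumption made for illustration: using @&name;@ as the project name is one plausible choice, and the @<note>@ element is purely hypothetical and is not a documented project file element.

```python
import xml.etree.ElementTree as ET

# Header entities copied from the example above; the body is hypothetical.
PROJECT_XML = """<?xml version="1.0"?>
<!DOCTYPE project [
<!ENTITY release "v02_05_01">
<!ENTITY name "prod_eminus_0.1-2.0GeV_isotropic_uboone">
]>
<project name="&name;">
  <note>release &release;</note>
</project>
"""

root = ET.fromstring(PROJECT_XML)

# The parser has already replaced &name; and &release; with their values.
project_name = root.get("name")
note_text = root.find("note").text

print(project_name)
print(note_text)
```

Because the substitution happens in the XML parser, changing the release or sample name only requires editing the entity definition in the header.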

h2. Project Element

Each project definition file should contain a single project element enclosed in "@<project name=project-name>...</project>@."  The name attribute of the project element is required.

The content of the project element consists of other XML subelements, including the following.
* A single subelement with tag "@larsoft@," which defines the run-time environment.
* Option subelements.
* One or more stage subelements.

h3. Larsoft subelement

Each project element is required to contain a single subelement with tag "@larsoft@" (enclosed in "@<larsoft>...</larsoft>@").  The larsoft subelement defines the batch run-time environment.
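
The structural requirements stated so far (a named project element, exactly one larsoft subelement, one or more stage subelements) can be checked mechanically. The sketch below does so with Python's standard @xml.etree.ElementTree@ module. The function @check_project@ and the two test documents are hypothetical, the element contents are left empty, and @project.py@ may perform different or additional checks.

```python
import xml.etree.ElementTree as ET

def check_project(root):
    """Return a list of structural problems found in a parsed project element."""
    problems = []
    if root.tag != "project":
        problems.append("root element is <%s>, expected <project>" % root.tag)
    if not root.get("name"):
        problems.append("project element has no name attribute")
    n_larsoft = len(root.findall("larsoft"))
    if n_larsoft != 1:
        problems.append("expected exactly one <larsoft> subelement, found %d" % n_larsoft)
    if not root.findall("stage"):
        problems.append("expected at least one <stage> subelement")
    return problems

# Hypothetical documents: one well-formed, one missing the name and <larsoft>.
good = ET.fromstring('<project name="demo"><larsoft/><stage name="gen"/></project>')
bad = ET.fromstring('<project><stage name="gen"/></project>')

print(check_project(good))
for problem in check_project(bad):
    print(problem)
```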

h2. Stage Element