NOvA Production Infrastructure¶
Production Infrastructure (Pre-Jan 2013)¶
The current NOvA production infrastructure and scripts are documented on the main NOvA offline wiki: Production Infrastructure and Scripts
The old Monte Carlo production model was split into stages, each of which used very different scripts and submission methods.
In particular, the MC generation scripts were designed to generate all n-way permutations of the generation parameters and then produce a specific ".fcl" configuration file for each analysis executable to be run. This was done by submitting a "job script" that ran locally on a worker node and generated a second job script and ".fcl" file on that node. As a result, the scripts were not relocatable: they depended on a suite of additional libraries and scripts that were tied to the BlueArc central disk. This method also created "independent" jobs, which made it difficult to perform bookkeeping on which jobs were running or failing.
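The permutation step described above can be illustrated with a minimal Python sketch. It is not the original NOvA script: the parameter names (`detector`, `flavor`, `horn_current`), the file naming scheme, and the simple `key: value` rendering are all assumptions made for illustration, and the output is not meant to be a valid production ".fcl" file.

```python
from itertools import product


def expand_permutations(param_grid):
    """Return one dict per combination of the values in param_grid."""
    names = sorted(param_grid)
    combos = product(*(param_grid[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def render_config(params):
    """Render a parameter set as 'key: value' lines (illustrative format only)."""
    return "\n".join(f"{key}: {value}" for key, value in sorted(params.items())) + "\n"


# Hypothetical generation parameters; the real parameter names are not given here.
grid = {
    "detector": ["near", "far"],
    "flavor": ["numu", "nue"],
    "horn_current": ["fhc", "rhc"],
}

# Expand every combination once, then build one configuration text per job.
jobs = expand_permutations(grid)
configs = {f"job_{i:04d}.fcl": render_config(p) for i, p in enumerate(jobs)}
```

Expanding the combinations in one place, rather than inside a job running on a worker node, gives a single enumerable list of jobs, which is the kind of record the bookkeeping problem described above lacks.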
Other stages of the production used
Required Infrastructure Changes¶
The following changes to the infrastructure were determined through a series of meetings with the NOvA production group, with other members of NOvA involved in batch processing, and with experts in batch processing, scripting, and grid running.
Script Review Findings (7FEB2013)¶
The NOvA production scripts were reviewed to identify the base units of work that had to be accomplished. The review determined that the following model describes the production process.
- There is a defined set of inputs that map to tasks that must be run (Monte Carlo generation requests or files to perform processing/analysis on)
- The tasks can be logically associated into one or more subgroups with some commonality (e.g. all input files will be processed with the same set of offline modules, or 10000 runs will need one type of Monte Carlo and 5000 will need a different MC configuration)
- All of the critical configuration parameters can be pre-determined
- There is a small subset of configuration parameters that should be determined or overridden by the run-time environment (Monte Carlo run numbers, random number seeds, unique file identifiers, etc.)
- There are no
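The model in the list above (inputs mapped to tasks, tasks grouped by shared configuration, and a few values supplied at run time) can be sketched as follows. This is a hypothetical illustration, not the production code: `TaskGroup`, `runtime_overrides`, `resolve_config`, and all parameter names are invented for the example.

```python
import os
import uuid
from dataclasses import dataclass


@dataclass
class TaskGroup:
    """A subgroup of tasks that share the same pre-determined configuration."""
    name: str
    base_config: dict  # parameters fixed before submission
    inputs: tuple      # run numbers or input file names


def runtime_overrides(run_number, seed=None, file_id=None):
    """Values that are only known, or should be overridden, at run time."""
    return {
        "run_number": run_number,
        "seed": seed if seed is not None else int.from_bytes(os.urandom(4), "big"),
        "file_id": file_id if file_id is not None else uuid.uuid4().hex,
    }


def resolve_config(group, task_input, overrides):
    """Merge the group's fixed configuration with run-time overrides for one task."""
    config = dict(group.base_config)  # copy so the shared config is not mutated
    config["input"] = task_input
    config.update(overrides)          # run-time values take precedence
    return config


# Hypothetical group: two runs that need the same type of Monte Carlo.
group = TaskGroup(
    name="mc_type_a",
    base_config={"generator": "type_a", "events_per_job": 500},
    inputs=(10000, 10001),
)
cfg = resolve_config(group, 10000, runtime_overrides(10000, seed=42, file_id="abc"))
```

Keeping the pre-determined parameters in the group and applying the run-time values last means the same group definition can be reused for every task in the subgroup, with only the overrides differing per job.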
Production Infrastructure (Post Changes)¶