Running Jobs

Below we show an example job script, and after that how to run it. Please don't work from your ~username (/afs/fnal.gov/files/home/roomN/username) directory; you will quickly run out of space. Instead, make yourself a directory called /nova/app/users/yourusername, follow the wiki commands to set up your SRT environment in that directory, and check the wiki gotchas link. Then come back here.

The Job Configuration Script

Once a base and test release are set up, it is easy to run a job. The basic unit for running a job is the job-control script, written in the FHICL language. If you use emacs as your editor you may be interested in Syntax highlighting for emacs.

Key Concepts for FHICL

There are a few key concepts to writing a FHICL job control script. In order, they are

  1. Including previously defined configurations for services and modules from other files. This is done using #include statements. Be sure you don't have any trailing space or tab characters on the #include line.
  2. Services block, denoted by services: { }. This block contains configurations for ART-specific services such as the TFileService and the RandomNumberGenerator. It also contains a user: {} sub-block where NOvA-specific services are configured. In a skeleton script the user: {} sub-block is left empty, and is then filled in using the syntax for pulling configurations out of the #include files.
  3. Source block, denoted by source: { }. This block tells the job what kind of source to expect (EmptyEvent in the case of Monte Carlo generation, RootInput for anything downstream of a Monte Carlo generator or reconstruction), the file name for the input source if appropriate, and how many events to process. Both the file name and the number of events to process can be specified on the command line.
  4. Outputs block, denoted by outputs: { }. This block tells the job what kind of output to make, ie RootOutput, and what the name of the output file should be. The output file name can be specified on the command line. It is possible to define more than one output file if one wanted to run a job that produced different output files based on filter criteria - ie empty events are put in one file and events with neutrinos in them are put in another. Multiple output files can only be specified in the job configuration file, not from the command line.
  5. Physics block, denoted by physics: { }. This block is where all producer, analyzer, and filter modules are configured. The sequence of producer and filter modules to run is defined in a user-named path in this block. The list of analyzers to run is defined in a separate user-named path. The block also defines two key-word parameters, trigger_paths and end_paths. trigger_paths contains all producer and filter paths to run, and end_paths contains the analyzer paths and output streams.
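Putting the five pieces together, a minimal job-control script has the overall shape sketched below. This is a schematic only: the process name, module label, configuration name, and file names are placeholders, not real NOvASoft configurations.

```fcl
#include "job/services.fcl"          # 1. pull in predefined configurations

process_name: MyJob                  # no underscores allowed here

services:                            # 2. services block
{
  TFileService: { fileName: "myjob_hist.root" }
  user: {}                           # NOvA-specific services go here
}

source:                              # 3. source block
{
  module_type: RootInput             # or EmptyEvent for generation
  maxEvents:   10
}

outputs:                             # 4. outputs block
{
  out1: { module_type: RootOutput fileName: "myjob.root" }
}

physics:                             # 5. physics block
{
  producers: { mymod: @local::some_included_config }  # placeholder label and config
  mypath:  [ mymod ]                 # producer/filter path, order matters
  stream1: [ out1 ]
  trigger_paths: [ mypath ]
  end_paths:     [ stream1 ]
}
```

The full, working examples prodgenie.fcl and standard_reco.fcl are shown further down this page.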

Comments may be included in FHICL configuration files using the "#" character. The "#include" is a key-word so that the parser knows not to ignore what comes after "#include".

Conventions for FHICL File Names

There are two types of FHICL files: one for the module and one for the job. The names of the FHICL files
should be consistent with the name of the source code. An example of the proper naming scheme is presented in the Demo package:

Demo.fcl - the same base name as the module
demojob.fcl - for the job file. Use all lower case.

FHICL rules

There are a couple of rules to keep in mind about FHICL:

  • The value of the process_name parameter may not contain underscores as the process name is used in the ROOT file branch name. Module labels (the string after physics.producers. or physics.analyzers.) may not contain underscores either, for the same reason.
  • Parameter set names and parameter names may not contain numbers, periods, backslashes, stars, etc. They may contain underscores.
  • Put the values for all string parameters in " "
  • Specify input vectors using [ , , ], ie if you want a vector of doubles do
    MyVector: [1.0, 3e-9, -900.]
  • You pick out configurations from the #include files using the @local:: syntax. The value after the "::" is the name of the configuration specified in one of the #include files
  • You can override the value of an included configuration. For example, imagine there is a configuration called mymoduleconfig specified in a #include file, and it contains the value -5 for the parameter named myint. You can load the configuration and then change the value of myint by doing the following
    physics.producers.mymod: @local::mymoduleconfig
    physics.producers.mymod.myint: 1
    

    The last value for a parameter always wins. If the second line appeared again with the value 2 instead of 1, the job would run with myint set to 2. This also means that command-line options always override parameters in the fcl file.
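The "last one wins" rule can be seen in a short sketch (mymoduleconfig and myint are the hypothetical names from the example above):

```fcl
physics.producers.mymod: @local::mymoduleconfig  # myint comes in as -5
physics.producers.mymod.myint: 1                 # overrides myint to 1
physics.producers.mymod.myint: 2                 # last assignment wins: the job runs with myint = 2
```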

Configuring the message service

Several standard configurations for the message service are in source:Utilities/messageservice.fcl. There is one configuration for each level of message output - Debug, Info, Warning, and Error. Each configuration applies to the specified message level and those of higher priority. For example, the Info configuration will print out Info, Warning and Error level messages, while the Warning configuration only prints out Warning and Error level messages. The standard configurations will cause the messages to go to a specified output file: debug.log, info.log, warnings.log and error.log. If you want to define your own configuration, please take a look at the comments in the source:Utilities/messageservice.fcl file to determine how to do so. More information is also available from the MessageFacility wiki. NB: the MessageFacility wiki is a bit behind the current incarnation of MessageFacility that we are using.

Examples of how to include the use of the message service configurations are in the example files below.
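For instance, a services block that routes Info-and-higher messages through one of the standard configurations might look like the sketch below. The #include path and the configuration name standard_info are assumptions based on the level names listed above; check source:Utilities/messageservice.fcl for the actual names.

```fcl
#include "job/messageservice.fcl"   # path assumed; see Utilities/messageservice.fcl

services:
{
  # Info, Warning and Error level messages go to the log files
  # named in the standard configuration (info.log etc.)
  message: @local::standard_info
}
```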

Example job script: prodgenie.fcl

An example jobs script is source:EventGenerator/prodgenie.fcl. The job defined by this script will generate neutrino interactions using GENIE, run them through Geant4, do the photon transport and then simulate the electronics.

Comments on the form of the file are included as ###### Comment ######

###### This is how to include configurations from other files ######
#the following lines include standard configurations for various modules and services
#include "job/services.fcl" 
#include "job/genie.fcl" 
#include "job/g4gen.fcl" 
#include "job/simplereadout.fcl" 
#include "job/simpletransport.fcl" 

#give your job a name to be stored in the output root file containing the event
process_name: Genie

services:
{
  # Load the service that manages root files for histograms.
  TFileService: { fileName: "genie_hist.root" closeFileFast: false } #this is the service that stores any output sanity check histograms and ntuples
  Timing:       {}
  RandomNumberGenerator: {} #ART native random number generator
  message:      @local::standard_out
  user:         @local::ndos_2db_services
}

###### source is where you get events from - can also be RootInput ######
#Start each new event with an empty event.
source:
{
  module_type: EmptyEvent
  maxEvents:  10        # Number of events to create
}

# Define and configure some modules to do work on each event.
# First modules are defined; they are scheduled later.
# Modules are grouped by type. 
physics:
{

 producers:
 {
   generator: @local::genie_simpleflux_ndos
   geantgen:  @local::standard_geant4
   photrans:  @local::standard_photrans
   daq:       @local::standard_rsim
 }

 filters:{} #no filters defined for this job

 analyzers:{} #no analyzers defined for this job

 #list the modules for this path, order matters, filters reject all following items
 simulate: [ generator, geantgen, photrans, daq ] 
 stream1:  [ out1 ] #could have multiple paths

 #trigger_paths is a keyword and contains the paths that modify the art::event, 
 #ie filters and producers 
 trigger_paths: [simulate] 

 #end_paths is a keyword and contains the paths that do not modify the art::Event, 
 #ie analyzers and output streams.  these all run simultaneously
 end_paths:     [stream1]  
}

#block to define where the output goes.  if you defined a filter in the physics
#block and put it in the trigger_paths then you need to put a SelectEvents: {SelectEvents: [XXX]}
#entry in the output stream you want those to go to, where XXX is the label of the filter module(s)
outputs:
{
 out1:
  {
   module_type: RootOutput
   fileName:    "genie_gen.root" #default file name, can override from command line with -o or --output
  }
}

Notice that you have not specified anywhere which libraries to load. That is because the SRT build links the module and service .so files against the .so's they depend upon.

Example job script: standard_reco.fcl

An example job script for reconstruction is available in source:Utilities/standard_reco.fcl:

#include "job/services.fcl" 
#include "job/CalHit.fcl" 
#include "job/Slicer.fcl" 
#include "job/MakeClusterSS.fcl" 
#include "job/AnaClusterSS.fcl" 
#include "job/MakePlaneClusters.fcl" 

process_name: Reco

services:
{
  # Load the service that manages root files for histograms.
  TFileService: { fileName: "reco_hist.root" }
  scheduler:    { wantTracer: true wantSummary: true }
  Timing:       {}
  RandomNumberGenerator: {} #ART native random number generator
  message:      @local::standard_out
  user:         @local::reco_services
}

#source is a root file
source:
{
  module_type: RootInput
  maxEvents:  10        # Number of events to process
}

# Define and configure some modules to do work on each event.
# First modules are defined; they are scheduled later.
# Modules are grouped by type.
physics:
{

 producers:
 {
   calhit:        @local::standard_calhit
   slicer:        @local::standard_slicer
   planecluster:  @local::standard_planecluster
   clusterss:     @local::standard_makeclusterss
 }

 filters:{}

 analyzers:
 {
   anaclusterss:  @local::standard_anaclusterss
 }

 #define the path for producer and filter modules, order matters, 
 #filters reject all following items.  see lines starting physics.producers below
 reco: [ calhit, slicer, planecluster, clusterss ] 

 #define the path for analyzer modules, order does not matter.  see lines starting
 #physics.analyzers below
 ana:  [ anaclusterss ]

 #define the output stream, there could be more than one if using filters 
 stream1:  [ out1 ]

 #trigger_paths is a keyword and contains the paths that modify the art::event, 
 #ie filters and producers
 trigger_paths: [reco] 

 #end_paths is a keyword and contains the paths that do not modify the art::Event, 
 #ie analyzers and output streams.  these all run simultaneously
 end_paths:     [ana, stream1]  
}

#block to define where the output goes.  if you defined a filter in the physics
#block and put it in the trigger_paths then you need to put a SelectEvents: {SelectEvents: [XXX]}
#entry in the output stream you want those to go to, where XXX is the label of the filter module(s)
outputs:
{
 out1:
 {
   module_type: RootOutput
   fileName:    "standard_reco.root" #default file name, can override from command line with -o or --output
 }
}

Example configuration file: geometry.fcl

An example of a file with predefined configurations for a service is in the source:trunk/Geometry/geometry.fcl file:

###### All files that are parameter set definitions must contain BEGIN_PROLOG as their first line ######
###### This tag tells the FHICL parser that parameter set definitions are coming                  ######
BEGIN_PROLOG

ndos_geo: 
{
 ROOT: "Geometry/gdml/ndos.root" 
 GDML: "Geometry/gdml/ndos.gdml" 
 BigBoxUsed: false
 BigBoxRange: 1500
}

nd_geo: 
{
 ROOT: "Geometry/gdml/neardet.root" 
 GDML: "Geometry/gdml/neardet.gdml" 
 BigBoxUsed: false
 BigBoxRange: 1500
}

fd_geo: 
{
 ROOT: "Geometry/gdml/fardet.root" 
 GDML: "Geometry/gdml/fardet.gdml" 
 BigBoxUsed: false
 BigBoxRange: 7500
}

###### All files that are parameter set definitions must contain END_PROLOG as their last line ######
###### This tag tells the FHICL parser that parameter set definitions are ended                ######
END_PROLOG
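A job file can then select one of these predefined geometry configurations for its Geometry service. The sketch below assumes a #include path following the job/ pattern used elsewhere on this page; the Geometry service label under the user: block is the one that appears in the debug output shown further down.

```fcl
#include "job/geometry.fcl"

services:
{
  user:
  {
    # pick the NDOS geometry defined in the prolog above;
    # nd_geo or fd_geo would select the other detectors
    Geometry: @local::ndos_geo
  }
}
```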

How to override a default parameter

If you want to override a default parameter that has been included from a predefined parameter set, you must specify which parameter and its value as

mainBlock.subBlock.label.parameterName: newValue

where

  • mainBlock can be services or physics
  • subBlock can be user, producers, filters, or analyzers
  • label is the name of the desired service or module in a producers, filters, or analyzers block
  • parameterName is the name of the desired parameter
  • newValue is the desired new value

These lines must go after the mainBlock and be outside of any other mainBlocks.

For example, if one wanted to change the default value of the fhitsModuleLabel parameter in the DBcluster module in the previous section, one would put

physics.producers.cluster.fhitsModuleLabel: "differentHitModuleLabel" 

Executable and command line options

Currently there is one executable to run in NOvASoft. The executable to run a typical reconstruction or analysis job is nova which is placed in the user's path by the setup script. To see what options are available do

$ nova -h

The output is

nova <options> [config-file]:
-T [ --TFileName ] arg File name for TFileService.
-c [ --config ] arg Configuration file.
-e [ --estart ] arg Event # of first event to process.
-h [ --help ] produce help message
-n [ --nevts ] arg Number of events to process.
--nskip arg Number of events to skip.
-o [ --output ] arg Event output stream file.
-s [ --source ] arg Source data file (multiple OK).
-S [ --source-list ] arg file containing a list of source files to read, one
per line.
--trace Activate tracing.
--notrace Deactivate tracing.
--memcheck Activate monitoring of memory use.
--nomemcheck Deactivate monitoring of memory use.
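Combining a few of these options, a typical invocation might look like the following, where filelist.txt and the output file name are hypothetical:

```shell
# read the source files listed one per line in filelist.txt,
# process 100 events, and write the event output stream to myreco.root
$ nova -c job/standard_reco.fcl -S filelist.txt -n 100 -o myreco.root
```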

Running a Job

To run the job defined by the script above, do

$ nova -c job/prodgenie.fcl

Or, so that you can close the lid on your laptop and have the job keep running, do it like this

$ nohup nova -c job/prodgenie.fcl >& pg.out

You can also run in a debug mode which will simply print out the configuration of the job by doing

$ ART_DEBUG_CONFIG=1 nova -c job/prodgenie.fcl

in bash, or

> env ART_DEBUG_CONFIG=1 nova -c job/prodgenie.fcl

which produces the output

** ART_DEBUG_CONFIG is defined: config debug output follows **
all_modules:["out1","geantgen","geniegen","photrans","rsim"] outputs:{out1:{fileName:"genie_gen.root" module_label:"out1" module_type:"RootOutput"}} physics:{end_paths:["stream1"] producers:{geantgen:{DetectorBigBoxRange:1500 G4CheckRockVeto:"true" G4EnergyThreshold:1e-7 G4MacroPath:"g4nova/g4nova.mac" GenModuleLabel:"generator" IsBigBoxUsed:"true" module_label:"geantgen" module_type:"G4Gen"} geniegen:{BeamCenter:[2.5e-1,0,0] BeamDirection:[0,0,1] BeamName:"numi" BeamRadius:3 DebugFlags:0 DetectorLocation:"NOvA-ND" Environment:["GSPLOAD","/grid/fermiapp/nova/novasrt/data/gxspl-NUMI-R2.6.0.xml","GPRODMODE","YES","GEVGL","Default"] EventsPerSpill:1 FluxFile:"/nova/data/flux/gsimple/simpleflx_nova-nd_flugg.root" FluxType:"simple_flux" GenFlavors:[12,14,-12,-14] GlobalTimeOffset:10000 MixerBaseline:0 MixerConfig:"none" MonoEnergy:2 POTPerSpill:5e13 PassEmptySpills:"true" RandomSeed:0 RandomTimeOffset:10000 SurroundingMass:0 TargetA:12 TopVolume:"vDetEnclosure" ZCutOff:0 module_label:"geniegen" module_type:"GENIEGen"} photrans:{ApplyFiberSmearing:"true" AttenPars:[3.137e-1,2.895e2,1.669e-1,8.523e2] BirksConstant:9e-2 CollectionEff:8.23e-8 EmissionTau:9 FiberIndexOfRefraction:1.59 FilterModuleLabel:"geantgen" MessageLevel:0 NoAttenNoPoissonMode:"false" PhotonsPerMeV:1.2e9 QuantumEff:8.5e-1 SmearingHistoFile:"PhotonTransport/Dt_per_z_distribution.root" Step:2 TimeClustering:2 TimeSpread:2 module_label:"photrans" module_type:"SimpleTransport"} rsim:{ADCMaxPE:2800 APDExcessNoiseFactor:2.5 ASICBaselineInADCCounts:500 ASICFallTime_Far:2000 ASICFallTime_Near:500 ASICRiseTime_Far:250 ASICRiseTime_Near:60 Clocktick:6.25e1 DetectorCapacitance:10 EmptyCellNoiseFile_FD:"ReadoutSim/emptycellnoise_fd.root" EmptyCellNoiseFile_ND:"ReadoutSim/emptycellnoise_nd.root" FPGAAlgorithm:"DualCorrelatedSampling" FPGA_DualCorrelatedSampling_ThresholdADC_Far:2.05e1 FPGA_DualCorrelatedSampling_ThresholdADC_Near:2.25e1 FPGA_MatchedFiltering_BaselineTime_Far:-1500 
FPGA_MatchedFiltering_BaselineTime_Near:-500 FPGA_MatchedFiltering_EndTime_Far:10000 FPGA_MatchedFiltering_EndTime_Near:2500 FPGA_MatchedFiltering_ThresholdMatchValue_Far:1.87e1 FPGA_MatchedFiltering_ThresholdMatchValue_Near:1.87e1 Gain:100 GlobalTimeOffset:0 LeakageCurrent:1e-1 NumClockticksInSpill:500 NumNormSamples:100000 OversampleFactor:10 PhotonModuleLabel:"photrans" RandomTimeOffset:0 VoltageNoiseDensity:1.5 module_label:"rsim" module_type:"SimpleReadout"}} simulate:["geniegen","geantgen","photrans","rsim"] stream1:["out1"] trigger_paths:["simulate"]} process_name:"Genie" services:{RandomNumberGenerator:{} TFileService:{closeFileFast:false fileName:"genie_hist.root"} Timing:{} user:{Geometry:{BigBoxRange:1500 BigBoxUsed:"false" GDML:"Geometry/gdml/neardet.gdml" ROOT:"Geometry/gdml/neardet.root"} RandomHandler:{}}} source:{maxEvents:10 module_label:"source" module_type:"EmptyEvent"} trigger_paths:{trigger_paths:["simulate"]}

If you are happy with the configuration, simply invoke:

$ nova -c job/prodgenie.fcl

without the preamble that set the environment variable in the previous invocation.

Why did my job fail?

Did you try running a job according to the instructions above and get nothing, or a seg fault? Look at the screen output to see if you can track down the reason. You can also take a look at the Trouble Shooting and Gotchas page for known common problems. If you ever need to report a bug to the list, you should attach the entire output to the email.