Output-file handling (2.01.00 and newer)

RootOutput configuration

All RootOutput modules can be configured to switch to a new output file when one or more criteria are met. The configuration parameters that may be specified are described below. The specific set of parameters depends on which version of art is used. For a full list of the RootOutput configuration parameters supported by your particular version, type art --print-description RootOutput.


art versions 2.02.00 and newer

The file-switching behavior is specified in a fileProperties block in the RootOutput configuration table. The following configuration is supported:

out : {

   module_type : RootOutput

   fileName : "myFile.root" 

   # ...

   fileProperties: {

      maxEvents: <unsigned>  # default is unbounded
      maxSubRuns: <unsigned>  # default is unbounded
      maxRuns: <unsigned>  # default is unbounded
      maxInputFiles: <unsigned>  # default is unbounded

      # Maximum size of file (in KiB)
      maxSize: <unsigned>  # default is unbounded

      # Maximum age of output file (in seconds)
      maxAge: <unsigned>  # default is unbounded

      granularity: [Event | SubRun | Run | InputFile | Job]  # default is "Event" 
   }
}

The granularity parameter specifies the level at which an output file may close. By default, the granularity is set to Event, which is the finest granularity available. To ensure that only full SubRuns or full Runs (as determined by the input file) are written to an output file, set the granularity to SubRun or Run, respectively. The max* parameters can be specified simultaneously. See below for some examples.

# FHiCL file snippet
...
physics.e1: [o1, o2, o3, o4]

outputs : {

   o1 : { # Only one output file for entire process
      module_type : RootOutput
      fileName : "out.root" 
   }

   o2 : { # Switch to new output file at each Run boundary
      module_type : RootOutput
      fileName : "out_r%R.root" 
      fileProperties: {
         maxRuns: 1
         granularity: Run
      }
   }

   o3 : { # Switch to new output file at the next SubRun after (at least) 
          # 1000 events have been written to the output file
      module_type : RootOutput
      fileName : "out_%#.root" 
      fileProperties: {
         maxEvents: 1000
         granularity: SubRun
      }
   }

   o4 : { # Switch to new output file if a new input file has been reached OR
          # 1000 events have been written to the output file.
      module_type : RootOutput
      fileName : "out_%#.root" 
      fileProperties: {
         maxEvents: 1000
         maxInputFiles: 1
      }
   }
}

Note that specifying a granularity other than Event is primarily useful for delaying output file-rollover until a new object of the specified granularity has been reached.
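
The examples above do not exercise maxSize or maxAge. A minimal sketch combining them with a coarser granularity might look as follows. The module label o5 and the numeric values are illustrative only, and the "OR" reading of combined max* criteria is taken from the o4 comment above.

```
o5 : { # Switch to new output file at the next SubRun after the file has
       # reached about 2,000,000 KiB OR has been open for one hour
   module_type : RootOutput
   fileName : "out_%#.root"
   fileProperties: {
      maxSize: 2000000   # KiB (illustrative value)
      maxAge: 3600       # seconds (illustrative value)
      granularity: SubRun
   }
}
```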


Beta configuration: art versions 2.01.00, 2.01.01, and 2.01.02

maxEventsPerFile

This parameter specifies the maximum number of events written to an output file, not the number of events processed when writing to the output file. If the SelectEvents clause is included in the output-module configuration, the event counting will be performed only over events that satisfy the SelectEvents criterion.
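
As a sketch of how the event counting interacts with event selection, consider the configuration below. The trigger-path name p1, the module label and the file name are hypothetical, and the exact form of the SelectEvents parameter is an assumption that may differ between art versions (check art --print-description RootOutput).

```
o1 : {
   module_type : RootOutput
   fileName : "selected_%#.root"
   SelectEvents : [ p1 ]     # hypothetical path name; syntax assumed
   maxEventsPerFile : 500    # counts only events that pass p1
   fileSwitch : {
      boundary : Event
   }
}
```

With such a configuration, a file switch would be requested once 500 selected events have been written, however many events were processed to obtain them.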

maxFileSize

The maxFileSize parameter specifies the maximum size (in kB) that an open output file may have at the time its size is queried. Whenever the file-size is queried, its value represents all quantities that have been compressed and written to the file. The reported value does not (indeed, cannot) include the contributions from various art metadata that are not written to the file until just before file close; nor does the reported value include the contributions from TTree branches whose basket buffers have not yet been flushed to disk. For that reason, if an output module requests a file switch because the maximum size of the file has been reached, the size of the closed file will exceed the specified maxFileSize value. It is therefore incumbent on the user to account for this behavior.

maxAge

The maxAge parameter specifies the maximum age (in seconds) that an output file may be open before switching to a new output file.
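
Because the closed file ends up larger than maxFileSize, one approach is to choose a value somewhat below the intended ceiling. The sketch below assumes that maxFileSize and maxAge sit at the top level of the module configuration, by analogy with maxEventsPerFile in the example further down. The module label and the numeric values are illustrative.

```
out : {
   module_type : RootOutput
   fileName : "out_%#.root"
   maxFileSize : 1800000   # kB; chosen below the intended ceiling (illustrative)
   maxAge : 3600           # seconds (illustrative)
   fileSwitch : {
      boundary : Event     # a boundary must be set; the default (Unset) never switches
   }
}
```

How much headroom is needed depends on the size of the metadata and unflushed baskets for the job in question, so the margin here is a placeholder rather than a recommendation.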

fileSwitch.boundary

This parameter specifies the level at which file-switching may take place. Allowed boundary values are Event, SubRun, Run, InputFile and Unset. The default setting is Unset, which implies that an output module never switches to a new file. If a user specifies a value of (e.g.) SubRun, then if the output module requests the opportunity to close, that request will not be granted until the next SubRun is reached. This allows the user to control the degree to which events, sub-runs, and runs are contained in a given output file.

fileSwitch.force

By default, the value of this parameter is false. If set to true, the output module is forced to switch output files at the next instance of the boundary specified. For example, if a user specifies:

fileSwitch : {
   boundary : InputFile
   force : true
}

then the output module will switch to a new output file whenever the current input file is closed and a new one is opened. It is a configuration error to specify 'force: true' if the boundary has a value of Unset.
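
For illustration, a minimal sketch of the configuration error described above (the module label bad and the file name are hypothetical):

```
bad : {
   module_type : RootOutput
   fileName : "out_%#.root"
   fileSwitch : {
      force : true   # configuration error: boundary is not given, so it is Unset
   }
}
```

Adding a boundary value other than Unset (for example, boundary : InputFile) to the fileSwitch table removes the error.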

Example

In the sample configuration file below, three output modules are configured to switch output files independently of each other.

# FHiCL file snippet
...
physics.e1: [o1, o2, o3]

outputs : {

   o1 : {  # Same behavior as before
      module_type : RootOutput
      fileName : "out.root" 
   }

   o2 : { # Switch to new output file at each Run boundary
      module_type : RootOutput
      fileName : "out_r%R.root" 
      fileSwitch : {
         force : true
         boundary : Run
      }
   }

   o3 : { # Switch to new output file at the next SubRun after (at least) 
          # 1000 events have been written to the file
      module_type : RootOutput
      fileName : "out_%#.root" 
      maxEventsPerFile : 1000
      fileSwitch : {
         boundary : SubRun
      }
   }
}

For output module o1, no output-file switching has been specified, so the process will create one output file. For o2, the output file switches whenever a new Run is reached, with each file named according to the pattern "out_r%R.root". The last output module is the most complicated one: since the maxEventsPerFile value is 1000, and the fileSwitch.boundary value is SubRun, the output module will switch to a new output file at the start of the next SubRun after (at least) 1000 events have been written to the file.
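
For readers moving from the beta configuration to art 2.02.00 or newer, the forced InputFile switch shown earlier appears to correspond to a maxInputFiles setting in the fileProperties block. The sketch below is inferred by analogy with the o2 examples on this page and is not an official migration table. The labels out_beta and out_new are hypothetical, and only one of the two forms would be used, depending on the art version.

```
# Beta (2.01.00 to 2.01.02): force a switch at every input-file boundary
out_beta : {
   module_type : RootOutput
   fileName : "out_%#.root"
   fileSwitch : {
      boundary : InputFile
      force : true
   }
}

# 2.02.00 and newer: assumed equivalent
out_new : {
   module_type : RootOutput
   fileName : "out_%#.root"
   fileProperties: {
      maxInputFiles: 1
      granularity: InputFile
   }
}
```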


Output-file handling and (Sub)Run products

Such flexibility, as described above, introduces output-file switching at potentially arbitrary times, splitting up runs and sub-runs into fragments. To be able to accommodate such fragmentation, and for the user to be able to interpret the Run and SubRun products correctly, infrastructure has been put in place that ties each product to the appropriate set of events or sub-runs. This is discussed here.