Output-file handling (2.01.00 and newer)¶
RootOutput configuration¶
All RootOutput modules can be configured to switch to a new output file when one or more criteria are met. The following configuration parameters may be specified [1].
[1] The specific set of configuration parameters depends on which version of art is used. For a full list of the RootOutput configuration parameters supported by your particular version, type art --print-description RootOutput.
art versions 2.02.00 and newer¶
The file-switching behavior is specified in a fileProperties block in the RootOutput configuration table. The following configuration is supported:
out : {
  module_type : RootOutput
  fileName : "myFile.root"
  # ...
  fileProperties: {
    maxEvents: <unsigned>      # default is unbounded
    maxSubRuns: <unsigned>     # default is unbounded
    maxRuns: <unsigned>        # default is unbounded
    maxInputFiles: <unsigned>  # default is unbounded
    # Maximum size of file (in KiB)
    maxSize: <unsigned>        # default is unbounded
    # Maximum age of output file (in seconds)
    maxAge: <unsigned>         # default is unbounded
    granularity: [Event | SubRun | Run | InputFile | Job]  # default is "Event"
  }
}
The granularity parameter specifies the level at which an output file may close. By default, the granularity is set to Event, which is the finest granularity available. To ensure that only full SubRuns or full Runs (as determined by the input file) are written to an output file, set the granularity to SubRun or Run, respectively. The max* parameters can be specified simultaneously. See below for some examples.
# FHiCL file snippet
...
physics.e1: [o1, o2, o3, o4]
outputs : {
  o1 : { # Only one output file for entire process
    module_type : RootOutput
    fileName : "out.root"
  }
  o2 : { # Switch to new output file at each Run boundary
    module_type : RootOutput
    fileName : "out_r%R.root"
    fileProperties: {
      maxRuns: 1
      granularity: Run
    }
  }
  o3 : { # Switch to new output file at the next SubRun after (at least)
         # 1000 events have been written to the output file
    module_type : RootOutput
    fileName : "out_%#.root"
    fileProperties: {
      maxEvents: 1000
      granularity: SubRun
    }
  }
  o4 : { # Switch to new output file if a new input file has been reached OR
         # 1000 events have been written to the output file.
    module_type : RootOutput
    fileName : "out_%#.root"
    fileProperties: {
      maxEvents: 1000
      maxInputFiles: 1
    }
  }
}
Note that specifying a granularity other than Event is primarily useful for delaying output-file rollover until a new object of the specified granularity has been reached.
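The size and age limits can be combined with a granularity in the same way. As a minimal sketch using only the parameters listed in the fileProperties block, the hypothetical module o5 below requests a new file once about 2 GiB have been written or the file has been open for an hour, and defers the switch to the next SubRun. The label, file name, and limit values are illustrative assumptions, not recommendations, and the sketch assumes (as in the o4 example) that reaching either limit is sufficient to request a switch.

```fhicl
# Illustrative sketch only: label, file name, and limits are assumptions.
o5 : {
  module_type : RootOutput
  fileName : "out_%#.root"
  fileProperties: {
    maxSize: 2097152     # in KiB (2 * 1024 * 1024 KiB = 2 GiB)
    maxAge: 3600         # in seconds (1 hour)
    granularity: SubRun  # defer the switch to the next SubRun
  }
}
```

Like the other output modules, o5 would have to be placed inside the outputs table and listed in an end path (e.g. physics.e1) to take effect.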
Beta configuration: art versions 2.01.00, 2.01.01, and 2.01.02¶
maxEventsPerFile¶
This parameter specifies the maximum number of events written to an output file, not the number of events processed while the output file is open. If a SelectEvents clause is included in the output-module configuration, only events that satisfy the SelectEvents criterion are counted.
maxFileSize¶
The maxFileSize parameter specifies the maximum size (in kB) that an open output file may have at the time its size is queried. Whenever the file size is queried, the reported value covers only the data that have already been compressed and written to the file. It does not (indeed, cannot) include the contributions from various art metadata that are not written to the file until just before file close, nor the contributions from TTree branches whose basket buffers have not yet been flushed to disk. For that reason, if an output module requests a file switch because the maximum size of the file has been reached, the size of the closed file will exceed the specified maxFileSize value. Users should account for this behavior when choosing the limit.
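One way to account for the overshoot is to set maxFileSize somewhat below the size actually wanted. The sketch below assumes that maxFileSize is given at the top level of the module table, as maxEventsPerFile is in the example later in this section. The 2000000 kB target and the roughly 5% headroom are illustrative guesses, not measured values; the appropriate margin depends on the job.

```fhicl
# Illustrative sketch only (art 2.01.x beta configuration).
out : {
  module_type : RootOutput
  fileName : "out_%#.root"
  maxFileSize : 1900000   # in kB; assumed ~5% headroom below a 2000000 kB target
  fileSwitch : {
    boundary : Event      # the default (Unset) would never switch files
  }
}
```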
maxAge¶
The maxAge parameter specifies the maximum time (in seconds) that an output file may remain open before the output module switches to a new output file.
fileSwitch.boundary¶
This parameter specifies the level at which file-switching may take place. Allowed boundary values are Event, SubRun, Run, InputFile, and Unset. The default setting is Unset, which implies that an output module never switches to a new file. If a user specifies a value of (e.g.) SubRun, then a request by the output module to close its file will not be granted until the next SubRun is reached. This allows the user to control the degree to which events, sub-runs, and runs are contained in a given output file.
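As a minimal sketch of this deferral, the hypothetical configuration below combines an age limit with a Run boundary. It assumes that maxAge, like maxEventsPerFile, is placed at the top level of the module table; the label, file name, and one-hour value are illustrative.

```fhicl
# Illustrative sketch only (art 2.01.x beta configuration).
out : {
  module_type : RootOutput
  fileName : "out_%#.root"
  maxAge : 3600        # in seconds; a switch is requested after one hour
  fileSwitch : {
    boundary : Run     # the request is granted only when the next Run is reached
  }
}
```

Under these assumptions the module does not switch files at every Run; it switches only at the first Run boundary after the age limit has been exceeded.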
fileSwitch.force¶
By default, the value of this parameter is false. If set to true, the output module is forced to switch output files at the next instance of the specified boundary. For example, if a user specifies:
fileSwitch : {
  boundary : InputFile
  force : true
}
then the output module will switch to a new output file whenever the current input file is closed and a new one is opened. It is a configuration error to specify 'force: true' if the boundary has a value of Unset.
Example¶
In the sample configuration file below, three output modules are configured to switch output files independently of each other.
# FHiCL file snippet
...
physics.e1: [o1, o2, o3]
outputs : {
  o1 : { # Same behavior as before
    module_type : RootOutput
    fileName : "out.root"
  }
  o2 : { # Switch to new output file at each Run boundary
    module_type : RootOutput
    fileName : "out_r%R.root"
    fileSwitch : {
      force : true
      boundary : Run
    }
  }
  o3 : { # Switch to new output file at the next SubRun after (at least)
         # 1000 events have been written to the file
    module_type : RootOutput
    fileName : "out_%#.root"
    maxEventsPerFile : 1000
    fileSwitch : {
      boundary : SubRun
    }
  }
}
For output module o1, no output-file switching has been specified, so the process will create one output file. For o2, the output file switches whenever a new Run is reached, with each file named according to the pattern "out_r%R.root". The last output module is the most complicated one: since the maxEventsPerFile value is 1000 and the fileSwitch.boundary value is SubRun, the output module will switch to a new output file once 1000 events have been written to the file and the next SubRun begins.
Output-file handling and (Sub)Run products¶
The flexibility described above allows output-file switching at potentially arbitrary times, splitting runs and sub-runs into fragments. To accommodate such fragmentation, and to allow the user to interpret the Run and SubRun products correctly, infrastructure has been put in place that ties each product to the appropriate set of events or sub-runs. This is discussed here.