Example of product aggregation:
POT summary information

A real use case that benefits from product aggregation was raised by the NOvA experiment. In NOvA's workflow, SubRuns are atomic across files, i.e. a SubRun is never split across multiple art/ROOT files. Runs, however, are split across files, as is common in many experiments. NOvA has a SubRun product that records the number of protons on target (the POT summary). Because SubRuns are not split across files in NOvA's case, every retrieval of the POT-summary SubRun product gives meaningful information. Creating a POT-summary Run product would be problematic, however, since versions of art before art:version:"Arcturus" cannot easily present a Run-product value that corresponds to a complete, processed Run.

With art:version:"Arcturus", the mechanisms are in place to present to the user a POT summary that corresponds to the complete, processed Run, even if the Run is split across many art/ROOT files [1]. See below for an illustration.

[1] The same functionality exists in art:version:"Arcturus" for SubRun products where the SubRun spans multiple art/ROOT files.

Process 1

In the first step, four art processes are executed independently of each other, each with a distinct subrun number but the same run number. Each process creates a run product holding the total number of protons on target (POTSummary) seen for the span of events it processed, and writes that product to its single output file. This can be depicted with this diagram:

In the image above, the POTSummary run products are labeled with the corresponding run and subrun numbers (r1, sr[0-3]) to emphasize that the products were produced for a given set of events/subruns.

Process 2

The second process is a concatenation process in which the output files from process 1 ("p[0-3].root") are concatenated into a smaller set of output files ("q01.root" and "q23.root"). This is depicted as:

The products from the input files have been carried forward to the output files, along with the information that identifies the span of events to which each product corresponds. In other words, each of the q*.root output files now has two instances of a run-1 product.

Process 3

For the third process, the q*.root files from process 2 are concatenated, again carrying forward the POTSummary run products. Upon reading q01.root, art determines that there are two POTSummary products associated with run 1. In versions of art before art:version:"Arcturus", only the first product would have been retained. In art:version:"Arcturus" and newer, both products are retained; since their ranges of validity are disjoint, the products are aggregated according to the product's aggregation behavior, and their ranges of validity are combined. Because aggregation is performed per input file, over the POTSummary products in the file that is currently open, the resulting output file (r0123.root) will have two copies of the POTSummary: one from the aggregation over q01.root and one from the aggregation over q23.root.
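The disjointness check and aggregation described above can be illustrated with a small standalone sketch. It does not use art's own classes: the range of validity is reduced to a set of subrun numbers within one run, and the names SubRunSet, RunProduct, disjoint and combine are hypothetical. The body and signature of POTSummary::aggregate are also assumptions, since the actual NOvA class is not shown here. The sketch simply rejects overlapping ranges; how art itself treats that case is not covered on this page.

```cpp
#include <algorithm>
#include <set>
#include <stdexcept>

// Hypothetical stand-in for the POT summary product (assumed layout).
struct POTSummary {
  double totalPOT;
  // Assumed user-defined aggregation: POT counts from disjoint spans add up.
  void aggregate(POTSummary const& other) { totalPOT += other.totalPOT; }
};

// Simplified range of validity: the subruns of a single run that a product covers.
using SubRunSet = std::set<unsigned>;

// A run product together with the range it is valid for.
struct RunProduct {
  POTSummary pot;
  SubRunSet subruns;
};

// True if the two ranges share no subrun.
bool disjoint(SubRunSet const& a, SubRunSet const& b) {
  return std::none_of(a.begin(), a.end(),
                      [&b](unsigned sr) { return b.count(sr) != 0; });
}

// Aggregate two products with disjoint ranges and merge their ranges.
RunProduct combine(RunProduct a, RunProduct const& b) {
  if (!disjoint(a.subruns, b.subruns)) {
    throw std::invalid_argument("overlapping ranges of validity");
  }
  a.pot.aggregate(b.pot);
  a.subruns.insert(b.subruns.begin(), b.subruns.end());
  return a;
}
```

With these definitions, combining a product for (r1, sr0) with one for (r1, sr1) yields a single product whose POT is the sum of the two and whose range covers both subruns, which mirrors what happens when q01.root is read.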

Process 4

As in process 3, when r0123.root is read, two POTSummary products are detected; since their ranges of validity are disjoint, the products and their corresponding ranges are aggregated. The final output file (final.root) will then have one POTSummary Run product that corresponds to Run 1, SubRuns 0 through 3, having been aggregated according to the user-defined POTSummary::aggregate function.

N.B. It is not necessary to execute process 4 before performing a Run::getByLabel on the aggregated POTSummary product. It is sufficient to have produced the file "r0123.root": whenever that file is read and a user attempts to fetch the product, the aggregated product will be presented. The purpose of this illustration is to show that if a concatenation job is performed on just the "r0123.root" input file, the output file will contain only the aggregated product, and not the two separate products as in "r0123.root".
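The whole chain of processes 1 through 4 can be mimicked with a toy model that counts how many POTSummary instances each file holds. This is only a sketch under simplifying assumptions, not art's implementation: a "file" is just a list of run-1 products, ranges of validity are sets of subrun numbers assumed to be disjoint, and the names RunProduct, File, aggregateOnRead and concatenate are hypothetical.

```cpp
#include <set>
#include <vector>

// Hypothetical, simplified run product: a POT total plus the subruns it covers.
struct RunProduct {
  double pot;
  std::set<unsigned> subruns;
};

// A "file" is modeled as the list of run-1 POTSummary instances stored in it.
using File = std::vector<RunProduct>;

// What a reader is presented with for one input file:
// all stored instances aggregated into a single product.
RunProduct aggregateOnRead(File const& file) {
  RunProduct result{0., {}};
  for (auto const& p : file) {
    result.pot += p.pot;
    result.subruns.insert(p.subruns.begin(), p.subruns.end());
  }
  return result;
}

// A concatenation job: each input file contributes exactly one
// (aggregated) product instance to the output file.
File concatenate(std::vector<File> const& inputs) {
  File output;
  for (auto const& in : inputs) {
    output.push_back(aggregateOnRead(in));
  }
  return output;
}
```

In this model, concatenating four single-product files pairwise gives two files with two instances each (process 2), concatenating those gives one file with two instances (process 3), and concatenating that file alone gives one instance covering subruns 0 through 3 (process 4). Calling aggregateOnRead directly on the process-3 file returns the same fully aggregated product, which corresponds to the point made in the note above.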