Necessary Maintenance #18997
Verify that the run and subrun level data products are written during the proper art phase
Data products associated to the run and subrun may be written (that is,
put() into the art principal) either at the beginning or at the end of their reference period (run/subrun).
Each data product is associated to a range of events: the data products that are written at the beginning of the period are associated to the entire period, while the ones written at the end are associated to the exact range of events actually visited.
While the preferred practice is to write the product at the end of the period, there may be reasons to choose either behaviour.
Also, the range of associated events matters when processing multiple files, each of which has its own version of the data product, where several of them refer to the same period. An example is a summary data product collecting the POT of a run: one file
[A] might contain a summary for events 1 to 50 of run 1, another file
[B] a summary for events 51 to 100 of the same run 1, and a third
[C] again events 1 to 50 of run 1. When processing files [A] and
[B] together, the summary data should be aggregated into a record covering run 1 from event 1 to 100, while when processing files [A] and
[C] together the two records should be identical, and either can be used (although in this example the safer choice might be to forbid processing those two files together). If the information were saved as associated to the whole run, there would be no way to make that distinction.
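The three-file example can be sketched in standalone C++, with no art headers involved. This only illustrates the reasoning; it is not how art or LArSoft implement it. The names PotFragment and combine are hypothetical, and a single first/last event pair stands in for the event range that art tracks.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>

// Hypothetical stand-in for a run-level POT summary fragment.
// It is NOT the LArSoft class; the event range is reduced to one
// inclusive [firstEvent, lastEvent] interval for the sake of the sketch.
struct PotFragment {
  unsigned run;
  unsigned firstEvent;  // inclusive
  unsigned lastEvent;   // inclusive
  double pot;
};

// Combine two fragments of the same run.
// Returns std::nullopt when combining would be unsafe.
std::optional<PotFragment> combine(PotFragment const& a, PotFragment const& b) {
  assert(a.firstEvent <= a.lastEvent && b.firstEvent <= b.lastEvent);
  if (a.run != b.run) return std::nullopt;

  // Same range (files [A] and [C]): the records must be identical,
  // and either one can be used.
  if (a.firstEvent == b.firstEvent && a.lastEvent == b.lastEvent) {
    if (a.pot != b.pot) return std::nullopt;
    return a;
  }

  // Disjoint ranges (files [A] and [B]): the POT adds up.
  // Simplification: the result records the enclosing span of the two ranges.
  bool const disjoint = a.lastEvent < b.firstEvent || b.lastEvent < a.firstEvent;
  if (!disjoint) return std::nullopt;  // partial overlap would double count

  return PotFragment{a.run,
                     std::min(a.firstEvent, b.firstEvent),
                     std::max(a.lastEvent, b.lastEvent),
                     a.pot + b.pot};
}
```

If both fragments were instead tagged as covering the whole run, the "same range" branch would always be taken and the [A] plus [B] case could not be told apart from the [A] plus [C] case.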
This ticket is about verifying that the products in run and subrun from LArSoft code are put into art at the right time.
#2 Updated by Gianluca Petrillo over 2 years ago
- Status changed from New to Assigned
- Assignee set to Gianluca Petrillo
I have identified the following run/subrun products:
The first two data products are in LArSoft. Proper
aggregate() methods have been defined as the solution to issue #18943.
The other data product is effectively a
std::vector, so there is no way to add an
aggregate() function directly to the object. By default, art provides an aggregation that concatenates the vectors. The author of the class
MuCS::MuCSDTOffset (Matt Bass) should consider whether this is the desired behaviour. If not, the art experts will need to advise on how to direct art to a custom aggregation free function.
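To make the question concrete, here is a minimal standalone sketch of what concatenation does to two fragments that share an entry, next to one possible alternative. The Offset type, its channel and value fields, and both function names are hypothetical. They do not reflect the actual content of the MuCS class, nor the hook art would use for a custom aggregation.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical element type standing in for one entry of the
// vector-like data product; the fields are assumptions.
struct Offset {
  int channel;
  double value;
};

// What a default "concatenate the vectors" aggregation amounts to:
// entries present in both fragments end up duplicated.
std::vector<Offset> concatenate(std::vector<Offset> a, std::vector<Offset> const& b) {
  a.insert(a.end(), b.begin(), b.end());
  return a;
}

// One possible alternative: keep a single entry per channel,
// preferring the entry from the first fragment.
std::vector<Offset> mergeByChannel(std::vector<Offset> a, std::vector<Offset> const& b) {
  for (Offset const& candidate : b) {
    bool const known = std::any_of(a.begin(), a.end(), [&](Offset const& o) {
      return o.channel == candidate.channel;
    });
    if (!known) a.push_back(candidate);
  }
  return a;
}
```

With fragments holding channels {1, 2} and {2, 3}, concatenation yields four entries and the per-channel merge yields three. Which of the two is wanted depends on what the entries mean, and that is the decision left to the class author.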
That data product is written at the end of the run by
#3 Updated by Gianluca Petrillo over 2 years ago
- % Done changed from 0 to 70
put() into each subrun by the modules
TGMuon (argoneutcode:source:TGMuon/TGMuon_module.cc) and
In all cases, the
put() call happens at the end of the subrun.
No further action is needed.
#5 Updated by Gianluca Petrillo over 2 years ago
- % Done changed from 70 to 100
The data product
sumdata::RunData is written by 24 modules. All of them write it at the beginning of the run. In this way, the data product will be associated with the whole run.
This is an acceptable solution because the purpose of the data product is to capture configuration that does not change within a run.
The aggregation method requires that the configuration of the different fragments being aggregated is consistent. This is also the correct behaviour, as having two different configurations in the same run is incorrect. If this causes problems with the Monte Carlo simulation, either this choice or the experiment workflow should be reconsidered.
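The consistency requirement can be sketched as follows. This is a standalone illustration, not the sumdata::RunData code: the RunConfig type and its detectorName field are hypothetical stand-ins for whatever configuration the real product stores.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a run-level configuration record.
struct RunConfig {
  std::string detectorName;  // assumed field, for illustration only

  // Aggregating two fragments of the same run: the configuration must
  // match, and there is nothing to accumulate when it does.
  void aggregate(RunConfig const& other) {
    if (other.detectorName != detectorName) {
      throw std::runtime_error("inconsistent run configuration: '" +
                               detectorName + "' vs '" +
                               other.detectorName + "'");
    }
  }
};
```

Aggregating two fragments with the same configuration leaves the record unchanged. Fragments with different configurations raise an error instead of silently picking one, which matches the behaviour described above.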
#6 Updated by Gianluca Petrillo over 2 years ago
All the instances of run and subrun products in LArSoft have been checked, and no further action was required.
I have also checked the experiment code, and opened a ticket for the only such data product found (issue #18998).
This concludes the work on this issue.