Bug #21897

One event per subrun appears to fail if a fragment generator in Buffer mode is added

Added by John Freeman over 1 year ago. Updated over 1 year ago.

Known Issues

This phenomenon first appeared on the novabeamlinedaq test stand, but I've been able to recreate it using ToySimulators. Using artdaq v3_03_02 and DAQInterface at the current head of its develop branch, 19670a6037c44bf096f7c9d085ce7935f7957b1d, if I modify simple_test_config/subconfigs/dataloggers/datalogger_art_standard.fcl so that it supports many (60) subruns per file, like on novabeamlinedaq:

-         maxEvents: 1000
-         maxRuns: 1
-         maxSize: 8.192e6
-         maxSubRuns: 1
+         maxSubRuns: 60

...and then run with only the component_one_event_per_subrun boardreader and the configuration subconfigs/component_pull subconfigs/component_push subconfigs/component_special subconfigs/dataloggers subconfigs/dispatchers subconfigs/eventbuilders subconfigs/metrics_disabled subconfigs/routingmaster_disabled, then things work as expected: there's one event per subrun, and each event's number matches its subrun number. However, if I then add in component_buffer_mode, we appear to stay stuck on subrun #1. It would be good to understand this phenomenon so it can be fixed.

Associated revisions

Revision 8bf2aa91 (diff)
Added by Michael Wallbank over 1 year ago

JCF: compressed form of Eric's Issue #21897 bugfix, grafted onto v3_03_01


#1 Updated by Eric Flumerfelt over 1 year ago

  • Assignee set to Eric Flumerfelt
  • Status changed from New to Resolved
  • Category set to Known Issues

This actually turned out to be much harder to resolve than initially thought. Essentially, the limitations of the "rollover NOW" model were being exposed by running in this mode, and I had to re-think the mechanism for rolling over subruns.

On artdaq/bugfix/21897_SMEM_SubrunRolloverFix, I have completely redone the way that subrun rollovers are handled. Now, the EndOfSubrunFragment informs SMEM not only of the sequence ID at which the subrun number should change, but also the subrun number it should change to. This allows SMEM to keep track of which subrun a given sequence ID belongs in, and will resolve issues with out-of-order Fragments being placed in the wrong subrun.
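To illustrate the bookkeeping described above, here is a minimal sketch of how a sequence ID could be mapped to a subrun once each rollover carries both the boundary sequence ID and the new subrun number. This is not the actual SMEM code: the class SubrunTracker and its methods addRollover and subrunFor are hypothetical names chosen for this example, and the real implementation may store this information differently.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>

// Hypothetical stand-in for the bookkeeping described in this comment;
// NOT the artdaq SMEM implementation.
class SubrunTracker {
public:
    using SequenceID = std::uint64_t;
    using SubrunID = std::uint32_t;

    // Every sequence ID belongs to first_subrun until a rollover says otherwise.
    explicit SubrunTracker(SubrunID first_subrun = 1) {
        boundaries_[0] = first_subrun;
    }

    // Record what an end-of-subrun marker announces: starting at first_seq,
    // events belong to new_subrun.
    void addRollover(SequenceID first_seq, SubrunID new_subrun) {
        assert(new_subrun > 0);
        boundaries_[first_seq] = new_subrun;
    }

    // Look up the subrun for any sequence ID, regardless of arrival order.
    SubrunID subrunFor(SequenceID seq) const {
        auto it = boundaries_.upper_bound(seq);  // first boundary strictly after seq
        return std::prev(it)->second;            // safe: key 0 is always present
    }

private:
    std::map<SequenceID, SubrunID> boundaries_;
};
```

Because the lookup depends only on the recorded boundaries, a Fragment with a low sequence ID that arrives after later rollovers have been registered would still be assigned to its original subrun, which is the property a "rollover NOW" scheme cannot provide.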

I have also changed the logic in ToySimulator on branch artdaq-demo/bugfix/21897_ToySimulator_SubrunRolloverFix to generate EndOfSubrun Fragments with appropriate sequence ID and subrun information (one of the changes is that the EndOfSubrun Fragment expects the new subrun number to be in the Fragment's timestamp field, something which we will have to make sure to thoroughly document).

#2 Updated by Eric Flumerfelt over 1 year ago

I have tested these changes with rollover intervals of 1-5 events, and made sure that the appropriate events ended up in the appropriate subruns.

#3 Updated by John Freeman over 1 year ago

  • % Done changed from 0 to 100

On Friday, Eric and I looked at a problem on the novabeamlinedaq teststand: even though it was using the bugfix/21897_SMEM_SubrunRolloverFix code, the subrun # still wasn't automatically incrementing with each new event. The teststand consists of a fragment generator pushing fragments at a rate of one per minute, and a fragment generator run in request_mode: buffer producing them at roughly 2 per second (not minute). Eric determined that the problem could be solved by having getNext_() always return the data fragment with sequence ID "N" together with the end-of-subrun fragment with sequence ID "N+1". For that reason, I've changed the logic in ToySimulator::getNext_() in bugfix/21897_SMEM_SubrunRolloverFix so that it does the same thing (commit df3dbcaa2858091b29657f6c878209cd9f9e06f1). The root files produced in runs 2308-2312 on mu2edaq (5 events each run, with two fragment generators at rates and request modes analogous to novabeamlinedaq) indicate things look fine; details are in /home/jcfree/run_records/2308, etc.
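The pairing described above can be sketched roughly as follows, for a generator that produces one event per subrun. This is only an illustration under stated assumptions: the Fragment struct, the OneEventPerSubrunGenerator class, and the newSubrunOf helper are hypothetical and do not reproduce the artdaq Fragment class or the real ToySimulator::getNext_() code. The one convention taken from this Issue is that the end-of-subrun fragment carries the new subrun number in its timestamp field.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class FragmentType { Data, EndOfSubrun };

// Hypothetical, simplified fragment; the real artdaq Fragment has more fields.
struct Fragment {
    FragmentType type;
    std::uint64_t sequence_id;
    std::uint64_t timestamp;  // for EndOfSubrun: the NEW subrun number
};

// Read the new subrun number out of an end-of-subrun fragment.
inline std::uint64_t newSubrunOf(const Fragment& f) {
    assert(f.type == FragmentType::EndOfSubrun);
    return f.timestamp;
}

// Hypothetical generator: every call emits data fragment N and, in the same
// call, the end-of-subrun fragment N+1 announcing the next subrun.
class OneEventPerSubrunGenerator {
public:
    bool getNext(std::vector<Fragment>& out) {
        const std::uint64_t n = next_sequence_id_++;
        out.push_back({FragmentType::Data, n, n});  // data timestamp is illustrative
        ++current_subrun_;
        out.push_back({FragmentType::EndOfSubrun, n + 1, current_subrun_});
        return true;
    }

private:
    std::uint64_t next_sequence_id_ = 1;
    std::uint64_t current_subrun_ = 1;
};
```

Emitting both fragments from a single call means the rollover boundary is never separated from the data fragment that precedes it, whatever the relative rates of the other fragment generators.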

#4 Updated by John Freeman over 1 year ago

  • Status changed from Resolved to Reviewed

I've merged bugfix/21897_ToySimulator_SubrunRolloverFix into the develop branch in the central repo. Since the code change turns out to be trivial enough to inspect by eye, I'm treating the develop branch as validated by the same tests, described above, that validated bugfix/21897_ToySimulator_SubrunRolloverFix, and I consider this Issue to have been reviewed.

#5 Updated by John Freeman over 1 year ago

We're not 100% done: for artdaq, bugfix/21897_SMEM_SubrunRolloverFix needs to be merged into develop, and it appears there are code conflicts...

#6 Updated by John Freeman over 1 year ago

I've resolved the conflict mentioned in the last comment after the merge. The current head of artdaq develop is now 45485d2e0be13fd829bd7042d357a0bb0787fe19. I've successfully tested this artdaq using runs 2326 through 2330.

#7 Updated by Eric Flumerfelt over 1 year ago

  • Target version set to artdaq v3_04_00
  • Status changed from Reviewed to Closed
