Bug #21856

eventbuilder_diskwriting and combined_eb_and_dl simple_test_configs do not work in v3_03_02

Added by Eric Flumerfelt 9 months ago. Updated 8 months ago.

Status: Reviewed
Priority: Normal
Assignee:
Category: artdaq-daqinterface
Target version: -
Start date: 02/06/2019
Due date:
% Done: 100%
Estimated time:
Experiment: -
Co-Assignees:
Duration:

Description

Apparently, the bookkeeping has changed in a way that breaks the previously-working eventbuilder_diskwriting example. If the combined_eb_and_dl example is not supported behavior, it should be removed.

The problem appears to be in generating sources/destinations tables when an expected layer of the artdaq system is missing. In eventbuilder_diskwriting, there are no DataLoggers, but there is a Dispatcher. In combined_eb_and_dl, there are no EventBuilders.

Associated revisions

Revision ed68950e (diff)
Added by John Freeman 9 months ago

JCF: if a subsystem has eventbuilders and dispatchers but no dataloggers, connect the eventbuilder to the dispatcher

This is in response to Issue #21856, where Eric noticed that runs with
the eventbuilder_diskwriting configuration had problems in that events
weren't making it from the eventbuilder to the dispatcher (no
dataloggers exist in the run). With this commit, I've added logic to
bookkeeping whereby, if DAQInterface determines there are no
dataloggers in a subsystem in question, then if there are dispatchers
they'll expect to receive events from eventbuilders rather than
dataloggers.

History

#1 Updated by John Freeman 9 months ago

eventbuilder_diskwriting should definitely be fixed as it should be possible to send events to dispatchers from eventbuilders. However, thinking about combined_eb_and_dl, perhaps the convention should be that if we have just one process doing the work of an eventbuilder and a datalogger, it should be an eventbuilder - in fact, this is the convention which was used on protoDUNE. Also, having said that, it sounds like eventbuilder_diskwriting already covers that functionality. Perhaps this makes combined_eb_and_dl redundant, in which case I'll delete it from the configs list.

#2 Updated by John Freeman 9 months ago

  • % Done changed from 0 to 100

With branch feature/issue21856_eventbuilders_can_send_to_dispatchers, I've done two things to DAQInterface:

1) When bookkeeping, if DAQInterface determines that a subsystem has eventbuilders and dispatchers but not dataloggers, it'll set up the sources and destinations tables so events go from eventbuilders to dispatchers. This has gotten the eventbuilder_diskwriting configuration to work (see, e.g., mu2edaq01:/home/jcfree/run_records/2119)

2) I've removed the combined_eb_and_dl configuration, for the reasons I described above
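
The rule in item 1 can be sketched roughly as follows. This is an illustration only, not the code in bookkeeping.py: the `Proc` tuple, the `dispatcher_upstream` function, and the process labels are hypothetical names, and only the "dataloggers if present, otherwise eventbuilders" decision is taken from the description above.

```python
from collections import namedtuple

# Hypothetical stand-in for DAQInterface's per-process record.
Proc = namedtuple("Proc", ["name", "label", "subsystem"])


def dispatcher_upstream(procs, subsystem):
    """Return the labels of the processes that dispatchers in `subsystem`
    should receive events from: dataloggers if any exist, else eventbuilders."""
    in_subsystem = [p for p in procs if p.subsystem == subsystem]
    dataloggers = [p.label for p in in_subsystem if p.name == "DataLogger"]
    if dataloggers:
        return dataloggers
    return [p.label for p in in_subsystem if p.name == "EventBuilder"]


# Layout resembling eventbuilder_diskwriting: no DataLoggers, one Dispatcher.
diskwriting = [
    Proc("BoardReader", "component01", "1"),
    Proc("EventBuilder", "EventBuilder1", "1"),
    Proc("EventBuilder", "EventBuilder2", "1"),
    Proc("Dispatcher", "Dispatcher1", "1"),
]
# The same layout with a DataLogger layer added.
with_datalogger = diskwriting + [Proc("DataLogger", "DataLogger1", "1")]

print(dispatcher_upstream(diskwriting, "1"))
print(dispatcher_upstream(with_datalogger, "1"))
```

Exercising both layouts matters here: the first takes the "no dataloggers" branch, the second takes the branch in which dataloggers remain the dispatchers' upstream.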

#3 Updated by John Freeman 9 months ago

  • Status changed from New to Resolved

#4 Updated by Eric Flumerfelt 9 months ago

  • Co-Assignees Eric Flumerfelt added

Code review: No issues found

#5 Updated by Eric Flumerfelt 9 months ago

During testing, I get a segfault in the Dispatcher art process when it starts.

#6  0x00007f819655a0d2 in __assert_fail () from /lib64/libc.so.6
#7  0x00007f819be54f8d in art::Principal::Principal (this=0x3e107c0, pc=..., hist=..., presentProducts=..., mapper=..., reader=...)
    at /scratch/workspace/art-release-build/SLF7/debug/build/art/v2_11_01/src/art/Framework/Principal/Principal.cc:75
#8  0x00007f819be4b699 in art::EventPrincipal::EventPrincipal (this=0x3e107c0, aux=..., pc=..., presentProducts=..., history=..., mapper=..., rtrv=..., lastInSubRun=false)
    at /scratch/workspace/art-release-build/SLF7/debug/build/art/v2_11_01/src/art/Framework/Principal/EventPrincipal.cc:28
#9  0x00007f80838b834e in art::SourceHelper::makeEventPrincipal (this=0x3dce728, eventAux=..., history=...)
    at /scratch/workspace/art-release-build/SLF7/debug/build/art/v2_11_01/src/art/Framework/IO/Sources/SourceHelper.cc:87
#10 0x00007f800160ae5c in art::ArtdaqInput<art::NetMonWrapper>::readAndConstructPrincipal (this=0x3dce7c8, msg=..., msg_type_code=4, inR=0x85185b0, inSR=0x87e67c0, outR=@0x7ffdc5fd3048: 0x0, 
    outSR=@0x7ffdc5fd3040: 0x0, outE=@0x7ffdc5fd3038: 0x0) at /home/eflumerf/Desktop/artdaq-mrb-base/srcs/artdaq/artdaq/ArtModules/ArtdaqInput.hh:416
#11 0x00007f8001602b92 in art::ArtdaqInput<art::NetMonWrapper>::readNext (this=0x3dce7c8, inR=0x85185b0, inSR=0x87e67c0, outR=@0x7ffdc5fd3048: 0x0, outSR=@0x7ffdc5fd3040: 0x0, outE=@0x7ffdc5fd3038: 0x0)
    at /home/eflumerf/Desktop/artdaq-mrb-base/srcs/artdaq/artdaq/ArtModules/ArtdaqInput.hh:591

Principal.cc:

Principal::Principal(ProcessConfiguration const& pc,
                     ProcessHistoryID const& hist,
                     cet::exempt_ptr<ProductTable const> presentProducts,
                     std::unique_ptr<BranchMapper>&& mapper,
                     std::unique_ptr<DelayedReader>&& reader)
  : processConfiguration_{pc}
  , presentProducts_{presentProducts}
  , branchMapperPtr_{std::move(mapper)}
  , store_{std::move(reader)}
{
  if (!hist.isValid()) {
    return;
  }
  assert(!ProcessHistoryRegistry::empty());
  ProcessHistory ph;
  bool const found[[gnu::unused]]{ProcessHistoryRegistry::get(hist, ph)};
  assert(found); // Line 75
  std::swap(processHistory_, ph);
}

#6 Updated by Eric Flumerfelt 9 months ago

I think this might be due to the fact that the Dispatcher is receiving events from two different art processes with different configuration.

[eflumerf@ironwork artdaq-mrb-base]$ cd run_records/1/
[eflumerf@ironwork 1]$ ls
boot.txt  component01.fcl  component02.fcl  Dispatcher1.fcl  EventBuilder1.fcl  EventBuilder2.fcl  known_boardreaders_list.txt  metadata.txt  ranks.txt  setup.txt
[eflumerf@ironwork 1]$ diff EventBuilder*
2c2
< #   Input  : ./EventBuilder1.fcl
---
> #   Input  : ./EventBuilder2.fcl
67c67
<       fileName: "/home/eflumerf/Desktop/artdaq-mrb-base/daqdata/artdaqdemo_eb00_r%06r_sr%02s_%to_%#_eb1.root" 
---
>       fileName: "/home/eflumerf/Desktop/artdaq-mrb-base/daqdata/artdaqdemo_eb00_r%06r_sr%02s_%to_%#_eb2.root" 

#7 Updated by Eric Flumerfelt 9 months ago

I get the following errors trying to run a different configuration:

Fri Feb  8 16:21:07 CST 2019: CONFIG transition underway
Config name: demo

Obtaining FHiCL documents...done (0.0 seconds).
Bookkeeping the FHiCL documents...Traceback (most recent call last):
  File "/home/eflumerf/Desktop/artdaq-mrb-base/srcs/artdaq_daqinterface/rc/control/daqinterface.py", line 1607, in do_config
    self.bookkeeping_for_fhicl_documents()
  File "/home/eflumerf/Desktop/artdaq-mrb-base/srcs/artdaq_daqinterface/rc/control/bookkeeping.py", line 376, in bookkeeping_for_fhicl_documents_artdaq_v3_base
    "\n } \n" + \
  File "/home/eflumerf/Desktop/artdaq-mrb-base/srcs/artdaq_daqinterface/rc/control/bookkeeping.py", line 302, in create_sources_or_destinations_string
    elif not procinfo_subsystem_has_dataloggers and "Dispatcher" in procinfo_to_check.name and nodetype == "destinations":
UnboundLocalError: local variable 'procinfo_subsystem_has_dataloggers' referenced before assignment

#8 Updated by John Freeman 9 months ago

Concerning the local variable 'procinfo_subsystem_has_dataloggers' referenced before assignment error: the problem there is this code snippet from bookkeeping.py:

        procinfo_subystem_has_dataloggers = True
        if len([pi for pi in self.procinfos if pi.subsystem == procinfo.subsystem and pi.name == "DataLogger"]) == 0:
            procinfo_subsystem_has_dataloggers = False

...where on that first line, I leave out an "s" in "subsystem". The upshot of this is that if a subsystem DOES have a datalogger (like in the demo config, which I didn't test with, but unlike in the eventbuilder_diskwriting config, which I did test with), the procinfo_subsystem_has_dataloggers variable doesn't get set. Fixed by commit 79ef998d00eeb80272622cbed35fc036235a5ca2 on feature/issue21856_eventbuilders_can_send_to_dispatchers. Moral of the story: after a code change, test all control paths!
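
A stripped-down reproduction of this failure mode, as a sketch: the function names and the list-of-names argument are hypothetical simplifications, and only the misspelled assignment mirrors the snippet above. The second function shows one possible way to avoid the problem, by assigning the flag once from the condition so that no control path leaves it unset; it is not necessarily how the commit fixed it.

```python
def has_dataloggers_buggy(names):
    # Misspelled on purpose ("subystem"), mirroring the original snippet.
    procinfo_subystem_has_dataloggers = True
    if len([n for n in names if n == "DataLogger"]) == 0:
        procinfo_subsystem_has_dataloggers = False
    # The correctly spelled name is only bound when there are NO dataloggers.
    return procinfo_subsystem_has_dataloggers


def has_dataloggers_fixed(names):
    # A single assignment covers both control paths.
    return any(n == "DataLogger" for n in names)


no_dl = ["BoardReader", "EventBuilder", "Dispatcher"]
with_dl = ["BoardReader", "EventBuilder", "DataLogger", "Dispatcher"]

print(has_dataloggers_buggy(no_dl))  # the path that was tested
try:
    has_dataloggers_buggy(with_dl)  # the path that was not tested
except UnboundLocalError as err:
    print("UnboundLocalError:", err)

print(has_dataloggers_fixed(no_dl), has_dataloggers_fixed(with_dl))
```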

#9 Updated by Eric Flumerfelt 8 months ago

I have made a couple of changes to the FHiCL configuration for the eventbuilder_diskwriting simple_test_config, and changed the default for the disable_unique_rootfile_labels option. With these changes, the eventbuilder_diskwriting example now works correctly.

#10 Updated by Eric Flumerfelt 8 months ago

  • Status changed from Resolved to Reviewed

