mediumsystem_with_routing_master fails during integration testing
On mu2edaq12, the mediumsystem_with_routing_master configuration fails the 3-run x 60 s test. The problems appear to be related to stopping the DataLogger (DL) art process in the first run.
The following error message is sometimes printed by the DL art process at the start of the second run:
Error / ArtException
07-Feb-2019 10:46:10 CST mu2edaq12.fnal.gov (192.168.157.12)
UDPMessage 33 / PID 47210
art / PostEndJob / ModuleEndJob
cet::exception caught in art
---- OtherArt BEGIN
  ---- DataCorruption BEGIN
    readNext returned a new Run and Event without a SubRun
  ---- DataCorruption END
---- OtherArt END
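For readers unfamiliar with this message: it says the input source handed art a new Run and an Event in the same read, with no SubRun in between. The following is a minimal Python sketch of that kind of consistency check, for illustration only. The names `DataCorruption` and `check_read_next` and the three boolean flags are hypothetical and are not taken from the art or artdaq code.

```python
class DataCorruption(Exception):
    """Hypothetical stand-in for the DataCorruption error category."""


def check_read_next(new_run: bool, new_subrun: bool, new_event: bool) -> None:
    """Reject a read that yields a new Run and an Event but no SubRun.

    The flags describe which records a single read produced. This is an
    assumed simplification of the check, not the framework's actual logic.
    """
    if new_run and new_event and not new_subrun:
        raise DataCorruption(
            "readNext returned a new Run and Event without a SubRun"
        )


# A read that opens a Run, a SubRun, and an Event passes the check.
check_read_next(new_run=True, new_subrun=True, new_event=True)

# A read that jumps straight from a new Run to an Event is rejected.
try:
    check_read_next(new_run=True, new_subrun=False, new_event=True)
except DataCorruption as err:
    print(f"rejected: {err}")
```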
#3 Updated by Eric Flumerfelt over 1 year ago
- Assignee set to Eric Flumerfelt
- Status changed from New to Work in progress
- Category set to Known Issues
I have added the branch artdaq/bugfix/21863_ArtdaqInput_Tweaks, which resolves the error message seen by art. I believe the problems I am currently seeing are partly due to data being kept in the shared memory buffers between runs. This causes art to switch back and forth between the first run and the second, which exacerbates the disk-writing delay.
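To make the run-switching effect concrete, here is a minimal Python sketch, assuming each buffered event can be reduced to its run number. The function name and the sample sequences are hypothetical and only illustrate the idea; they are not measurements from this system.

```python
def count_run_switches(run_numbers):
    """Count how often consecutive events belong to different runs."""
    switches = 0
    for previous, current in zip(run_numbers, run_numbers[1:]):
        if previous != current:
            switches += 1
    return switches


# Hypothetical read order when the buffers are drained before run 2 starts.
clean_order = [1, 1, 1, 2, 2, 2]

# Hypothetical read order when run 1 data is still in the buffers
# after run 2 has started.
interleaved_order = [1, 1, 2, 1, 2, 2]

print(count_run_switches(clean_order))
print(count_run_switches(interleaved_order))
```

In this sketch the clean order has a single run transition, while the interleaved order has three. Each extra transition would correspond to art switching between the first and second run again.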
#8 Updated by John Freeman over 1 year ago
- Status changed from Resolved to Reviewed
Things are looking good at this point. I've added issue-specific comments to Issue #21870 and Issue #21869. Here, I'll add that art's "readNext" error hasn't shown up in my runs since I merged bugfix/21863_ArtdaqInput_Tweaks into /home/jcfree/artdaq-demo_test_fixes_to_v3_03_02/srcs/artdaq. The one dangling question about this config is covered in an issue I just added, Issue #21908, but it may not be a big deal...