Bug #25211

Art crashes in Dispatcher when buffers overwritten

Added by Eric Flumerfelt 2 months ago. Updated about 2 months ago.

Status: Closed
Priority: Normal
Category: -
Target version:
Start date: 11/16/2020
Due date:
% Done: 0%
Estimated time:
Experiment: ICARUS
Co-Assignees:
Duration:

Description

Several Dispatcher art process crashes were recently noticed on ICARUS (http://dbweb0.fnal.gov/ECL/sbnfd/E/show?e=19489).

On inspection of the core files, I found a hole in the logic where an event map containing nullptr to DataFragments could be passed to ArtdaqInputHelper, causing a segfault.
This happens when the buffer is taken away from the reader after it has retrieved the Fragment types but while it is still copying the data out of the shared memory into the art process's memory.

I have patched the hole: ArtdaqSharedMemoryService now retries the read, as it already does when the buffer gets taken before reading the Fragment types.
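
For readers unfamiliar with the failure mode, here is a minimal sketch of the retry idea in plain C++. It is not the artdaq code: EventMap, ReadStatus, all_entries_valid and read_event_with_retry are hypothetical names invented for illustration, and the real ArtdaqSharedMemoryService logic may differ. The point is only that a read is accepted when the copy completed and no map entry is null; otherwise it is discarded and attempted again.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the artdaq types.
using FragmentType = int;
using FragmentList = std::vector<int>;
using EventMap = std::map<FragmentType, std::unique_ptr<FragmentList>>;

// Outcome of one attempt to copy an event out of a shared buffer.
enum class ReadStatus { Ok, BufferLost };

// True when every entry in the map points at real data (no nullptr entries).
bool all_entries_valid(const EventMap& event) {
    for (const auto& entry : event) {
        if (!entry.second) return false;
    }
    return true;
}

// Call read_once up to max_attempts times. A result is accepted only if the
// read reported success AND the map holds no null entries. A partially
// filled map from a lost buffer is dropped and never handed downstream.
bool read_event_with_retry(const std::function<ReadStatus(EventMap&)>& read_once,
                           std::size_t max_attempts, EventMap& out) {
    assert(max_attempts > 0);
    for (std::size_t attempt = 0; attempt < max_attempts; ++attempt) {
        EventMap event;  // fresh map for each attempt
        if (read_once(event) == ReadStatus::Ok && all_entries_valid(event)) {
            out = std::move(event);
            return true;
        }
    }
    return false;  // caller decides what to do when every attempt failed
}
```

In this sketch the caller supplies the single-attempt read as a callable, so the retry policy stays separate from the shared-memory details.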

History

#1 Updated by Eric Flumerfelt 2 months ago

Change on artdaq:bugfix/25211_ASMS_RetryBufferWhenReadingFragments

#2 Updated by Eric Flumerfelt 2 months ago

I have addressed issues in SharedMemoryManager on this branch: artdaq-core:bugfix/25211_SMM_NoDetachFromMarkBufferEmpty. When the original situation occurs, ArtdaqSharedMemoryService still has to try to release the buffer. That release call failed in SharedMemoryManager (expected, since we already knew the buffer was in a bad state), but the failure then caused SharedMemoryManager to detach from the shared memory, which halted the connected art process.
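
As a rough illustration of the behavior change, here is a toy model in C++. ToyBufferManager and all of its members are hypothetical and do not reflect the SharedMemoryManager API; the sketch only assumes that releasing a buffer the reader no longer owns should be reported as a failure without detaching from the shared memory.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical buffer states for this toy model.
enum class BufferState { Empty, Reading };

// Toy manager, NOT the artdaq-core SharedMemoryManager.
class ToyBufferManager {
public:
    explicit ToyBufferManager(std::size_t buffer_count)
        : states_(buffer_count, BufferState::Empty),
          owners_(buffer_count, -1),
          attached_(true) {}

    bool attached() const { return attached_; }

    // Give a buffer to a reader. Calling this again with another reader id
    // models the buffer being taken away from the first reader.
    void claim(std::size_t index, int reader_id) {
        assert(index < states_.size());
        states_[index] = BufferState::Reading;
        owners_[index] = reader_id;
    }

    // Release a buffer. If the caller no longer owns it, return false and
    // leave the manager attached: a failed release is not treated as fatal.
    bool mark_buffer_empty(std::size_t index, int reader_id) {
        if (index >= states_.size() || owners_[index] != reader_id ||
            states_[index] != BufferState::Reading) {
            return false;  // benign failure, no detach
        }
        states_[index] = BufferState::Empty;
        owners_[index] = -1;
        return true;
    }

    // Detaching stays an explicit action that the caller requests.
    void detach() { attached_ = false; }

private:
    std::vector<BufferState> states_;
    std::vector<int> owners_;
    bool attached_;
};
```

Keeping detach() separate from the release path means a reader that lost its buffer can report the failure and carry on with the next read.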

#3 Updated by Eric Flumerfelt about 2 months ago

  • Target version set to artdaq v3_09_03
  • Status changed from Resolved to Closed
