Art crashes in Dispatcher when buffers overwritten
On ICARUS, some Dispatcher art process crashes were recently noticed (http://dbweb0.fnal.gov/ECL/sbnfd/E/show?e=19489).
On inspection of the core files, I found a hole in the logic where an event map containing nullptr to DataFragments could be passed to ArtdaqInputHelper, causing a segfault.
This happens when the buffer is taken away from the reader after the reader has retrieved the Fragment types, but while it is still copying the data out of the shared memory into the art process's memory.
I have patched the hole: ArtdaqSharedMemoryService will now retry the read, as it already does when the buffer gets taken before the Fragment types are read.
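To illustrate the kind of retry described above, here is a minimal sketch. It is not the actual artdaq code: `EventMap`, `ReadAttempt`, `isComplete`, and `readEventWithRetry` are hypothetical names invented for this example, and the real ArtdaqSharedMemoryService logic may differ. The idea is only that a map holding any null entry is discarded and the read is attempted again, so a partial map never reaches the consumer.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-ins; none of these are the real artdaq types.
using FragmentType = std::uint8_t;
using Payload = std::vector<std::uint8_t>;
using EventMap = std::map<FragmentType, std::unique_ptr<Payload>>;

// One read attempt. An entry is left null when the buffer was taken
// away before the data for that Fragment type could be copied out.
using ReadAttempt = std::function<EventMap()>;

// True only if every entry in the map points at real data.
bool isComplete(const EventMap& event) {
    for (const auto& entry : event) {
        if (!entry.second) return false;
    }
    return true;
}

// Retry the read until a complete, non-empty map is obtained or the
// attempts run out. A map with a null entry is never handed to the caller.
bool readEventWithRetry(const ReadAttempt& attempt, int maxAttempts, EventMap& out) {
    for (int i = 0; i < maxAttempts; ++i) {
        EventMap event = attempt();
        if (!event.empty() && isComplete(event)) {
            out = std::move(event);
            return true;
        }
        // Buffer was overwritten mid-copy: drop the partial result and retry.
    }
    return false;
}
```

The caller only ever sees `out` populated when every entry is non-null, which is the property the segfault violated.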
#2 Updated by Eric Flumerfelt 2 months ago
I have addressed issues in SharedMemoryManager on this branch: artdaq-core:bugfix/25211_SMM_NoDetachFromMarkBufferEmpty. Basically, when the original situation occurs, the ArtdaqSharedMemoryService still has to try to release the buffer. That release call into SharedMemoryManager failed (as expected, since we already knew the buffer was in a bad state), but the failure then caused SharedMemoryManager to detach from the shared memory, which halted the connected art process.
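The difference between the two behaviours can be sketched with a toy model. This is not the artdaq-core implementation: `ToySharedMemoryManager`, its methods, and the `detachOnFailedRelease` switch are all hypothetical and exist only to contrast "detach when a release fails" with "report the failure and stay attached".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only; names are hypothetical, not artdaq-core's.
enum class BufferState { Empty, Full, Reading };

constexpr int kNoOwner = -1;

class ToySharedMemoryManager {
public:
    // detachOnFailedRelease = true models the problematic behaviour,
    // false models the behaviour after the fix.
    ToySharedMemoryManager(std::size_t nBuffers, bool detachOnFailedRelease)
        : states_(nBuffers, BufferState::Empty),
          owners_(nBuffers, kNoOwner),
          detachOnFailedRelease_(detachOnFailedRelease) {}

    bool isAttached() const { return attached_; }

    // Writer side: (re)fill a buffer, taking it from any reader that held it.
    void fill(std::size_t buffer) {
        states_.at(buffer) = BufferState::Full;
        owners_.at(buffer) = kNoOwner;
    }

    // Reader side: claim a full buffer for reading.
    bool claimForRead(std::size_t buffer, int readerId) {
        if (!attached_ || buffer >= states_.size()) return false;
        if (states_[buffer] != BufferState::Full) return false;
        states_[buffer] = BufferState::Reading;
        owners_[buffer] = readerId;
        return true;
    }

    // Reader side: release a buffer. A stale release (the reader no longer
    // owns the buffer) always returns false; whether it also detaches
    // depends on the policy chosen at construction.
    bool markBufferEmpty(std::size_t buffer, int readerId) {
        if (!attached_ || buffer >= states_.size()) return false;
        if (states_[buffer] != BufferState::Reading || owners_[buffer] != readerId) {
            if (detachOnFailedRelease_) attached_ = false;
            return false;
        }
        states_[buffer] = BufferState::Empty;
        owners_[buffer] = kNoOwner;
        return true;
    }

private:
    std::vector<BufferState> states_;
    std::vector<int> owners_;
    bool detachOnFailedRelease_;
    bool attached_ = true;
};
```

With the policy set to `false`, a reader whose buffer was reclaimed gets a failed release but can go on to claim and release other buffers; with `true`, the same failed release leaves it detached and unable to read anything further.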