Bug #25320

When a critical artdaq process dies AND the trace script returns nonzero, DAQInterface hangs

Added by John Freeman 3 months ago. Updated about 2 months ago.

Status: Reviewed
Priority: Normal
Assignee: -
Category: -
Target version: -
Start date: 12/14/2020
Due date:
% Done: 100%
Estimated time:
Experiment: -
Co-Assignees:
Duration:

Description

On December 6th, ICARUS run 3894 hung after the following sequence of events:

  • A boardreader process died
  • As expected, upon realizing this, DAQInterface entered the recover transition, designed to wind down the artdaq processes as cleanly as possible
  • Because a trace script (specified by the DAQINTERFACE_TRACE_SCRIPT variable) was configured to run on start, stop and recover, DAQInterface called it
  • For reasons irrelevant to this issue, the trace script returned nonzero, i.e., an error state
  • DAQInterface responded by throwing an exception that was never caught, which abruptly cancelled the recover transition and left DAQInterface hung

The next time (A) a critical process dies, and (B) the trace script returns in an error state, DAQInterface should continue winding down cleanly.
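
The desired behavior can be illustrated with a minimal Python sketch. This is not the code from the fix itself: the function names run_trace_script and recover, the wind_down_steps parameter, and the way the script is launched are all hypothetical, and only the DAQINTERFACE_TRACE_SCRIPT variable comes from this issue. The idea is simply that a failing trace script is logged and then ignored, so the wind-down still runs.

```python
import logging
import os
import shlex
import subprocess

logger = logging.getLogger("recover_sketch")


def run_trace_script(transition, env=None):
    """Run the script named by DAQINTERFACE_TRACE_SCRIPT, if one is set.

    Hypothetical helper: returns the script's exit code, or None when no
    script is configured or it could not be launched. A nonzero exit code
    is logged as a warning and never raised as an exception.
    """
    env = os.environ if env is None else env
    script = env.get("DAQINTERFACE_TRACE_SCRIPT")
    if not script:
        return None
    try:
        # check=False: a nonzero exit code must not raise CalledProcessError
        result = subprocess.run(shlex.split(script), check=False)
    except OSError as exc:
        logger.warning("trace script could not be run during %s: %s",
                       transition, exc)
        return None
    if result.returncode != 0:
        logger.warning("trace script returned %d during %s; continuing",
                       result.returncode, transition)
    return result.returncode


def recover(wind_down_steps, env=None):
    """Hypothetical recover transition.

    The trace script is treated as best-effort, so every wind-down step
    still runs even when the script fails.
    """
    run_trace_script("recover", env)
    for step in wind_down_steps:
        step()
```

With this structure, a trace script failure costs only a log message, and the steps passed to recover (stand-ins here for whatever shuts down the remaining artdaq processes) are still executed.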

Associated revisions

Revision 051abea6 (diff)
Added by John Freeman 3 months ago

JCF: Issue #25320: if a critical artdaq process dies and the trace script call returns nonzero, don't hang

History

#1 Updated by John Freeman 3 months ago

  • % Done changed from 0 to 100
  • Status changed from New to Resolved

Resolved with feature/25320_handle_process_death_and_trace_error (commit 051abea6c9289f54335f36b940b4b7facca6a1e4)

#2 Updated by Wesley Ketchum about 2 months ago

We've validated that this works in SBN-FD.

#3 Updated by John Freeman about 2 months ago

  • Status changed from Resolved to Reviewed
