Feature #4353

Feature #3956: We should protect against all module failures at end run so that files get closed correctly

Ensure that disk files always get closed no matter how the DAQ is shut down [art-related]

Added by Kurt Biery over 6 years ago. Updated over 6 years ago.

Status: New
Priority: High
Assignee:
Target version: -
Start date: 07/17/2013
Due date: 07/31/2013
% Done: 0%
Estimated time:
Duration: 15

Description

This has been a long-standing request that I am finally capturing in this Issue. Some related discussion was captured in Issue #3956.

As everyone can understand, this request is rather broad. However, there are some very concrete steps that can be taken to improve the reliability of closing disk files.

Here are some notes from a discussion on 02-July between Marc, Chris, and Kurt:

1) exceptions are already handled by art, but in the case of artdaq/ds50daq, art is run in a thread, and it may not be clearly defined how signals are sent to the different threads
1.1) a recommendation was made to set the thread signal mask so that only the main thread gets signals, and it puts the right thing(s) on the queue to tell art how to react
1.2) Jim's MPI/PMT shim may be needed to get the most reliability that we can
1.3) (internal) questions include: how could a fatal error in one part of the MPI program get turned into a graceful shutdown in another part?
1.4) Possible action items:
1.4.1) investigate/improve how signal and interrupt handling is done
1.4.2) improve the way that PMT responds to errors and signals, including Jim's shim
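
The recommendation in 1.1 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the pattern that art or artdaq actually implements: `Command`, `CommandQueue`, `worker`, and `run_signal_demo` are hypothetical names, and the worker thread merely stands in for the thread that runs art. Only POSIX calls (`pthread_sigmask`, `sigwait`, `raise`) and the C++ standard library are used. The signals are blocked before the worker is spawned, so the worker inherits the mask, and the main thread collects the signal synchronously with `sigwait` and turns it into a queue message.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>

#include <pthread.h>
#include <signal.h>

// Hypothetical message type; not an art or artdaq type.
enum class Command { Shutdown };

// Minimal thread-safe queue used to tell the worker how to react.
class CommandQueue {
 public:
  void push(Command c) {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      queue_.push(c);
    }
    ready_.notify_one();
  }

  Command pop() {
    std::unique_lock<std::mutex> lock(mutex_);
    ready_.wait(lock, [this] { return !queue_.empty(); });
    Command c = queue_.front();
    queue_.pop();
    return c;
  }

 private:
  std::mutex mutex_;
  std::condition_variable ready_;
  std::queue<Command> queue_;
};

// Stand-in for the thread that runs art: waits for a command, then shuts down.
void worker(CommandQueue& commands, std::atomic<bool>& files_closed) {
  if (commands.pop() == Command::Shutdown) {
    files_closed = true;  // assumption: output files would be closed here
  }
}

// Returns true if SIGTERM was turned into an orderly worker shutdown.
bool run_signal_demo() {
  sigset_t handled;
  sigemptyset(&handled);
  sigaddset(&handled, SIGINT);
  sigaddset(&handled, SIGTERM);

  // Block the signals before spawning threads; new threads inherit this mask,
  // so no worker thread is ever interrupted by them.
  sigset_t previous;
  int rc = pthread_sigmask(SIG_BLOCK, &handled, &previous);
  assert(rc == 0);
  (void)rc;

  CommandQueue commands;
  std::atomic<bool> files_closed{false};
  std::thread art_like_thread(worker, std::ref(commands), std::ref(files_closed));

  // Demo only: in a real run the signal would come from outside the process.
  raise(SIGTERM);

  // The main thread receives the pending signal synchronously.
  int received = 0;
  rc = sigwait(&handled, &received);
  assert(rc == 0);

  commands.push(Command::Shutdown);
  art_like_thread.join();

  // Restore the original mask; the signal was already consumed by sigwait.
  pthread_sigmask(SIG_SETMASK, &previous, nullptr);
  return files_closed.load() && received == SIGTERM;
}
```

Calling `run_signal_demo()` from a `main` function (built with thread support, for example `-pthread`) exercises the whole path. Using `sigwait` instead of an asynchronous handler keeps all reaction logic in ordinary thread context, which avoids the restrictions on what a signal handler may safely call.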

In further discussions, the following concrete tasks were identified:
  1. Document signal handling within art, and ensure via tests that response to signals within art executables is consistent and as desired
  2. Document the pattern that should be used by executables that run art in a thread as part of a broader application and investigate whether existing artdaq/ds50daq executables are currently following this pattern. The goal is to have "signal handling within artdaq/ds50daq executables consistent and sufficient to lead to an orderly shutdown of the executables (including any art threads) as quickly as possible". [quote from an email from Chris]

Related issues

Related to art - Feature #4356: Document the pattern that artdaq applications should use to correctly handle signals [ds50daq-related] | Closed | 07/17/2013 | 07/31/2013

Related to art - Feature #4355: Document and verify the signal handling within art [ds50daq-related] | Closed | 07/17/2013 | 07/31/2013

History

#1 Updated by Kurt Biery over 6 years ago

This Issue is related to Issues #4355 and #4356 in the art project.


