DAQInterface should handle SIGHUP, SIGTERM, etc. as gracefully as possible
This Issue is motivated by Eric's Issue #22095, in which he observed that when using run_demo.sh, not all artdaq processes (especially datalogger) always got cleaned up. Perhaps related: running the demo config with component01 and component02 on woof, I've found that if DAQInterface is in the running state and receives a SIGHUP or a SIGTERM, the python script that is DAQInterface disappears but the artdaq processes (and, in "pmt" mode, pmt.rb) remain. This happens whether DAQInterface is controlling processes in "pmt" or "direct" mode. If you relaunch and then try to run again with the same processes, DAQInterface will complain, clean up the processes, and put itself back in the "stopped" state, but there's of course no guarantee in the real world that anyone will take this action after DAQInterface is unexpectedly killed. The possibility of DAQInterface catching kill signals and then gracefully winding down active artdaq processes should be investigated.
JCF: improve the logic used to ensure that daqinterface.py itself dies after it's cleaned everything up after catching a signal (see Issue #22146 comments from today)
JCF: implement Ron's suggested change so that DAQInterface aliases to an executable script, not a sourced script
As you can see from the diff, this is a very simple change. I've
performed a few regression tests and confirmed the following remains true:
-You can run two DAQInterface instances in the background at the same
time on separate partitions in the same terminal, using the
DAQINTERFACE_PARTITION_NUMBER environment variable to control which
one you're sending transitions to.
-If you close the terminal DAQInterface is running in, or hit Ctrl-c
on it (if it's running in the foreground), then the root file closes
correctly (Issue #22146)
-Output goes simultaneously to the screen and to the file referred to
#1 Updated by John Freeman about 2 years ago
- % Done changed from 0 to 100
- Status changed from New to Resolved
Resolved at the head of the feature/issue22146_handle_signals branch, commit a9bbd7dcaa1a27c09798f88ef4d0c1a7f82b9576.
DAQInterface will now enter the recover transition (i.e., sending a stop and then a shutdown to artdaq processes found in the running state before killing them, most notably resulting in a correctly-saved root file) if it receives any of the following signals:
-SIGINT, meaning DAQInterface is running in the foreground in a terminal and you hit Ctrl-c
-SIGHUP, meaning you close the terminal DAQInterface is running in
-SIGTERM, meaning you kill DAQInterface (by ignoring the are-you-sure warning you get when DAQInterface isn't in the "stopped" state but you try killing it via the
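A minimal sketch of how one handler could be registered for the three signals listed above, using Python's standard signal module. The names install_recover_handlers and recover are hypothetical and not taken from daqinterface.py; recover stands in for whatever starts the recover transition.

```python
import signal

# The three signals named in the list above.
KILL_SIGNALS = (signal.SIGINT, signal.SIGHUP, signal.SIGTERM)


def install_recover_handlers(recover):
    """Route each kill signal to `recover` (a hypothetical callback).

    Returns a dict mapping each signal to the handler that was installed
    before this call, so the caller can restore it later.
    """

    def handle_kill_signal(signum, frame):
        # Python runs this in the main thread once the signal arrives.
        recover(signum)

    previous = {}
    for sig in KILL_SIGNALS:
        # signal.signal returns the handler it replaced.
        previous[sig] = signal.signal(sig, handle_kill_signal)
    return previous
```

Note that signal.signal can only be called from the main thread of the main interpreter, so a call like install_recover_handlers(my_recover_function) would need to happen during startup rather than in a worker thread.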
#2 Updated by Eric Flumerfelt about 2 years ago
I've noticed a few cases where closing the DAQInterface window has led to a python process remaining active, with /tmp/daqinterface-$USER/DAQInterface_partition*.log showing no activity. I'm not sure what the workaround might be, other than making sure to proceed with default handlers after running the DAQInterface signal handler...
def_term_handler = signal.SIG_DFL
def_hup_handler = signal.SIG_DFL
def_int_handler = signal.SIG_DFL
+++ def_term_handler(signum, stack)
+++ else if signum ...
def_term_handler = signal.signal(signal.SIGTERM, handle_kill_signal)
#3 Updated by John Freeman about 2 years ago
To address Eric's findings: as of commit 3d6d95ef1a9b10169285b8bab25de68f2e024752 on feature/issue22146_handle_signals, after putting itself through the recover transition, DAQInterface calls the default signal handler and then, as an insurance policy, calls os._exit, which is a harder exit than sys.exit (it ends the process immediately rather than raising SystemExit).
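The sequence described above (clean up, fall through to the default signal behavior, then hard-exit as a fallback) might look roughly like the sketch below. This is not the code from the commit. It assumes one particular way of "calling the default handler": since signal.SIG_DFL is a constant rather than a callable, the sketch restores the default disposition and re-sends the signal to the current process. The names make_kill_handler and recover are hypothetical, and the 128 + signum exit code is just the usual shell convention.

```python
import os
import signal


def make_kill_handler(recover):
    """Build a handler that cleans up and then makes sure the process dies.

    `recover` is a hypothetical zero-argument cleanup callback, standing in
    for the recover transition.
    """

    def handle_kill_signal(signum, frame):
        try:
            recover()
        finally:
            # Restore the OS default action for this signal and re-send it,
            # so the process ends the way it would have with no handler.
            signal.signal(signum, signal.SIG_DFL)
            os.kill(os.getpid(), signum)
            # Insurance policy: only reached if the default action did not
            # terminate the process. os._exit skips SystemExit handling.
            os._exit(128 + signum)

    return handle_kill_signal
```

A handler built this way would be registered with, for example, signal.signal(signal.SIGTERM, make_kill_handler(my_recover_function)). Because the process then dies from the signal itself, a parent process sees it as killed by that signal rather than as a normal exit.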