In the event of ssh problems, DAQInterface should avoid hangs
During the boot transition, DAQInterface uses ssh for several purposes: to determine the names of the artdaq process logfiles, to test the sourcing of the DAQ setup script on a different node, etc. If ssh requires a password under the user's account, DAQInterface will appear to hang, since it cannot enter the password interactively. This hang should be prevented with a timeout, followed by an informative error message explaining why DAQInterface was unable to proceed with the boot transition.
#1 Updated by John Freeman 10 months ago
- % Done changed from 0 to 100
- Status changed from New to Resolved
Resolved with commit 5e42e28fcdfe48abf64be2460edcb2968e30ae6f at the head of bugfix/23403_avoid_ssh_hangs. The ssh calls in the boot transition now run with a 30-second timeout instead of potentially hanging. If a call returns 124, indicating that the timeout was reached, you'll see a message like the following:
Nonzero value (124) returned in attempt to source script /home/jcfree/artdaq-demo_v3_06_00/setupARTDAQDEMO on host "mu2edaq05"; returned value suggests that the ssh call to mu2edaq05 timed out. Perhaps a lack of public/private ssh keys resulted in ssh asking for a password?
(if the failure occurred when trying to source the DAQ setup script on a different node)
Returned value of 124 suggests that the ssh call to mu2edaq05 timed out. Perhaps a lack of public/private ssh keys resulted in ssh asking for a password?
(if the failure occurred when trying to mkdir -p the logfile directories on a different node)
Either way, if this occurs DAQInterface will return you to the Stopped state rather than hanging.
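For anyone wanting to reproduce the behavior outside DAQInterface, the approach can be sketched roughly as follows. This is not the code from the commit; it assumes the ssh call is wrapped in the coreutils timeout utility (which exits with status 124 when the limit is reached), and the names build_ssh_command, describe_ssh_failure and run_remote are hypothetical helpers made up for illustration.

```python
import subprocess

SSH_TIMEOUT_S = 30         # timeout quoted in this ticket
TIMEOUT_EXIT_STATUS = 124  # exit status of coreutils `timeout` when the limit is hit


def build_ssh_command(host, remote_cmd, timeout_s=SSH_TIMEOUT_S):
    """Return an argv list that runs remote_cmd on host under `timeout`."""
    # Assumption: the `timeout` utility from coreutils is on the PATH.
    return ["timeout", str(timeout_s), "ssh", host, remote_cmd]


def describe_ssh_failure(status, host):
    """Return a diagnostic string for a nonzero status, or None on success."""
    if status == 0:
        return None
    if status == TIMEOUT_EXIT_STATUS:
        return ("Returned value of %d suggests that the ssh call to %s timed out. "
                "Perhaps a lack of public/private ssh keys resulted in ssh asking "
                "for a password?" % (status, host))
    # Any other nonzero status is reported without guessing at the cause.
    return "Nonzero value (%d) returned by ssh call to %s" % (status, host)


def run_remote(host, remote_cmd, timeout_s=SSH_TIMEOUT_S):
    """Run remote_cmd on host; return (status, message-or-None)."""
    status = subprocess.call(build_ssh_command(host, remote_cmd, timeout_s))
    return status, describe_ssh_failure(status, host)
```

A caller would check the message returned by run_remote and, if it is not None, print it and abandon the transition instead of waiting indefinitely.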