Bug #21083

RoutingMaster does not retry socket connections or report errors

Added by Eric Flumerfelt 11 months ago. Updated 2 days ago.

Status: Reviewed
Priority: Normal
Category: Known Issues
Target version: -
Start date: 10/09/2018
Due date:
% Done: 0%
Estimated time:
Experiment: -
Co-Assignees:
Duration:

Description

RoutingMasterCore should print error messages to TRACE and MessageFacility when socket connections fail. It should also retry a reasonable number of times to establish sockets, especially in the case where the RoutingMaster is started asynchronously to other artdaq applications.

Fixes implemented in artdaq/bugfix/RoutingMasterCore_RetrySocketConnection.
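The retry-and-report behavior requested above can be sketched roughly as follows. This is a minimal illustration, not the code in the bugfix branch: the helper name `retry_with_logging`, its parameters, and the log wording are all hypothetical, and a plain `std::ostream` stands in for TRACE and MessageFacility. The `attempt` callable is assumed to wrap whatever socket setup (for example the bind or connect call) the RoutingMaster performs, returning true on success.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <iostream>
#include <thread>

// Hypothetical helper (not part of artdaq): calls `attempt` until it returns
// true or `max_attempts` tries have been made. Every failure is written to
// `log`, so a failed socket setup is never silent.
// Returns true on success, false if every attempt failed.
bool retry_with_logging(const std::function<bool()>& attempt,
                        int max_attempts,
                        std::chrono::milliseconds delay,
                        std::ostream& log = std::cerr)
{
    assert(max_attempts > 0);  // at least one try is required

    for (int i = 1; i <= max_attempts; ++i)
    {
        if (attempt())
        {
            return true;
        }
        log << "Socket setup failed (attempt " << i << " of "
            << max_attempts << ")\n";

        // Wait before the next try, but not after the final failure.
        if (i < max_attempts)
        {
            std::this_thread::sleep_for(delay);
        }
    }

    log << "Giving up after " << max_attempts << " attempts\n";
    return false;
}
```

A caller would pass a lambda that performs the actual socket setup, for example `retry_with_logging([&] { return open_socket(); }, 5, std::chrono::milliseconds(500))`, where `open_socket` is a placeholder for the real call, and treat a false return as a reportable error rather than continuing. The fixed delay is a simplification; the attempt count and delay would need to be tuned to how long the other artdaq applications take to start.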

History

#1 Updated by Kurt Biery 10 months ago

Eric,
What were the conditions in which the creation of the socket connection failed? Port already in use? Something else?
Thanks,
Kurt

#2 Updated by Eric Flumerfelt 10 months ago

I believe the most common cause was a "port already in use" error that silently crashed the RoutingMaster.

#3 Updated by Ron Rechenmacher 2 days ago

  • Status changed from Resolved to Reviewed

I can't reproduce any specific error. I started one instance of routing_master and then a second one on a different RPC port. I sent the standard config to the first, and sent the second both the same config and a tweaked config, to see whether I could get the second instance to trigger some of the new TLOGs.
To send the RPC commands, I used:

cd $MRB_TOP/run_records/75
sed_script='H;${x;s/\n//g;s/\([][{}\",:]\)/\\\1/g;p}'
xmlrpc_str_param=`grep -v '^#' RoutingMaster1.fcl | sed -n "$sed_script"`
xmlrpc http://localhost:14106/RPC2 daq.init "s/$xmlrpc_str_param" 

Since this is just adding TLOGs, I'm going to mark reviewed and merge into develop.


