Bug #23147

just_repeat_run.sh doesn't care what node localhost refers to

Added by John Freeman 3 months ago. Updated 23 days ago.

Status: Reviewed
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 08/20/2019
Due date:
% Done: 100%
Estimated time:
Experiment: -
Co-Assignees:
Duration:

Description

Right now, just_repeat_run.sh refuses to run without the "--nostrict" option if:

  • The artdaq-based code has been changed since the run to be repeated
  • The known boardreaders list pointed at by DAQINTERFACE_KNOWN_BOARDREADERS_LIST differs from the list used during the run to be repeated

However, just_repeat_run.sh doesn't check whether DAQInterface is running on a different host than the one used for the original run. This can be problematic, since "localhost" then refers to a different node than it did during the original run. just_repeat_run.sh should account for this, and possibly require --nostrict if users try to repeat a run while running DAQInterface on a different node than was used for the original run.
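The requested check could look roughly like the sketch below. It is only an illustration, not the script's actual code: the function name host_check_allows_repeat and its three arguments (recorded host, current host, and a true/false nostrict flag) are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from just_repeat_run.sh): decide whether a
# repeat may proceed, given the host recorded for the original run and the
# host we are on now.
# Arguments: <recorded_host> <current_host> <nostrict: true|false>
host_check_allows_repeat() {
    local recorded_host="$1"
    local current_host="$2"
    local nostrict="$3"

    if [[ "$recorded_host" == "$current_host" ]]; then
        return 0   # same host: "localhost" means the same node as before
    fi

    # Report the mismatch on stderr regardless of the mode.
    echo "Host differs: run used ${recorded_host}, now on ${current_host}" >&2

    # A mismatch only blocks the repeat in strict mode.
    [[ "$nostrict" == "true" ]]
}
```

A caller might invoke it as host_check_allows_repeat "$recorded_host" "$(hostname)" false and abort on a nonzero return.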

Associated revisions

Revision 094f9734 (diff)
Added by John Freeman 24 days ago

JCF: Issue #23147, improvements to just_repeat_run.sh

The main complaint in the issue, that just_repeat_run.sh didn't care
if DAQInterface is being run on a host different than the one it ran
on during the run-to-be-repeated, is addressed in that this difference
is treated the same as a code difference or a known boardreaders list
difference: it won't run unless --nostrict is selected.

However, I've improved the output in that now, even if --nostrict HAS
been selected, all deviations are printed out. This prevents a
scenario in which someone selects --nostrict because they don't care
about one particular deviation, and then other deviations they may
care about get suppressed.
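The "print every deviation, then decide" behavior described in this commit message can be sketched as follows. This is a minimal illustration under assumptions, not the committed code: report_deviations, its arguments, and the message wording are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: print every deviation first, then make a single
# proceed/refuse decision at the end.
# Arguments: <nostrict: true|false> [deviation message ...]
report_deviations() {
    local nostrict="$1"
    shift   # the remaining arguments are the deviation messages

    local msg
    for msg in "$@"; do
        echo "DEVIATION: ${msg}"
    done

    # Proceed if nothing differed, or if the user accepted the differences.
    if (( $# == 0 )) || [[ "$nostrict" == "true" ]]; then
        return 0
    fi

    echo "Refusing to repeat the run; pass --nostrict to override."
    return 1
}
```

Because the messages are printed before the nostrict flag is consulted, a user who passes --nostrict for one deviation still sees all the others.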

History

#1 Updated by John Freeman 24 days ago

  • % Done changed from 0 to 100
  • Status changed from New to Resolved

Now, if you execute just_repeat_run.sh on a node other than the one DAQInterface ran on during the run you're trying to repeat, you'll see a message like the following, which is self-explanatory:

Checking that DAQInterface is being run on the same host as was used for run 2993...
A difference was found between the host DAQInterface was run on for
run 2993 (mu2edaq11.fnal.gov) and the host you're
currently on (mu2edaq13). Consequently, any artdaq
process specified to run on "localhost" in either the boot file or the
known boardreaders list won't run on mu2edaq11.fnal.gov,
unlike run 2993. Unless you're running this script with the
--nostrict option, this attempt to repeat run 2993 will not
proceed.

#2 Updated by Eric Flumerfelt 23 days ago

  • Status changed from Resolved to Reviewed
  • Co-Assignees Eric Flumerfelt added

While trying to test this against an old run I had lying around, I got the following notification:

A difference was found between the host DAQInterface was run on for
run 2 (/home/eflumerf/Desktop/artdaq-mrb-base/DAQInterface) and the host you're
currently on (ironwork.fnal.gov).

Otherwise, code checks out, feature works correctly.

#3 Updated by John Freeman 23 days ago

Thanks for taking a look. Around a week ago I updated DAQInterface so that it prepends the host to the working directory saved in metadata.txt, so if you try to repeat new runs, the script will say something along the lines of

A difference was found between the host DAQInterface was run on for
run 2 (mu2edaq13) and the host you're
currently on (ironwork.fnal.gov).

which is a bit more helpful in that it actually tells you what the host was for the old run.
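One way a script could tell old-style entries (a bare path) from new-style ones (host prepended) is sketched below. The "host:/path" layout is purely an assumption for illustration, since the thread does not show the actual format stored in metadata.txt, and recorded_host_from_entry is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical: the recorded value is ASSUMED to look like "host:/path".
# The real metadata.txt layout may differ.
# Prints the host and returns 0, or returns 1 if no host prefix is present.
recorded_host_from_entry() {
    local entry="$1"

    # A bare absolute path, or a value with no colon, carries no host prefix
    # (for example, an entry written before the host was prepended).
    if [[ "$entry" == /* || "$entry" != *:* ]]; then
        return 1
    fi

    # Keep everything before the first colon.
    printf '%s\n' "${entry%%:*}"
}
```

With a guard like this, a repeat of an older run could report that the original host is unknown instead of printing the working directory in its place.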


