Can the ProtoDUNE variant of DAQInterface be absorbed into standard DAQInterface?
Ever since it was decided 2-3 years ago that JCOP, rather than DAQInterface, would be the artdaq process control mechanism on ProtoDUNE, there's been a special ProtoDUNE variant of DAQInterface, resident on the feature/protodune branch. It differs from the standard DAQInterface, which is based on the develop branch, in that it makes assumptions about the files JCOP produces and accommodates some of the quirks of ProtoDUNE (more on this in a moment).
Ron suggested that it would be worth investigating the possibility of finding a way to extend/improve standard DAQInterface so that it can meet ProtoDUNE's needs. Reflecting on this, there are a few challenges which will need to be tackled, but it may be possible to do so. For example:
- Since JCOP controls the artdaq processes, ProtoDUNE DAQInterface needs to be told what the ranks of the processes are so that it can perform bookkeeping; standard DAQInterface assigns the ranks itself.
- On ProtoDUNE, there's a distinction between the Run Control partition number and the partition_number FHiCL parameter which DAQInterface overrides in the timing boardreader FHiCL document. Basically, the latter is the former modulo 4, so if we're on Run Control partition number 6, then partition_number in the timing boardreader gets set to 2. Standard DAQInterface doesn't have this concept.
- When standard DAQInterface bookkeeps unique root file labels, it distinguishes between eventbuilders and dataloggers as logging processes, i.e., it adds "_eb<N>" to eventbuilder-written root filenames and "_dl<N>" to datalogger-written root filenames. For ProtoDUNE DAQInterface, since we didn't want to confuse offline people when we switched over from using dataloggers to eventbuilders as the logging processes, the eventbuilder-written root files have "_dl<N>" added, not "_eb<N>".
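A minimal shell sketch of the partition-number and root-file-label rules described above, for discussion only. The function names and the "standard"/"protodune" mode argument are hypothetical and not part of DAQInterface; the sketch also assumes that dataloggers keep the "_dl<N>" suffix on ProtoDUNE.

```shell
# Hypothetical helper: map a Run Control partition number to the
# partition_number used in the timing boardreader (the former modulo 4).
timing_partition_number() {
    # $1: Run Control partition number (non-negative integer)
    echo $(( $1 % 4 ))
}

# Hypothetical helper: choose the unique root file label suffix.
# $1: logging process type ("eventbuilder" or "datalogger")
# $2: process index N
# $3: mode ("standard" or "protodune"); this switch is an assumption
rootfile_label_suffix() {
    case "$3:$1" in
        standard:eventbuilder)
            echo "_eb$2" ;;
        standard:datalogger|protodune:datalogger|protodune:eventbuilder)
            # On ProtoDUNE, eventbuilder-written files keep the "_dl" label
            echo "_dl$2" ;;
        *)
            echo "unknown mode or process type: $3:$1" >&2
            return 1 ;;
    esac
}

timing_partition_number 6                        # prints 2
rootfile_label_suffix eventbuilder 1 standard    # prints _eb1
rootfile_label_suffix eventbuilder 1 protodune   # prints _dl1
```

If standard DAQInterface grew a setting along these lines, the ProtoDUNE behavior would become a configuration choice rather than a separate branch.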
We should discuss the feasibility of getting standard DAQInterface to meet ProtoDUNE's needs.
#1 Updated by John Freeman about 2 years ago
Another issue to consider: the ProtoDUNE variant of DAQInterface uses bookkeeping which is very clearly ProtoDUNE-specific. E.g., it expects users to set "zmq_fragment_connection_out" in boot.txt, and then it uses that value not only to set the zmq_fragment_connection_out FHiCL parameter in the configurations, but also to set the TriggerRequestAddress parameter used by the SSP fragment generators, thanks to a request from Jen and Giovanna on March 11. We could just allow users to override any FHiCL parameters they want in the boot file, though the downside is that this would be open to abuse: experimenters might just set parameters on-the-fly in the boot file. That would reduce how much the config label used in a given run tells us about how the run was actually configured (see also my comments for Issue #19991).
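To make the trade-off concrete, here is a rough shell sketch of a generic "boot file value overrides a FHiCL parameter" step. It is not how DAQInterface does it: the function names are hypothetical, the boot file is assumed to hold "name: value" lines, and the FHiCL parameter is assumed to be a double-quoted string on a single line.

```shell
# Hypothetical helper: read the value of a "name: value" line from a boot file.
# $1: boot file path; $2: parameter name
boot_value() {
    sed -n "s/^$2:[[:space:]]*//p" "$1" | head -n 1
}

# Hypothetical helper: print a FHiCL document with the quoted string value
# of one parameter replaced. '|' is the sed delimiter because the values
# are expected to be addresses containing '/'.
# $1: FHiCL file path; $2: parameter name; $3: new value (without quotes)
override_fhicl_param() {
    sed "s|\($2[[:space:]]*:[[:space:]]*\)\"[^\"]*\"|\1\"$3\"|" "$1"
}
```

With helpers like these, the single boot-file value could be applied to both zmq_fragment_connection_out and TriggerRequestAddress by calling override_fhicl_param once per parameter name. The same mechanism would work for any parameter, which is exactly what makes it open to abuse.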
#2 Updated by John Freeman about 2 years ago
- % Done changed from 0 to 90
On the ProtoDUNE cluster, with the head of the feature/issue22137_daqinterface_usable_for_protodune branch, which was forked off of develop (3e28635914f740873c54f9f118a75e5d9f0165be), I'm able to run JCOP with ToySimulators and recreate the behavior I get with the ProtoDUNE variant of DAQInterface used on the cluster (v3_00_06o). See, e.g., np04-srv-024:/nfs/sw/artdaq/run_records/7501 and 7502. Note that to run the feature branch, I need to (temporarily!) overwrite the usual /nfs/sw/artdaq/DAQInterface/source_me5 with the contents of /nfs/sw/artdaq/DAQInterface/source_me5.john.
#3 Updated by John Freeman about 2 years ago
To the reviewer of this issue: in order to test this on the ProtoDUNE cluster, you'll need to use the DAQInterface in my working area for the issue branch, /nfs/home/np04daq/.jcfree/artdaq-utilities-daqinterface, rather than the standard DAQInterface which JCOP uses. If you're planning to run on partition <N>, the way to do this is:
cd /nfs/sw/artdaq/DAQInterface
cp -p source_me<N> source_me<N>.backup
cp -p source_me5.john source_me<N>
sed -r -i 's/DAQINTERFACE_PARTITION_NUMBER=5/DAQINTERFACE_PARTITION_NUMBER=<N>/' source_me<N>
...and then launch JCOP. Of course, remember to restore the original source_me<N> when you're done. I talked to Enrico and he's put an if statement into JCOP's logic so it knows whether it's dealing with the standard ProtoDUNE cluster DAQInterface or this issue's DAQInterface (https://its.cern.ch/jira/browse/NP04DAQ-17). When comparing the results of testing with the results from standard JCOP usage, be aware that this issue branch version of DAQInterface doesn't include the obsolete portOffset FHiCL variable that the ProtoDUNE variants (continue to) do.
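For reviewers who'd rather not type the swap and the restore by hand, the steps could be wrapped as below. This is only a sketch: the function names and the DAQI_DIR variable are illustrative, and the refusal to overwrite an existing backup is an added safety check, not part of the instructions above. It relies on GNU sed for -r -i.

```shell
# Directory holding the source_me<N> files; overridable for dry runs.
DAQI_DIR=${DAQI_DIR:-/nfs/sw/artdaq/DAQInterface}

# Swap in the issue-branch setup file for partition $1.
swap_in_issue_branch() {
    if [ -e "$DAQI_DIR/source_me$1.backup" ]; then
        echo "backup for partition $1 already exists; restore it first" >&2
        return 1
    fi
    cp -p "$DAQI_DIR/source_me$1" "$DAQI_DIR/source_me$1.backup" &&
    cp -p "$DAQI_DIR/source_me5.john" "$DAQI_DIR/source_me$1" &&
    sed -r -i "s/DAQINTERFACE_PARTITION_NUMBER=5/DAQINTERFACE_PARTITION_NUMBER=$1/" \
        "$DAQI_DIR/source_me$1"
}

# Put the original setup file for partition $1 back in place.
restore_original() {
    mv "$DAQI_DIR/source_me$1.backup" "$DAQI_DIR/source_me$1"
}
```

Usage would be "swap_in_issue_branch <N>", then launching JCOP and running the test, then "restore_original <N>".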
#5 Updated by Kurt Biery about 2 years ago
I've tested this new version of DAQInterface at protoDUNE in Partition 5 following John's instructions. I'll still need to test it in a partition with real hardware to verify that the Timing System partition number and the ZeroMQ offset will be correctly used by the appropriate BoardReaders. I'll try to do that tomorrow.
One issue that I noticed is that when I set disable_unique_rootfile_labels to false in this test, the suffix that was appended to the filename was "_eb1". It wasn't totally clear to me whether that was expected or not, but I naively expected "_dl1" instead of "_eb1".
#7 Updated by Kurt Biery almost 2 years ago
I tested the new version of DAQInterface in Partition 0 at protoDUNE. The runs were
7588 - baseline with existing DAQInterface
7589 - new DAQI, but I forgot to enable triggers
7590 - new DAQI
7591 - new DAQI
7592 - baseline after restoring the standard version of DAQI
For these tests, I needed to modify /nfs/sw/artdaq/DAQInterface/source_me0, and I restored the original version of the source_me0 file after the test.
This new version of DAQInterface merges the functionality of the mainline development of this application with the protoDUNE-specific version that had evolved for the beam running.
The tests showed that this new version appears to behave equivalently to the existing version at protoDUNE.
I confirmed that the saved configurations look identical (modulo timestamps and the no-longer-needed portOffset parameter that John mentioned).
So, I would say that these changes have been successfully validated.
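For anyone repeating the comparison of saved configurations, a check along the following lines may help. It is a sketch, not the procedure used in the tests above: the function name is hypothetical, and the pattern for lines to ignore (portOffset, plus whatever the timestamp lines look like) has to be supplied by the user.

```shell
# Hypothetical helper: succeed if two files are identical once lines
# matching an extended regex have been dropped from both.
# $1, $2: files to compare; $3: extended regex of lines to ignore
configs_match_modulo() {
    a=$(grep -E -v "$3" "$1")
    b=$(grep -E -v "$3" "$2")
    [ "$a" = "$b" ]
}
```

For example, "configs_match_modulo old.fcl new.fcl 'portOffset'" would treat two saved FHiCL documents as equal if they differ only in lines containing portOffset; old.fcl and new.fcl are placeholder names.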