Feature #22810

It would be helpful for FragmentGenerators to have access to Requests

Added by Kurt Biery 6 months ago. Updated 18 days ago.

Status: Closed
Priority: Normal
Category: -
Target version:
Start date: 06/25/2019
Due date:
% Done: 0%
Estimated time:
Experiment: -
Co-Assignees:
Duration:

Description

Currently, the data requests that are received by BoardReaders are handled by core BR code (in CommandableFragmentGenerator).

Folks on protoDUNE/DUNE have requested that FragmentGenerators have access to data requests. This would be helpful in cases in which it is not practical to return all of the data that is received from the hardware in getNext_() calls. (For example, in the protoDUNE FELIX FragmentGenerator, there isn't enough CPU power to compress all of the data coming from the FELIX card, so the FragGen only compresses the data of interest. Currently there is a back-door way to get those windows of interest to the FragGen, and it would be nice to switch to a way of doing that which is more officially sanctioned/supported.)

Several of us have discussed this a bit. Eric suggested a protected method in CFG that returns a copy of the pending requests. In talking with Roland and Phil at protoDUNE, they requested that we have two methods, one which returns the full list of pending requests (and may contain duplicates between calls) and one which returns a single "next" request (and would not return duplicates). Something like getRequests() and getNextRequest().
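To make the difference between the two proposed methods concrete, here is a minimal sketch. The class RequestStore, its addRequest() method, and the sequence-ID-to-timestamp map layout are hypothetical stand-ins for illustration, not the artdaq interface; the choice to hand out the lowest not-yet-returned sequence ID first is also an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <set>
#include <utility>

// Hypothetical stand-in for the request bookkeeping; not the artdaq API.
class RequestStore {
public:
    using SequenceID = std::uint64_t;
    using Timestamp = std::uint64_t;

    // Record a request as it arrives (sequence ID -> timestamp).
    void addRequest(SequenceID seq, Timestamp ts) { pending_[seq] = ts; }

    // Full list: returns a copy of everything still pending, so repeated
    // calls may contain the same entries again.
    std::map<SequenceID, Timestamp> getRequests() const { return pending_; }

    // Single "next" request: returns one request that no earlier
    // getNextRequest() call has handed out, or nullopt if there is none.
    std::optional<std::pair<SequenceID, Timestamp>> getNextRequest() {
        for (const auto& entry : pending_) {
            if (returned_.count(entry.first) == 0) {
                returned_.insert(entry.first);
                return std::make_pair(entry.first, entry.second);
            }
        }
        return std::nullopt;
    }

private:
    std::map<SequenceID, Timestamp> pending_;
    std::set<SequenceID> returned_;  // IDs already handed out by getNextRequest()
};
```

With two requests stored, getRequests() would return both on every call, while getNextRequest() would return each of them once and then nothing.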

History

#1 Updated by Eric Flumerfelt about 1 month ago

  • Assignee set to Eric Flumerfelt
  • Status changed from New to Resolved

Implementation on artdaq:feature/22810_CFG_getRequestReceiver

#2 Updated by Kurt Biery 28 days ago

Eric, John,
I'd like to test this new functionality in the artdaq-demo.

One way to do that would be to add a "lazy" mode to the existing ToySimulator_generator in artdaq_demo. The idea behind that mode would be to only return fragments from getNext_() once requests have been received. A good double-check would be to only enable that functionality if the ToySimulator is in pull mode.

Another option would be to create a new generator class in the demo. Something like LazyToySimulator_generator.

Thoughts?
Thanks,
Kurt

#3 Updated by John Freeman 26 days ago

Two thoughts:

1) I like the idea of a lazy mode for ToySimulator - we could just have some boolean which defaults to false, but which can be set to true in the FHiCL. If the boolean's true, then at the top of getNext_() - which already knows the sequence ID it'll assign the fragment it would create - there could be a periodic call to getRequests to see if it returns a request for the sequence ID in question. If the first request which appears concerns a higher sequence ID than the one getNext_ is sitting on, getNext_ could return having created no fragments.
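A rough sketch of that lazy-mode check might look like the following. LazyToyGenerator, ToyFragment, and getNext() are hypothetical names; the pending requests are passed in as a plain map rather than fetched through the real getRequests call, and the sketch deliberately leaves open whether the generator should ever skip past a sequence ID that is never requested.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using SequenceID = std::uint64_t;
using Timestamp = std::uint64_t;

// Hypothetical placeholder for a real fragment.
struct ToyFragment {
    SequenceID sequence_id;
};

// Hypothetical generator illustrating the lazy_mode boolean.
class LazyToyGenerator {
public:
    explicit LazyToyGenerator(bool lazy_mode) : lazy_mode_(lazy_mode) {}

    // Mimics the shape of getNext_(): appends zero or one fragment to `out`
    // and returns true so the caller keeps polling.
    bool getNext(const std::map<SequenceID, Timestamp>& requests,
                 std::vector<ToyFragment>& out) {
        // In lazy mode, create nothing unless the sequence ID we are
        // sitting on has actually been requested.
        if (lazy_mode_ && requests.find(next_sequence_id_) == requests.end()) {
            return true;
        }
        out.push_back(ToyFragment{next_sequence_id_});
        ++next_sequence_id_;
        return true;
    }

private:
    bool lazy_mode_;
    SequenceID next_sequence_id_ = 1;  // assumed starting sequence ID
};
```

With lazy_mode false, every call produces a fragment, which corresponds to the existing behavior that the boolean's default is meant to preserve.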

2) I realize that we're supposed to keep the customer happy, but having a function "getNextRequest()" which literally just returns the first entry of the container returned by "getRequests" seems like a classic case of interface bloat.

#4 Updated by Eric Flumerfelt 26 days ago

For your second point, I'd like to point out that GetRequests does a deep copy of the requests map, so there is an inefficiency in calling GetRequests().begin() versus the current implementation of GetNextRequest() (though I imagine that in most cases this inefficiency would be vanishingly small).

#5 Updated by Kurt Biery 26 days ago

Regarding John's second point,
getNextRequest is supposed to be more substantial than just returning the latest request. It's supposed to return the most recent request that hasn't already been returned by an earlier getNextRequest call. So, there would be some real work in that method...

#6 Updated by Eric Flumerfelt 26 days ago

  • Status changed from Resolved to Assigned

#7 Updated by Eric Flumerfelt 26 days ago

  • Status changed from Assigned to Resolved

I have implemented new logic for GetNextRequest which has Kurt's requested functionality.

#8 Updated by John Freeman 25 days ago

I've created an artdaq-demo branch with the same name as the artdaq feature branch, feature/22810_CFG_getRequestReceiver. On it, the ToySimulator has a lazy_mode FHiCL parameter which defaults to false, but when set to true, its getNext_ function only creates an artdaq fragment if the timestamp of the data matches the timestamp of the most recent request.

#9 Updated by Kurt Biery 20 days ago

As discussed in email, I've provided some modified code in ToySimulator to test GetNextRequest, and I've added some code that tests GetRequests (there is a preprocessor parameter that chooses which one gets run).
These changes have been committed to the feature/22810_CFG_getRequestReceiver branch in the artdaq-demo repo.

To test the functionality of the original code changes (in artdaq) and the changes in artdaq-demo to help test this, I've focused on the metrics that report the buffer occupancy in various BoardReaders when running some of our simple_test_configs. I'll capture information on those tests in later entries in this Issue.

#10 Updated by Kurt Biery 20 days ago

To test these changes, I used an artdaq-demo software area that was based on the head of the develop branches (as of late last week) of the various artdaq packages with e17, s83. In addition,
  • the artdaq_core product_deps file was modified to make the version of artdaq_core appear to be v3_05_07 instead of v3_05_08
  • the artdaq code was from the head of the feature/22810_CFG_getRequestReceiver branch, as of today
  • the artdaq_demo code was from the head of the feature/22810_CFG_getRequestReceiver branch, as of today

The artdaq-utilities-daqinterface software that was used was also from the head of its develop branch, as of today.

I first used the request_based_dataflow_example simple_test_config...
  1. I modified the component01, 02, and 03 config files to make use of the "usecs_between_sends" param instead of the "throttle_usecs" param since the former is more robust. I kept the specified time amounts the same, though (10 Hz for component01, 1 Hz for component02, and 50 Hz for component03)
  2. for reference, the timestamp_scale_factor parameters for component01/02/03 were set to 5, 1, and 1.
I then ran the demo using the following command:
  • sh ./run_demo.sh --config request_based_dataflow_example --bootfile `pwd`/artdaq-utilities-daqinterface/simple_test_config/request_based_dataflow_example/boot.txt --comps component01 component03 --runduration 60 --no_om --partition 5
  • Note that I only included components 01 and 03

In this run, the daqlogs/metrics/boardreader log files for component03 showed between 1 and 4 buffers used during the course of the 60-second run.

I then modified component03.fcl so that request_window_offset and _width were both set to zero (instead of 2 and 5, respectively).

In the run with that configuration (using the same run_demo command as before), there were reports of between 1 and 999 buffers used in the component03 metrics file. In addition, there was one "bad omen, buffer is full" message shown in the logfile and TRACE log.
(I'm not sure why the non-zero values for the request_window_offset/width caused the data buffer to be cleaned out more regularly.)

I then modified component03.fcl so that "lazy_mode: true" was included.
For the next test run, the number of used buffers reported in the component03 metrics file was always zero. (The raw data file from this run, and the previous one, had the same size, so I was reassured that this run actually took data.)
I took the absence of buffers being used in this run as a good sign, that is, a sign that the FragmentGenerator was only returning data that would be requested by the EventBuilder.

I also looked at the TRACE log for these runs. I had enabled bit 51 for "ToySimulator" and bits 15, 17, and 20 in component03_CommandableFragmentGenerator.
From the last run, I saw the d51 messages from ToySimulator saying that requests had been received.
From the last run, I saw the "d20 . getDataLoop: getNext_()" messages from CommandableFragmentGenerator saying that a fragment for every 5th timestamp had been returned.
For the previous run, I saw the "d20 . getDataLoop: getNext_()" messages from CommandableFragmentGenerator saying that a fragment for every timestamp had been returned.
This is as expected. With the lazy_mode turned on in the last run, getNext was only supposed to return fragments for the events/timestamps that had been requested. And, only every 5th timestamp was being requested, because those were the only timestamps that were being pushed from component01 to the EventBuilder.
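The "every 5th timestamp" ratio can be checked with a little arithmetic. The sketch below assumes that each generator's timestamp advances by its timestamp_scale_factor per fragment, starting from the scale factor itself; that model and the helper names are assumptions for illustration, not taken from the ToySimulator code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

// Timestamps produced over `seconds` at `rate_hz`, assuming the timestamp
// advances by `scale_factor` per fragment (an assumption of this sketch).
std::set<std::uint64_t> producedTimestamps(unsigned rate_hz,
                                           unsigned scale_factor,
                                           unsigned seconds) {
    std::set<std::uint64_t> out;
    const std::uint64_t n = static_cast<std::uint64_t>(rate_hz) * seconds;
    for (std::uint64_t i = 1; i <= n; ++i) {
        out.insert(i * scale_factor);
    }
    return out;
}

// Number of available timestamps that were also requested.
std::size_t countRequested(const std::set<std::uint64_t>& available,
                           const std::set<std::uint64_t>& requested) {
    std::size_t matched = 0;
    for (std::uint64_t ts : available) {
        matched += requested.count(ts);
    }
    return matched;
}
```

Under that model, component01 (10 Hz, scale factor 5) and component03 (50 Hz, scale factor 1) both advance by 50 timestamp units per second, so over 60 seconds component03 has 3000 timestamps available and 600 of them match a timestamp from component01.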

#11 Updated by Kurt Biery 20 days ago

I should have said that the tests in the previous post were done with the code in ToySimulator_generator that tested the GetNextRequest method.

I also ran a series of tests in which I modified ToySimulator_generator to call the GetRequests method, and those tests also showed that the GetRequests functionality allowed the ToySimulator fragment generator to only return fragments that had been requested by the EventBuilder.

#12 Updated by Eric Flumerfelt 20 days ago

I have done a code review for the artdaq-demo branch. Only issue was that the lazily_handled_requests_ set should be cleared at the start of the run, otherwise LAZY_MODEL != 0 will not work for multiple runs.

#13 Updated by Kurt Biery 20 days ago

Excellent, thanks!

#14 Updated by Eric Flumerfelt 20 days ago

  • Status changed from Resolved to Reviewed

The buffer filled when the offset and width were zeroed because unrequested Fragments remained in the buffer. Circular buffer mode or timeout parameters can remedy this issue.

There are two main items that Fragment generator authors will have to be very aware of in the current implementation:
1. GetNextRequest may behave unpredictably and/or miss requests when requests arrive out-of-order. The implementation should periodically call GetRequests and ensure that no stale requests are going unanswered.
2. GetNextRequest and GetRequests only return the timestamp of the request. Implementations are currently responsible for ensuring that they return data corresponding to the {timestamp - offset, timestamp - offset + width} window if the FG is in Window mode.
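As an illustration of both items, here is a small sketch of helpers that a generator author might write. RequestWindow, windowFor(), and staleRequests() are hypothetical names, the clamping at zero is an assumption made to avoid unsigned underflow, and the full request list is passed in as a plain map rather than obtained from the real GetRequests call.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <set>
#include <vector>

// Hypothetical holder for the bounds of a request window.
struct RequestWindow {
    std::uint64_t begin;
    std::uint64_t end;
};

// Computes {timestamp - offset, timestamp - offset + width}. Both bounds are
// clamped at zero (an assumption) so an offset larger than the timestamp
// does not wrap around.
RequestWindow windowFor(std::uint64_t timestamp, std::uint64_t offset,
                        std::uint64_t width) {
    const std::uint64_t begin = timestamp > offset ? timestamp - offset : 0;
    const std::uint64_t end =
        timestamp + width > offset ? timestamp + width - offset : 0;
    return RequestWindow{begin, end};
}

// Given the full pending list (sequence ID -> timestamp) and the sequence IDs
// already answered, returns the IDs that are still waiting for data.
std::vector<std::uint64_t> staleRequests(
    const std::map<std::uint64_t, std::uint64_t>& all_requests,
    const std::set<std::uint64_t>& handled) {
    std::vector<std::uint64_t> stale;
    for (const auto& entry : all_requests) {
        if (handled.count(entry.first) == 0) {
            stale.push_back(entry.first);
        }
    }
    return stale;
}
```

For the original component03 settings (offset 2, width 5), a request with timestamp 100 would map to the bounds 98 and 103; whether the upper bound is treated as inclusive is not specified here and is left to the implementation.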

#15 Updated by Eric Flumerfelt 18 days ago

  • Target version set to artdaq v3_07_00
  • Status changed from Reviewed to Closed

