Notes from May 2011 CD1 Design Review

09-May-2011, KAB - Here are some notes from the review that took place last week (03-May thru 05-May).

The official review web site is here.

Slides from the DAQ breakout talks are documents 1532, 1533, and 1546 in the Mu2e DocDB.

Kurt's notes from the DAQ breakout (added 20-May):
  1. There were questions on how slow controls for the beamline (presumably the portion that is part of the experiment) will be handled. For example, how will beam tuning within the experiment be done?
  2. Questions were asked about how the system will deal with situations in which the data rate is too high for the DAQ to keep up. [In my opinion, it would be great to define this behavior early.] What happens when front-end buffers fill up?
    • if we run without zero suppression turned on, how badly would that clog the system?
  3. We should think about how partitioning will be handled in each of the architecture options. For example, if event building is done in FPGAs, what granularity could be used for partitioning?
  4. Jonathan suggested that we design in network security from the beginning. This involves tradeoffs between remote access and network isolation; one option is a dedicated gateway node that has no other responsibilities within the system.
  5. Possible risks to consider: FPGAs will get better; hardware selected early may face end-of-life issues earlier than desirable in the lifetime of the experiment.
  6. What calibrations will be needed? What protocols will be used to read out the data for them? [We should document what we know of so far.] Pedestals?
  7. What is the offline requirement for the sequence of events in the files that are written?
  8. It would be great to have a way to query FPGAs for the version of the firmware that they are running.
  9. Possible risks: incomplete documentation for finished products; the reconstruction code can't meet the necessary rejection with the provided processor power; disruptions caused by operating system upgrades.
    • the online trigger software system could be designed from the beginning to be a stripped-down version of the offline code; runtime switches could be used to select between modes.
    • when new OS releases are deployed, how will we validate that the physics results of the reconstruction code don't change?
  10. I think that it would be nice to have a document in DocDB that captures as much of the DAQ software plan as we know at this point. (In principle, this information is captured in the WBS dictionary, but that doesn't seem very accessible.) In preparing the slides for this review, I realized that some of the initial assignments of work to a given WBS component and phase may no longer be correct. In some cases, we're thinking of doing various tasks earlier or later than originally thought.
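To make the question in item 2 concrete, here is a minimal Python sketch of three behaviors a front-end buffer could have when it fills up: refuse the new fragment, overwrite the oldest one, or raise a busy flag so that the sender holds off. The class name, the policy names, and the capacity are all hypothetical and are only meant to help frame the discussion; nothing here reflects an actual design choice.

```python
from collections import deque


class FrontEndBuffer:
    """Toy fixed-capacity buffer (hypothetical; for discussion only)."""

    POLICIES = ("drop_newest", "drop_oldest", "busy")

    def __init__(self, capacity, policy="drop_newest"):
        if policy not in self.POLICIES:
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.fragments = deque()
        self.lost = 0       # fragments discarded by the buffer itself
        self.busy = False   # set when the sender is asked to hold off

    def push(self, fragment):
        """Try to store a fragment; return True if it was stored."""
        if len(self.fragments) < self.capacity:
            self.fragments.append(fragment)
            return True
        if self.policy == "drop_oldest":
            # Make room by discarding the oldest stored fragment.
            self.fragments.popleft()
            self.fragments.append(fragment)
            self.lost += 1
            return True
        if self.policy == "busy":
            # Nothing is discarded here; the sender must keep the fragment.
            self.busy = True
            return False
        # drop_newest: the incoming fragment is discarded.
        self.lost += 1
        return False

    def pop(self):
        """Remove and return the oldest fragment, clearing the busy flag."""
        fragment = self.fragments.popleft()
        self.busy = False
        return fragment
```

Whichever behavior is chosen, the point of the sketch is that data loss (or dead time, in the busy case) becomes something that can be counted and reported rather than something that happens silently.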
The official review closeout had the following recommendations for Trigger and DAQ:
  1. We should quantify the reduction factor that is needed in the online software trigger. (As part of this, I think that we should define what we mean by an "event", both before and after the software trigger has been run on the data. And, when we quantify the reduction factor in the software trigger, we should be clear on whether the needed reduction is in terms of events, data volume, both, or some other parameter.)
  2. We should define the potential (software) trigger algorithms and estimate the performance and computing requirements for each.
  3. We should define the interface between the experiment slow controls system and the beam-line monitoring components.
  4. We should review all data-taking modes (calibration, non-zero-suppressed, etc.) to confirm that the planned DAQ architecture can collect all data formats at the required rates. (What beam-on and beam-off calibrations are planned? What form would an "event" take in the case of a beam-off laser calibration run?)
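As a small illustration of the point in recommendation 1 about events versus data volume, the following Python sketch computes both reduction factors for one made-up case. All of the numbers are hypothetical, as is the assumption that accepted events are written out larger than they came in (for example, because trigger results are attached to them); the function name is also just for illustration.

```python
def reduction_factors(events_in, events_out, bytes_in, bytes_out):
    """Return (event reduction factor, data-volume reduction factor)."""
    if events_out <= 0 or bytes_out <= 0:
        raise ValueError("output counts must be positive")
    return events_in / events_out, bytes_in / bytes_out


# Hypothetical example: 1000 input events of 100 kB each. The software
# trigger keeps 10 of them, and each kept event is written out at 200 kB.
event_rf, volume_rf = reduction_factors(
    events_in=1000,
    events_out=10,
    bytes_in=1000 * 100_000,
    bytes_out=10 * 200_000,
)
print(f"event reduction factor:  {event_rf:.0f}")
print(f"volume reduction factor: {volume_rf:.0f}")
```

In this made-up case the trigger rejects events by a factor of 100 but reduces the data volume by only a factor of 50, which is why the requirement should say which quantity it refers to.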

In a private conversation after the review, Leon expressed some concern about doing the event building with FPGAs. One concern was that detector components that are added to the plan late in the construction period might be more easily incorporated into the DAQ if the event building is done in software.