Notes on configuration API and DB contents

Introduction

This page is intended to be a useful starting point for discussions on run and dataflow configurations in the artdaq-database package as part of protoDUNE milestone week 2 at CERN.

One thing that is useful to keep in mind is that different layers of software are possible (and desirable). At the lowest level, we have the specific DB implementations: FileSystemDB, MongoDB, and UConnDB. Above that, we have the existing artdaq-database API. On top of that, we have a possible "convenience layer" that would make access to the database system easier. And finally, we have the experiment-specific APIs at the highest level. This is illustrated in the diagram below:

[Figure: DB_API_Layers.png — the layers of the database software stack]

The API calls that are listed later on this page and discussed in various conversations may be implemented at different levels in the software stack, depending on how specific they are to a particular experiment and other factors.
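As a purely illustrative example of the layering, a convenience layer might expose a small experiment-neutral interface that hides which DB implementation is in use. The class and function names below are hypothetical, not part of the existing artdaq-database API; a minimal C++ sketch:

  #include <map>
  #include <string>
  #include <vector>

  // Hypothetical convenience-layer interface.  The backend (FileSystemDB,
  // MongoDB, or UConnDB) is chosen at construction time and hidden from
  // callers; experiment-specific APIs would be built on top of this.
  class ConfigDatabase {
  public:
    explicit ConfigDatabase(std::string const& backendURI);

    std::vector<std::string> listConfigurations(std::string const& searchString = "*");
    std::map<std::string, std::string> loadConfiguration(std::string const& name);
    bool storeConfiguration(std::string const& name,
                            std::map<std::string, std::string> const& documents);
  };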

Another thing to keep in mind is the different "types" of configuration data. Some examples:
  1. run configurations that are available to be used during data taking
    • users select one as part of configuring the detector and the electronics
  2. a historical archive of the run configurations that were used for each data-taking run
  3. a data set that specifies the components that are available to be used in data taking and other parameters that define the envelope of dataflow configuration possibilities
    • the operator interface uses this data to present options to the user and allow them to specify the components and the layout of processes in the system
  4. pre-defined dataflow configurations that can be selected by the user to specify the layout of processes in the system, etc.
  5. run history

Several Example Run Configurations

StandardRunning012 (configuration name)
  • "rce_standard.fcl" FHiCL configuration string
  • rce00 FHiCL configuration string
  • rce01 FHiCL configuration string
  • EventBuilder FHiCL configuration string
  • DataLogger FHiCL configuration string
  • ... # suppressed to save space

rce_emulation_mode005 (configuration name)

rce_burst_mode_test001 (configuration name)

Proposed Run Configuration Operations

  1. std::vector<std::string> getListOfAvailableRunConfigurationPrefixes(std::string searchString = "*");
    • returns: "StandardRunning", "rce_emulation_mode", "rce_burst_mode_test" for the configurations listed above
  2. std::map<std::string /*entityName*/, std::string /*FHiCL document*/> getLatestConfiguration(std::string const& configNamePrefix);
    • for now, this call will return a map with 5 entries for the StandardRunning012 configuration listed above, and it will be up to the caller to do the full expansion specified by the #include statements. The map keys in this case will be "rce_standard.fcl", "rce00", "rce01", "EventBuilder", and "DataLogger".
    • the entity names here are typically related to the logical components that can be included in the DAQ, like "rce00", "EventBuilder", and "DataLogger"
  3. bool /* status */ archiveConfiguration(int run_number, std::string configuration_name, std::map<std::string, std::string>);
    • an example call is included in the sketch after this list
    • the entity names (map keys) here will be different than what is returned in the "getLatestConfig" call above. For example, the ones here might be "BoardReader01" and "EventBuilder01", or "BoardReader::HostA::Port123" and "EventBuilder::HostB::Port456".
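
To make the intended usage concrete, here is a sketch of how run control might exercise these three proposed Operations. The placeholder implementations, the run number, and the specific entity names are assumptions for illustration only:

  #include <map>
  #include <string>
  #include <vector>

  // Placeholder implementations of the proposed Operations, standing in for
  // whatever layer of the software stack ultimately provides them.
  std::vector<std::string> getListOfAvailableRunConfigurationPrefixes(std::string /*searchString*/ = "*") {
    return {"StandardRunning", "rce_emulation_mode", "rce_burst_mode_test"};
  }

  std::map<std::string, std::string> getLatestConfiguration(std::string const& /*configNamePrefix*/) {
    return {{"rce_standard.fcl", "..."}, {"rce00", "..."}, {"rce01", "..."},
            {"EventBuilder", "..."}, {"DataLogger", "..."}};
  }

  bool archiveConfiguration(int /*run_number*/, std::string /*configuration_name*/,
                            std::map<std::string, std::string> /*documents*/) {
    return true;
  }

  int main() {
    // 1. show the operator the available configuration families
    auto prefixes = getListOfAvailableRunConfigurationPrefixes();
    (void)prefixes;  // presented to the operator for selection

    // 2. fetch the newest version of the family the operator picked
    auto docs = getLatestConfiguration("StandardRunning");

    // 3. once the run is under way, archive what was actually used; note the
    //    per-process entity names, which differ from the logical names above
    std::map<std::string, std::string> used;
    used["BoardReader01"]  = docs["rce00"];        // fully #include-expanded in practice
    used["EventBuilder01"] = docs["EventBuilder"];
    return archiveConfiguration(12345, "StandardRunning012", used) ? 0 : 1;
  }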

Dataflow Configurations

We want to support Operations like the following for dataflow configuration parameters:

  1. std::vector<std::string> getAvailableReadoutComponents()
    • returns a list like "rce00", "rce01", "ssp00", "ssp03", "timing_board", ...
  2. <container?> getEventBuildingPossibilities()
  3. <container?> getPossibleDataLoggingLocations()
  4. std::vector<std::string> getListOfAvailableDataFlowConfigurationPrefixes(std::string searchString = "*");
  5. bool /* status */ saveDataflowConfiguration(std::string configuration_name, std::map<std::string, std::string>);
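
As with the run configuration Operations, a brief sketch may help. Everything below (the stub bodies, the map key, the configuration name) is an assumption for illustration:

  #include <map>
  #include <string>
  #include <vector>

  // Placeholder implementations of two of the proposed Operations.
  std::vector<std::string> getAvailableReadoutComponents() {
    return {"rce00", "rce01", "ssp00", "ssp03", "timing_board"};
  }

  bool saveDataflowConfiguration(std::string /*configuration_name*/,
                                 std::map<std::string, std::string> /*documents*/) {
    return true;  // placeholder
  }

  // Sketch of an operator-interface flow: offer the available components,
  // let the user choose a subset and a process layout, then save the result
  // under a name so it can be reused later.
  bool configureDataflow() {
    std::vector<std::string> components = getAvailableReadoutComponents();
    (void)components;  // presented to the user by the operator interface
    std::map<std::string, std::string> selected;
    selected["dataflow"] = "...";  // FHiCL string describing the chosen layout
    return saveDataflowConfiguration("standard_system_for_msw2_001", selected);
  }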
Some notes:
  1. we are proposing to store the parameters that make up the dataflow possibilities as a distinct FHiCL string in the database. This can be thought of as a separate DB record that has well-defined identifiers to mark it as the one that contains the dataflow possibilities.
  2. distinct from the dataflow possibilities are previously-defined system configurations which might be useful again in the future. These can be named and saved in the database. One proposal is that each selected dataflow configuration does not need to be saved as a named dataflow configuration, but this can certainly be done. In any case, the dataflow configuration for each run will be stored in the run history.
    • for example, there could be a saved dataflow configuration named "standard_system_for_msw2_001" with contents along the lines of the sketch shown after this list
  3. we need to work out how to distinguish available run configurations from available dataflow configurations in the database
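For example, the "standard_system_for_msw2_001" configuration mentioned in the note above might contain something along the lines of the following FHiCL sketch. The parameter names and structure are assumptions, since the actual schema remains to be worked out; the hosts and ports echo the examples given earlier on this page:

  # hypothetical contents of a saved dataflow configuration
  dataflow_configuration: {
    name: "standard_system_for_msw2_001"
    readout_components: [ "rce00", "rce01", "ssp00" ]
    event_builders: [ { host: "HostB" port: 456 } ]
    data_loggers:   [ { host: "HostC" port: 789 } ]
  }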
Questions to be discussed in the next few weeks:
  1. what language(s) should be supported for the Operations listed above? C++? Python? Bash?
  2. what other parameters might be useful to include in the dataflow configuration?

Run History

We should capture all ideas that people have for parameters that should be stored in run history records. Obvious ones are run number, run start time, run end time, dataflow configuration used, and run configuration used.

One possibility for storing run history in the artdaq-database is to put all of the information for a run in a single FHiCL string. In the sample sketched below, it is assumed that each dataflow configuration that is used is automatically stored in the DB so that it can be referenced here.
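
A hedged sketch of such a record follows; every field name and value is an illustrative placeholder, not an agreed-upon schema:

  # hypothetical run history record, stored as a single FHiCL string
  run_history: {
    run_number: 12345
    run_start_time: "YYYY-MM-DDThh:mm:ss"
    run_end_time:   "YYYY-MM-DDThh:mm:ss"
    run_configuration: "StandardRunning012"
    # the dataflow configuration used is stored automatically in the DB
    # and referenced here by name
    dataflow_configuration: "standard_system_for_msw2_001"
  }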

Another option is to use a separate database record for each run.