Notes regarding design of the "mixing module"

We are considering making the "mixing module" be a Producer that can read several "extra" files simultaneously.

If we make the "mixing module" a Producer, we see these ramifications:

  • the products read from the "extra" files and put into the event are labeled as having been created by the mixing module
  • we can't make a general mixing module, able to read all types of product; the mixing module has to declare the types it is to put into the event at the time of its construction.
  • we have to deal with the rigidity of the branch structure created by a module; it has to make an entry into each branch it writes for each event.
  • instance labels can be used to distinguish between different instances of products read from different files.
  • Rob needs the ability to read N events from the extra file and produce a single product from it.
  • Rob needs a "merge information product" for each event, which describes how the different files were read to create a specific event.

Rob's data products:

  GenT <- SimT   Hit1 HitMCT1 Hit2 HitMCT2
           ^              |            |
           |              |            |
            ---------------------------

Tentative Design Proposal

  • Each producer is responsible for handling a single sequence of files, configured via a ParameterSet providing a sequence of file names.
  • We supply a producer template, with one template parameter denoting a user-provided type that satisfies the following requirements:
    • Its c'tor will be handed a ParameterSet.
    • It must have a function produces() that takes a ProductRegistryHelper and calls its produces() once per product to be produced.
    • It must have a function numEventsToRead(), returning a size_t, to be called once on each primary event, giving the number of events to be read from the associated set of files.
  • We assume that we are responsible for reading (from the secondary files) every instance of the product types specified in the ProductRegistryHelper::produces() calls.
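
To make the requirements on the user-provided type concrete, here is a minimal sketch of how the producer template and one conforming type might fit together. It is not framework code: ParameterSet and ProductRegistryHelper are reduced to hypothetical stand-ins (product types are plain strings rather than C++ types), and the names MixProducer, FixedCountDetail, eventsPerPrimary and beginPrimaryEvent are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the framework's ParameterSet.
struct ParameterSet {
  std::vector<std::string> fileNames;   // the sequence of secondary files
  std::size_t eventsPerPrimary = 1;     // assumed parameter, for this sketch only
};

// Hypothetical stand-in for ProductRegistryHelper: it only records
// (type name, instance name) pairs for each produces() call.
class ProductRegistryHelper {
public:
  void produces(std::string typeName, std::string instanceName = "") {
    declared_.emplace_back(std::move(typeName), std::move(instanceName));
  }
  std::vector<std::pair<std::string, std::string>> const& declared() const {
    return declared_;
  }
private:
  std::vector<std::pair<std::string, std::string>> declared_;
};

// Example user-provided type satisfying the three requirements above.
class FixedCountDetail {
public:
  explicit FixedCountDetail(ParameterSet const& ps) : n_(ps.eventsPerPrimary) {}

  // Calls the helper's produces() once per product to be produced.
  void produces(ProductRegistryHelper& helper) const {
    helper.produces("GenT");
    helper.produces("SimT");
  }

  // Number of secondary events to read for the current primary event.
  std::size_t numEventsToRead() const { return n_; }

private:
  std::size_t n_;
};

// Sketch of the producer template: one instance handles one sequence of files.
template <typename Detail>
class MixProducer {
public:
  explicit MixProducer(ParameterSet const& ps)
    : detail_(ps), files_(ps.fileNames) {
    assert(!files_.empty());      // assumption: at least one secondary file
    detail_.produces(registry_);  // product types are fixed at construction
  }

  // Would be called once per primary event; returns how many secondary
  // events the real module would then read from its files.
  std::size_t beginPrimaryEvent() { return detail_.numEventsToRead(); }

  ProductRegistryHelper const& registry() const { return registry_; }
  std::vector<std::string> const& files() const { return files_; }

private:
  Detail detail_;
  std::vector<std::string> files_;
  ProductRegistryHelper registry_;
};
```

Constructing MixProducer<FixedCountDetail> from a ParameterSet with two file names and eventsPerPrimary set to 3 would record two declared products, and beginPrimaryEvent() would return 3 on every primary event.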

For a small increase in complexity over the above, we can also cater to the following scenario: products sharing a type and module label are concatenated separately. (We would use the ProductInstanceName specified in the produces() call to determine which ModuleLabel is to be sought in the secondary files.)
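
The concatenation scenario could look roughly like the sketch below. It assumes, purely for illustration, that the products of one type in a secondary event can be viewed as a map from module label to a vector of elements; ProductsByLabel and concatenate are hypothetical names, not framework API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical view of one secondary event: all products of element type T,
// keyed by the module label that created them.
template <typename T>
using ProductsByLabel = std::map<std::string, std::vector<T>>;

// Concatenate, across all secondary events read for one primary event,
// only those products whose module label matches moduleLabel.
// In the proposal, moduleLabel would be taken from the instance name
// given in the produces() call.
template <typename T>
std::vector<T> concatenate(std::vector<ProductsByLabel<T>> const& secondaryEvents,
                           std::string const& moduleLabel) {
  std::vector<T> result;
  for (auto const& event : secondaryEvents) {
    auto const it = event.find(moduleLabel);
    if (it == event.end()) {
      continue;  // this secondary event has no product with that label
    }
    result.insert(result.end(), it->second.begin(), it->second.end());
  }
  return result;
}
```

Products with other module labels are left alone, so each label yields its own concatenated product in the primary event.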

Questions:

  1. How are the products to be read from the extra file specified? Can we use the combination of (product type/module label/instance name) that is used in Event::getByLabel?
  2. How do you wish to specify the "event access pattern"? Do you need random access, or is sequential access from a particular offset into the file sufficient? Do you need to be able to read 1 event from N files, M events per N files, or M events from a single file (per event from the primary source)?
  3. Could you please clarify the exact products you wish to read from each secondary event in the stated example (GenT, SimT, et al.) and whether they will be combined into the existing primary event or added as distinct entities in each case?

Answers from Rob:

  1. I think that the module code that I write will know that. I envisage that my module will make the appropriate calls to tell you what it wants. I don't see this as a configuration issue.
  2. One event from the primary file, sequential access. For the other files, random access: for each event in the primary file, n1 events from file 1, n2 events from file 2 and so on. Here n1 may be fixed for the job or may be drawn from a Poisson distribution on each event; and so on for n2, n3 ...
  3. In the above example, the primary event produces 6 data products. The merge module will produce 6 more data products, in parallel with those from the primary event; they will have the same data types and instance names but differ in the module name. In addition, the merge module will create a 7th product that describes its own action (for example, the values of n1, n2, n3, ...).
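
Answers 2 and 3 suggest a per-event step that chooses n1, n2, ... and records them in the extra product. A minimal sketch follows, assuming a per-file setting that is either a fixed count or a Poisson mean; FileMixConfig, MergeInfo and drawCounts are hypothetical names, and the random engine is whatever the caller supplies.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical per-file setting: either a fixed count for the whole job
// or a Poisson mean from which a count is drawn on each primary event.
struct FileMixConfig {
  bool usePoisson = false;
  std::size_t fixedCount = 0;
  double poissonMean = 0.0;
};

// Hypothetical "merge information product": the n1, n2, ... used for one event.
struct MergeInfo {
  std::vector<std::size_t> eventsRead;
};

// Decide, for one primary event, how many events to read from each file.
template <typename Engine>
MergeInfo drawCounts(std::vector<FileMixConfig> const& configs, Engine& engine) {
  MergeInfo info;
  info.eventsRead.reserve(configs.size());
  for (auto const& cfg : configs) {
    if (cfg.usePoisson) {
      assert(cfg.poissonMean > 0.0);  // std::poisson_distribution needs mean > 0
      std::poisson_distribution<unsigned long> dist(cfg.poissonMean);
      info.eventsRead.push_back(static_cast<std::size_t>(dist(engine)));
    } else {
      info.eventsRead.push_back(cfg.fixedCount);
    }
  }
  return info;
}
```

The returned MergeInfo is what the merge module would put into the event as its 7th product; seeding the engine (for example a std::mt19937) per job keeps the draws reproducible.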