Task #23177

Modify protoDUNE SP dataprep to use the decoder tool instead of reading digits from the event store

Added by David Adams about 2 months ago. Updated about 18 hours ago.

Status: Work in progress
Priority: Normal
Assignee:
Start date: 08/27/2019
Due date:
% Done: 0%
Estimated time:
Duration:

Description

Tom has created a tool that decodes protoDUNE raw data and returns it to the caller in the form of vector<raw::RawDigit>. Tingjun has requested that protoDUNE dataprep be modified to use this tool instead of reading digits from the event data store.

I am working on this.

History

#1 Updated by David Adams about 2 months ago

I have committed an update of DataPrepModule that provides the option to take raw digits and their status (RDStatus) from the tool instead of the data store. The configuration prolog producer_adcprep is replaced with producer_adcprep_notool that uses the data store (as before) and producer_adcprep_tool that uses the tool. I have verified that I can drop the TPC decoder producer if the latter configuration is used. The old name points back to the first configuration but we can change this when we decide we prefer the second.

#2 Updated by David Adams about 2 months ago

This seems to work when fetching all data before looping over APAs, but the memory usage is the same as before.

Tom indicates we should fetch one APA at a time.
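
The difference between the two fetch strategies can be sketched as below. This is a standalone illustration, not the dataprep code: Digit, decodeApa, peakDigitsFetchAll and peakDigitsByApa are hypothetical names standing in for raw::RawDigit and the decoder tool, and the layout of 6 APAs with 2560 channels each is assumed from the protoDUNE SP channel count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical stand-in for raw::RawDigit: the ADC samples for one channel.
struct Digit {
  unsigned int channel;
  std::vector<short> adcs;
};

// Assumed protoDUNE SP layout: 6 APAs x 2560 channels = 15360 channels.
constexpr unsigned int kNApa = 6;
constexpr unsigned int kChannelsPerApa = 2560;

// Hypothetical stand-in for the decoder tool: returns the digits for one APA.
std::vector<Digit> decodeApa(unsigned int apa, std::size_t nticks) {
  assert(apa < kNApa);
  std::vector<Digit> digits;
  digits.reserve(kChannelsPerApa);
  for (unsigned int i = 0; i < kChannelsPerApa; ++i) {
    digits.push_back(Digit{apa * kChannelsPerApa + i,
                           std::vector<short>(nticks, 0)});
  }
  return digits;
}

// Strategy 1: fetch every APA first, then process.
// Returns the peak number of digits held in memory at once.
std::size_t peakDigitsFetchAll(std::size_t nticks) {
  std::vector<Digit> all;
  for (unsigned int apa = 0; apa < kNApa; ++apa) {
    std::vector<Digit> part = decodeApa(apa, nticks);
    all.insert(all.end(),
               std::make_move_iterator(part.begin()),
               std::make_move_iterator(part.end()));
  }
  // ... processing of all channels would happen here ...
  return all.size();
}

// Strategy 2: fetch and process one APA at a time.
// The digits for an APA are released before the next APA is decoded.
std::size_t peakDigitsByApa(std::size_t nticks) {
  std::size_t peak = 0;
  for (unsigned int apa = 0; apa < kNApa; ++apa) {
    std::vector<Digit> digits = decodeApa(apa, nticks);
    // ... processing of this APA would happen here ...
    peak = std::max(peak, digits.size());
  }
  return peak;
}
```

In this sketch the first strategy holds the digits for all channels at once, while the second never holds more than one APA's worth. The real saving depends on what else the job keeps in memory.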

#3 Updated by David Adams 1 day ago

I have created a new module DataPrepByApaModule that processes one APA at a time. It also drops some of the extra baggage added to DataPrepModule. It does allow users to specify which channel groups or ranges to process. Use ChannelGroups: ["all"] for the full detector all at once or ChannelGroups: ["apas"] for the full detector, one APA at a time.

With that option and running dataprep with no tools, I see VM/RSS = 1.32/0.68 GB for the old module or the first option and 1.19/0.55 GB with the second option, i.e. a saving of 130 MB. The results are the same for 1 and 10 events.

I also looked at the time spent in the dataprep module to process a single APA, again with no tools. The old module (reading in all APAs) requires an average of 6.9 sec/event and the new one requires 1.4 sec/event. The latter number drops to 1.2 sec/event the second time I process the same batch of events. Presumably this is because that part of the file is cached in memory.

The new module is not yet pushed.

#4 Updated by David Adams about 23 hours ago

Correction: the new module is pushed.

It may not yet be correctly writing output containers. I am working on that.

I propose to write digits and time stamps for all channels in APAs that are decoded and to write wires for all channels that have those after processing. The names for the three containers are configurable and, if any is blank, the corresponding container is not written. Presumably, production will want to write time stamps and wires.

#5 Updated by Thomas Junk about 21 hours ago

Very good!

#6 Updated by David Adams about 19 hours ago

I have pushed some changes that should fix the writing of time stamps, digits and wires, all controlled separately with container names. I verified that when I process all but five channels, I get 15360 time stamps, 15360 digits and 15355 wires.
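
The expected counts follow from the output rule proposed earlier: time stamps and digits for every channel in a decoded APA, wires only for channels that were processed, and nothing for a container whose name is blank. A minimal sketch of that bookkeeping is below. OutputCounts and countOutputs are hypothetical names used only for illustration and the container names in the usage note are made up; this is not the module code.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Number of entries written to each output container.
struct OutputCounts {
  std::size_t timeStamps;
  std::size_t digits;
  std::size_t wires;
};

// Hypothetical helper: count entries for channels [0, nDecoded) when the
// channels in `skipped` are decoded but not processed.
// A blank container name means that container is not written.
OutputCounts countOutputs(unsigned int nDecoded,
                          const std::set<unsigned int>& skipped,
                          const std::string& timeStampName,
                          const std::string& digitName,
                          const std::string& wireName) {
  OutputCounts counts{0, 0, 0};
  for (unsigned int chan = 0; chan < nDecoded; ++chan) {
    // Time stamps and digits are written for every decoded channel.
    if (!timeStampName.empty()) ++counts.timeStamps;
    if (!digitName.empty()) ++counts.digits;
    // Wires exist only for channels that were actually processed.
    if (!wireName.empty() && skipped.count(chan) == 0) ++counts.wires;
  }
  // There can never be more wires than decoded channels.
  assert(counts.wires <= nDecoded);
  return counts;
}
```

Calling countOutputs(15360, {0, 1, 2, 3, 4}, "ts", "dig", "wire") gives 15360 time stamps, 15360 digits and 15355 wires, matching the numbers above. Passing an empty string for a name gives zero for that container.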

Tingjun or Christoph, I would like to run the CI tests with the new module. I think we need to replace producer_adcprep with producer_adcprep_byapa or I can change the definition of producer_apa in dataprep_dune.fcl. Should I do the latter? It is easy to switch back.

#7 Updated by Tingjun Yang about 18 hours ago

Hi David,

I would suggest making the change in
dunetpc/fcl/protodune/reco/protoDUNE_reco_data_Dec2018.fcl (line 41)

This will only change the configuration for data reco and leave MC reco unchanged (assuming this is what we want for now).

Thanks,
Tingjun


