Task #23726

Dataprep update

Added by David Adams 7 months ago. Updated 4 months ago.

Work in progress

The dataprep run in production has not kept up with some of the changes I have been using in my studies, and I would like to update it.

These include:
1. Add the new bad and sticky codes shown at the last couple DRA meetings
2. Switch to the new tail removal
3. Ignore flagged sticky codes in the pedestal finder.

I put these all in one ticket because I would like to change one at a time and verify with the CI testing before proceeding to the next.


#1 Updated by David Adams 7 months ago

  • Assignee set to David Adams
  • Status changed from New to Work in progress

I start with the bad channels. The 17 bad channels shown at the Nov 27 DRA meeting are already in dunetpc. Before modifying the prolog channelstatus_pdsp in dune/Protodune/singlephase/fcl/channelstatus_pdsp.fcl, I copied it to pdsp_channel_status_2018 in pdsp_channel_status_2018.fcl in case we want to go back to the old set.

Christoph reported a change between dunetpc revisions 9825e61b and 3f8748a that is presumably due to the new bad channels. We get 0.38% fewer hits in the data CI test presumably because we filter out bad channels prior to wirecell processing.

#2 Updated by David Adams 7 months ago

I have just pushed another bad channel update corresponding to my Dec 11 presentation. Here is a log snippet:

%MSG-i SimpleChannelStatusService:  Early 27-Dec-2019 12:06:07 CST JobSetup
Loaded from configuration:
  - 156 bad channels
  - 39 noisy channels
  - largest channel ID: 15359, largest present: 15359

As before, I would expect small changes in the wire containers in our test jobs. Christoph, please confirm that you see this. Thank you.

These are all the bad channels I know of at present.

#3 Updated by Christoph Alt 7 months ago

Yes, the datareco_protoDUNESP CI test does report a change in the hit collection size between dunetpc revisions d69892ff and 5a96a6ef:

1252: < DecoderandReco | gaushit |  | std::vector<recob::Hit> | 46923

1256: > DecoderandReco | gaushit |  | std::vector<recob::Hit> | 46905

Full log:

#4 Updated by David Adams 7 months ago

Thanks for the report. That small decrease sounds right.
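
As a quick cross-check of the size of this change, the two CI lines quoted above can be compared with a short script. This is only a minimal sketch: it assumes the pipe-separated layout shown in the snippet, and the helper names hit_count and relative_change are hypothetical, not part of the CI tooling.

```python
def hit_count(ci_line: str) -> int:
    """Return the collection size from one CI diff line (last pipe-separated field)."""
    return int(ci_line.rsplit("|", 1)[1].strip())


def relative_change(old: int, new: int) -> float:
    """Fractional change from old to new (negative means fewer hits)."""
    return (new - old) / old


# Lines copied from the CI report quoted in this ticket.
old_line = "1252: < DecoderandReco | gaushit |  | std::vector<recob::Hit> | 46923"
new_line = "1256: > DecoderandReco | gaushit |  | std::vector<recob::Hit> | 46905"

old_hits = hit_count(old_line)
new_hits = hit_count(new_line)
delta = new_hits - old_hits
frac = relative_change(old_hits, new_hits)
print(f"{old_hits} -> {new_hits}: {delta:+d} hits ({100 * frac:+.3f}%)")
```

For the numbers above this gives 18 fewer hits, roughly 0.04% of the collection.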

#5 Updated by David Adams 5 months ago

I am back looking at this. I see two places where we define the standard dataprep tool sequences, tools.


I think the definitions are the same there (same names and same sequences). Tingjun, can you explain the purpose of each of these files?

The duplication is not a good idea because one will have to remember to make changes in both places, and users like myself are confused about which is actually included in protodune production reco. I propose to move these definitions to a dedicated file, maybe DataPrep/fcl/prototodune_dataprep_services.fcl, that is then included in both of the above files.

Any objections?

#6 Updated by David Rivera 5 months ago

Hi David,


is what we use in the refactored simulations for PDSP.

I agree with your suggested change of creating a dedicated file for dataprep to include in both service configuration files. Thanks for noticing this.

#7 Updated by David Adams 4 months ago

I am working on this now. The service configs are being moved to


I am testing that the dataprep results for one data event are unchanged.

I am also moving the simulation configuration protodune_dataprep_tools_sim. Can someone point me to a simulation file I can use to test reco there?



#8 Updated by Tingjun Yang 4 months ago

Here is a MC file:

The standard MC reco fcl file is:

#9 Updated by David Adams 4 months ago

As expected, the new reco looks identical to the old. Compare the new (reco_dataprep.1) and old (reco_dataprep.0) at

I have pushed the mods to take both data and sim from the new config file.

Christoph, when the next CI tests run, could you let us know if any changes appear? Thank you.

#10 Updated by David Adams 4 months ago

Thanks Tingjun. I assume the CI will cover it but I will also do my own check of the sim data using your file.

#11 Updated by David Adams 4 months ago

I confirm the new sim (reco_dataprep_sim.1 at the same URL as above) is the same as the old (reco_dataprep_sim.0).

#12 Updated by David Adams 4 months ago

Changing gears for a minute, I report that PDSP channel 325 (APA 3 u) was added to the 2018 and 2019 noisy channel lists last week. Its pedestal is mostly stuck about 8 ADC counts away from the presumably true pedestal position, with occasional forays to that position.

A typical waveform can be seen here:

#13 Updated by Christoph Alt 4 months ago

The CI test doesn't show changes in data product sizes.

#14 Updated by David Adams 4 months ago

Christoph, thanks for confirming the last change had no effect.

I have committed another change: the flagging of sticky codes (not the mitigation) is moved to the start of the dataprep tool sequence. This is expected to have no effect on reco and the plots at
indicate this is the case.

To be very careful, I will wait for confirmation that no change is seen in the CI tests before moving on to the next change.

#15 Updated by Tingjun Yang 4 months ago

Unfortunately, I made another change to the hit finder which changed the reconstruction result. The larsoft version has also changed, so the CI test does not work.

I have backed out my change. Once Christoph has updated the dunetpc dependency on larsoft, we can check if David's change has any impact.

#16 Updated by Christoph Alt 4 months ago

The CI test does report a small change in the hit collection and downstream reco:

1259: < DecoderandReco | gaushit | | std::vector<recob::Hit> | 47091
1260: ---
1261: > DecoderandReco | gaushit | | std::vector<recob::Hit> | 47065

This is after moving to larsoft v08_45_00. I don't see changes in the hit finder from larsoft v08_44_00 to larsoft v08_45_00 or other changes in dunetpc that could explain the change in the hit collection, so this is likely due to your commit. You could back your commit out if you want to confirm this.

#17 Updated by Tingjun Yang 4 months ago

Hi David,

Are those changes expected?

#18 Updated by David Adams 4 months ago

No, I did not expect any changes. I will follow Christoph's suggestion and back the change out. --david

#19 Updated by David Adams 4 months ago

I take it back. The pedestal fitter is configured to ignore sticky codes (I forgot about that) and so small changes from the switch are to be expected. The fitter tries to identify and remove sticky codes before fitting but may not always succeed and cannot remove more than one. I believe all is OK now.
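
To illustrate why flagging sticky codes before the pedestal step can shift the result slightly, here is a toy sketch. It is not the dataprep pedestal fitter: it uses a plain mean instead of a fit, the sticky-code rule (six low bits all 0 or all 1) is an assumption made for the example, and the waveform values and function names are hypothetical.

```python
from statistics import mean


def is_sticky(adc: int) -> bool:
    """Assumed sticky-code rule: the six low bits are all 0 or all 1."""
    return adc % 64 in (0, 63)


def pedestal(samples, ignore_sticky: bool) -> float:
    """Toy pedestal estimate: mean of the samples, optionally skipping flagged codes."""
    kept = [s for s in samples if not (ignore_sticky and is_sticky(s))]
    # Fall back to all samples if everything was flagged.
    return mean(kept if kept else samples)


# Hypothetical waveform: pedestal near 900 with repeated sticking at 895 (895 % 64 == 63).
samples = [900, 901, 895, 899, 895, 900, 902, 895, 898, 901]

with_sticky = pedestal(samples, ignore_sticky=False)
without_sticky = pedestal(samples, ignore_sticky=True)
print(f"pedestal using all samples:       {with_sticky:.2f}")
print(f"pedestal ignoring flagged sticky: {without_sticky:.2f}")
```

In this toy case the estimate moves by about 1.5 ADC counts once the flagged samples are skipped. A shift of that kind in the real pedestal could plausibly move a few hits across threshold, consistent with the small change in the hit count.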

#20 Updated by Christoph Alt 4 months ago

I will update the reference files.
