Bug #18943

Problem with sumdata::RunData aggregation

Added by Tingjun Yang over 2 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Category: Data products
Target version: -
Start date: 02/09/2018
Due date:
% Done: 100%
Estimated time: 1.00 h
Spent time:
Occurs In:
Experiment: LArSoft
Co-Assignees:
Duration:

Description

Dear LArSoft experts,

We are having trouble reading all the DUNE/ProtoDUNE MC files produced with larsoft v06_60_00 when using the latest release (v06_67_01), probably due to the recent upgrade of art.
lar -c eventdump.fcl /pnfs/dune/tape_backed/dunepro/mcc10/mc/full-reconstructed/02/04/78/50/mcc10_protodune_beam_p1GeV_cosmics_3ms_28_20171228T124233_merged0.root

%MSG-s ArtException:  PostEndJob 09-Feb-2018 11:00:30 CST ModuleEndJob
cet::exception caught in art
---- OtherArt BEGIN
  ---- ProductCannotBeAggregated BEGIN
    Products of type "sumdata::RunData" cannot be aggregated.
    Please contact artists@fnal.gov.
  ---- ProductCannotBeAggregated END
---- OtherArt END
%MSG

A workaround is to drop the sumdata::RunData products from the input:

source:
{
  module_type: RootInput
  inputCommands: [ "keep *", "drop sumdata::RunData_*_*_*" ]
}

But we don't want to patch all the fcl files in order to analyse the MCC10 files.

Thanks,
Tingjun


Related issues

Related to art - Support #18971: Products of type "sumdata::RunData" cannot be aggregated. (Closed, 02/12/2018)

Related to LArSoft - Necessary Maintenance #18997: Verify that the run and subrun level data products are written during the proper art phase (Closed, 02/14/2018)

Related to canvas - Bug #18996: Product aggregation logic does not correctly act on full-run range products (Closed, 02/14/2018)

History

#1 Updated by Alexander Himmel over 2 years ago

It looks like the real fix here is to define an aggregate() method for this data product so that art knows how to merge these products across runs. This appears to be a new requirement with the most recent version of art.

https://cdcvs.fnal.gov/redmine/projects/art/wiki/Product_aggregation_details

#2 Updated by Gianluca Petrillo over 2 years ago

  • Category set to Data products
  • Status changed from New to Assigned
  • Assignee set to Gianluca Petrillo
  • % Done changed from 0 to 80
  • Estimated time set to 1.00 h
  • Occurs In v06_67_00 added

It seems Alex has hit the nail on the head.
I have added a sumdata::POTSummary::aggregate() method that adds up all the fields of the data product.
Tingjun, could you test branch feature/gp_Issue18943 of larcoreobj to see if the issue is addressed?
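
For readers unfamiliar with the aggregation hook, here is a minimal sketch of the "add all the fields" approach described above. It assumes the hook has the shape `void aggregate(T const&)`. `BeamSummary`, its field names, and `combine()` are hypothetical stand-ins for illustration only; this is not the actual sumdata::POTSummary code from the feature branch.

```cpp
#include <cassert>

// Hypothetical stand-in for a summary data product that counts
// protons on target and spills; NOT the real sumdata::POTSummary.
struct BeamSummary {
  double totalPOT = 0.0;
  double goodPOT = 0.0;
  int totalSpills = 0;
  int goodSpills = 0;

  // Assumed hook shape: fold the content of `other` into this object.
  // Every field is a plain counter, so aggregation is a field-wise sum.
  void aggregate(BeamSummary const& other) {
    // Sanity check on the incoming fragment (illustrative only).
    assert(other.goodSpills <= other.totalSpills);
    totalPOT += other.totalPOT;
    goodPOT += other.goodPOT;
    totalSpills += other.totalSpills;
    goodSpills += other.goodSpills;
  }
};

// Hypothetical helper: merge two fragments of the same product,
// as would be needed when they come from different input files.
inline BeamSummary combine(BeamSummary a, BeamSummary const& b) {
  a.aggregate(b);
  return a;
}
```

Because each field is a simple sum, the result does not depend on the order in which the fragments are combined.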

#3 Updated by Tingjun Yang over 2 years ago

Hi Gianluca,

I got the following messages when I tried to compile my local release:

[546/1955] Linking CXX executable larreco/bin/GausFitCache_test
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RecoBase.so, not found (try using -rpath or -rpath-link)
[555/1955] Linking CXX executable larreco/bin/HitAnaAlg_test
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)
[889/1955] Linking CXX executable dunetpc/bin/test_LarsoftHuffmanCompressService
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)
[960/1955] Linking CXX executable dunetpc/bin/test_AcdWireReader
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)
[972/1955] Linking CXX executable dunetpc/bin/test_AcdDigitReader
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)
[979/1955] Linking CXX executable dunetpc/bin/test_AdcRoiViewer
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)
[1017/1955] Linking CXX executable dunetpc/bin/test_StandardAdcWireBuildingService
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)
[1029/1955] Linking CXX executable dunetpc/bin/test_StandardRawDigitPrepService
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)
[1030/1955] Linking CXX executable dunetpc/bin/test_StandardRawDigitExtractService
/usr/bin/ld: warning: liblarcoreobj_SummaryData.so, needed by /cvmfs/fermilab.opensciencegrid.org/products/larsoft/lardataobj/v1_28_00/slf6.x86_64.e15.prof/lib/liblardataobj_RawData.so, not found (try using -rpath or -rpath-link)

and some similar warnings in other repositories. However, the compilation finished. I got the same error when running eventdump.fcl on the file.

#4 Updated by Alexander Himmel over 2 years ago

Credit really goes to Chris Backhouse. He spotted the fix.

#5 Updated by Kyle Knoepfel over 2 years ago

  • Related to Support #18971: Products of type "sumdata::RunData" cannot be aggregated. added

#6 Updated by Gianluca Petrillo over 2 years ago

Tingjun Yang wrote:

Hi Gianluca,

I got the following messages when I tried to compile my local release:
[...]

and some similar warnings in other repositories. However, the compiling finished. I got the same error when running eventdump.fcl on the file.

I can't imagine the cause of that... have you walked the annoying path of zapping (mrb z) and starting from a new shell?

And, more to the point: could you run the test, or did this warning prevent it from running?

#7 Updated by Tingjun Yang over 2 years ago

Hi Gianluca,

I tried again from scratch and ran into the same problem. I was able to run the test and got the same error:

%MSG-s ArtException:  PostEndJob 12-Feb-2018 13:09:06 CST ModuleEndJob
cet::exception caught in art
---- OtherArt BEGIN
  ---- ProductCannotBeAggregated BEGIN
    Products of type "sumdata::RunData" cannot be aggregated.
    Please contact artists@fnal.gov.
  ---- ProductCannotBeAggregated END
---- OtherArt END
%MSG
Art has completed and will exit with status 1.

Here is my local release if you are interested: /dune/app/users/tjyang/larsoft_mydev/

Tingjun

#8 Updated by Gianluca Petrillo over 2 years ago

My bad there... I did not read the error well enough, and my brain was convinced sumdata::RunData would not be a problem.
I have provided an aggregate() method for that class too, which throws an exception if we try to mix two different detector names (that is, two different geometries) in the same run.
I hope this approach will not be a problem for Monte Carlo.
Code is in the same branch feature/gp_Issue18943 of larcoreobj. Could you give it a try again?
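
As a rough illustration of the behaviour described above (refuse to merge when the detector names differ), here is a minimal sketch. `RunInfo`, `detName`, `canMerge()`, and the use of std::runtime_error are all assumptions made for this example; the real sumdata::RunData code in the branch may differ, in particular in the exception type it throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a run-level product that records which
// detector (geometry) the run used; NOT the real sumdata::RunData.
struct RunInfo {
  std::string detName;

  // Assumed hook shape: merging is only meaningful when both
  // fragments describe the same detector, so a mismatch is an error.
  void aggregate(RunInfo const& other) {
    if (other.detName != detName) {
      throw std::runtime_error(
        "Cannot aggregate run data from different detectors: '" +
        detName + "' vs '" + other.detName + "'");
    }
    // Same detector: nothing to accumulate, the content is unchanged.
  }
};

// Hypothetical helper: report whether two fragments can be merged.
inline bool canMerge(RunInfo a, RunInfo const& b) {
  try {
    a.aggregate(b);
  } catch (std::runtime_error const&) {
    return false;
  }
  assert(a.detName == b.detName); // holds whenever aggregate() succeeded
  return true;
}
```

Throwing rather than silently keeping one of the two names makes a mixed-geometry run fail loudly, which is the behaviour this comment describes.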

#9 Updated by Gianluca Petrillo over 2 years ago

  • Status changed from Assigned to Work in progress

#10 Updated by Tingjun Yang over 2 years ago

Yeah, I confirm the problem is resolved. Thank you Gianluca.

#11 Updated by Gianluca Petrillo over 2 years ago

  • Status changed from Work in progress to Resolved
  • % Done changed from 80 to 100
  • Experiment LArSoft added
  • Experiment deleted (DUNE)

This change may be subtly breaking, so I will ask for the feature branch to be integrated in the coming release.
It should not take long. In the meantime, you can use the feature branch directly.

#12 Updated by Kyle Knoepfel over 2 years ago

To Alex's comment: this is not a requirement with the most recent version of art. This has been a requirement since art 2.01.01. It's just that your current use of art has just now exposed this requirement.

#13 Updated by Lynn Garren over 2 years ago

This fix is in the larsoft v06_68_00 release.

#14 Updated by Gianluca Petrillo over 2 years ago

#15 Updated by Gianluca Petrillo over 2 years ago

Kyle Knoepfel wrote:

To Alex's comment: this is not a requirement with the most recent version of art. This has been a requirement since art 2.01.01. It's just that your current use of art has just now exposed this requirement.

After some analysis, Kyle found that the change that suddenly exposed this problem was the resolution of issue #18384 in 2.09.03, also merged as patch version 2.05.01.
These are used for LArSoft versions v06_65_00 to date, and for versions v06_26_01_09 and v06_26_01_10 (see LArSoft release list page).

#16 Updated by Gianluca Petrillo over 2 years ago

An additional branch gp_Issue18943_for_v06_26_01_01 has been provided for merge into branch v06_26_01_01_branch.
This is needed for patch releases cut from the latter branch.

#17 Updated by Kyle Knoepfel over 2 years ago

  • Related to Bug #18996: Product aggregation logic does not correctly act on full-run range products added

#18 Updated by Gianluca Petrillo over 2 years ago

Tingjun, please make sure the input file that you mention here stays available for a while, or otherwise provide instructions on how to retrieve it from tape or wherever it is stored.
We may want to use it to test the fix that art will push for issue #18966.

#19 Updated by Tingjun Yang over 2 years ago

The file is on tape so will be there for a long time.

#20 Updated by Gianluca Petrillo over 2 years ago

  • Status changed from Resolved to Closed

