Bug #3058

FindOne logic error

Added by Andrei Gaponenko about 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Urgent
Category: Navigation
Target version:
Start date: 10/19/2012
Due date:
% Done: 100%
Estimated time:
Occurs In:
Scope: Internal
Experiment: -
SSI Package:
Duration:

Description

Hello,

I have a data product 

   typedef art::Assns<SimParticle, MARSInfo>  SimParticleMARSAssns;

In a writing job         

   for(... i ...) {
      ...
      assns->addSingle(*i, art::Ptr<MARSInfo>(infoPID, info->size()-1, infoGetter));

      cout<<"Added assn: "<<i->key()<<" : "<<info->size()-1<<endl;
   } 

The SimParticle ptrs are taken from a std::set<art::Ptr<SimParticle>>,
so it is guaranteed that a SimParticle can get no more than one
MARSInfo associated. This is confirmed by a printout from the writing
job:

Added assn: 960 : 0
Added assn: 971 : 1
Added assn: 1021 : 2
...
Added assn: 5486 : 49
Added assn: 5497 : 50
Added assn: 5520 : 51

In a reading job, I get an exception when trying to create a FindOne object:

  art::FindOne<MARSInfo> mif(particles, event, marsInfoTag_);

    ---- LogicError BEGIN
      Attempted to create a FindOne object for a one-many or many-many association specified in collection InputTag:  label = SimParticleMARSAssnsMaker, instance = .
      cet::exception going through module ExtMonFNALHitMaker/pixelDigitization run: 1 subRun: 0 event: 1
    ---- LogicError END

Printing out Assns in the reading job:

      art::Handle<SimParticleMARSAssns> ah;
      event.getByLabel(marsInfoModuleLabel_, marsInfoInstanceName_, ah);
      for(SimParticleMARSAssns::assn_iterator i=ah->begin(), iend = ah->end(); i != iend; ++i) {
        std::cout<<"SimParticleMARSAssns: "<<i->first.key()<<"  : "<<i->second.key()<<std::endl;
      }

I get:

SimParticleMARSAssns: 960 : 0
SimParticleMARSAssns: 971 : 1
SimParticleMARSAssns: 1021 : 2
...
SimParticleMARSAssns: 5486 : 49
SimParticleMARSAssns: 5497 : 50
SimParticleMARSAssns: 5520 : 51
SimParticleMARSAssns: 960 : 0
SimParticleMARSAssns: 971 : 1
SimParticleMARSAssns: 1021 : 2
...
SimParticleMARSAssns: 5486 : 49
SimParticleMARSAssns: 5497 : 50
SimParticleMARSAssns: 5520 : 51

that is, the original set of pairs inserted in the writing job appears twice.
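To illustrate why a doubled list of pairs trips a one-to-one requirement, here is a minimal sketch. It is not art's implementation: `KeyPair`, `isOneToOne`, and `doubled` are hypothetical names, and the check simply assumes that a one-to-one association must not contain the same left-hand key more than once.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical stand-in for the (SimParticle key, MARSInfo key) pairs
// shown in the printouts; not an art type.
using KeyPair = std::pair<std::size_t, std::size_t>;

// Returns true if no left-hand key occurs more than once.
bool isOneToOne(std::vector<KeyPair> const& assns) {
  std::set<std::size_t> seen;
  for (auto const& p : assns) {
    if (!seen.insert(p.first).second) {
      return false;  // left key already seen: looks like one-to-many
    }
  }
  return true;
}

// Returns the input followed by a second copy of itself, mimicking the
// doubled contents seen in the reading job.
std::vector<KeyPair> doubled(std::vector<KeyPair> const& assns) {
  std::vector<KeyPair> out(assns);
  out.insert(out.end(), assns.begin(), assns.end());
  return out;
}
```

With the pairs `{960, 0}`, `{971, 1}`, `{1021, 2}`, `isOneToOne` accepts the original list but rejects the doubled one, because every left-hand key then appears twice even though no genuinely new association was added.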

Why is FindOne not working?
Andrei

Associated revisions

Revision 96220a5f (diff)
Added by Christopher Green about 7 years ago

Fix issue #3058: duplication of Assns with multiple output files.

Revision 21b92603 (diff)
Added by Christopher Green about 7 years ago

Fix issue #3058: duplication of Assns with multiple output files.

Revision 6e325fe9 (diff)
Added by Christopher Green about 7 years ago

Fix issue #3058: duplication of Assns with multiple output files.

History

#1 Updated by Christopher Green about 7 years ago

  • Category set to Navigation
  • Status changed from New to Assigned
  • Assignee set to Christopher Green

Hi Andrei,

Please attach a tarball of your Offline subdirectory, or tell me whence I can copy it. I'll also need instructions on how to duplicate the problem and details of exactly how you're setting up the code (1.00.08 vs 1.00.11, for example).

Thanks,
Chris.

#2 Updated by Andrei Gaponenko about 7 years ago

Hi Chris,

This is with art v1.00.11. I'll post again when the tarball is ready.

Andrei

#3 Updated by Andrei Gaponenko about 7 years ago

I reproduced the problematic write/read chain on detsim, in

/mu2e/app/users/gandr/FindOneLogicError/

The Offline subdirectory is the code build. Set up with

source /grid/fermiapp/products/mu2e/setupmu2e-art.sh
source /mu2e/app/users/gandr/FindOneLogicError/Offline/setup_g4951.sh

The /mu2e/app/users/gandr/FindOneLogicError/run
subdirectory contains outputs from the writer job

/usr/bin/time mu2e -c ./g4s2_emfMARSBox.fcl > testlog-g4s2.log 2>&1 &

and the reader

/usr/bin/time mu2e -c ./digiFilteredSim.fcl vdg4s2EMFMARSBoxFiltered.root > testlog-digi.log 2>&1 &

The latter throws the exception.

Andrei

#4 Updated by Christopher Green about 7 years ago

Hi,

Please reduce the .fcl files to the minimum needed to reproduce the result: when executing the second job I got:

<snip/>
Begin processing the 1st record. run: 1 subRun: 0 event: 1 at 20-Oct-2012 15:02:42 CDT
AG: ExtinctionMonitorFNAL/Digitization/src/ProtonPulseShape.cc, line 68, func apply: here
SimParticleMARSAssns:  ( 38 : 0 )
SimParticleMARSAssns:  ( 96 : 1 )
SimParticleMARSAssns:  ( 103 : 2 )
SimParticleMARSAssns:  ( 128 : 3 )
SimParticleMARSAssns:  ( 169 : 4 )
SimParticleMARSAssns:  ( 286 : 5 )
SimParticleMARSAssns:  ( 414 : 6 )
SimParticleMARSAssns:  ( 426 : 7 )
SimParticleMARSAssns:  ( 478 : 8 )
SimParticleMARSAssns:  ( 496 : 9 )
SimParticleMARSAssns:  ( 567 : 10 )
SimParticleMARSAssns:  ( 597 : 11 )
SimParticleMARSAssns:  ( 609 : 12 )
AG: ExtinctionMonitorFNAL/Digitization/src/ProtonPulseShape.cc, line 78, func apply: here
AG: ExtinctionMonitorFNAL/Digitization/src/ProtonPulseShape.cc, line 80, func apply: here
Applying ProtonPulseShape in pixel digitization. marsMode=1, pulseHalfWidth=100
AG: ExtinctionMonitorFNAL/Digitization/src/ProtonPulseShape.cc, line 146, func getPrimaryMARSId: primary sim=128, pdgId=2212
AG: ExtinctionMonitorFNAL/Digitization/src/ProtonPulseShape.cc, line 151, func getPrimaryMARSId: primary sim=128, pdgId=2212
AG: ExtinctionMonitorFNAL/Digitization/src/ProtonPulseShape.cc, line 153, func getPrimaryMARSId: here
%MSG-s ArtException:  PostPathEndRun eprint PostEndRun
cet::exception caught in art
---- EventProcessorFailure BEGIN
  An exception occurred during current event processing
  ---- ScheduleExecutionFailure BEGIN
    ProcessingStopped.

    ---- StdException BEGIN
      A std::exception occurred during a call to the module ExtMonFNALHitMaker/pixelDigitization run: 1 subRun: 0 event: 1
      and cannot be repropagated.
      Previous information:
      vector::_M_range_check
    ---- StdException END
    Exception going through path tprint
  ---- ScheduleExecutionFailure END
  cet::exception caught in EventProcessor and rethrown
  ------------------------------------------------------------
  Another exception was caught while trying to clean up files after
  the primary exception.  We give up trying to clean up files at
  this point.  The description of this additional exception follows:
  cet::exception
  ---- FatalRootError BEGIN
    Fatal Root Error: @SUB=TTree::SetEntries
    Tree branches have different numbers of entries, with 5 maximum.
  ---- FatalRootError END
---- EventProcessorFailure END
In other words, something hinky is going on. The initial printout of the contents of the assns looks perfectly fine though, and is consistent with the output from the producer job:
<snip/>
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 38 : 0
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 96 : 1
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 103 : 2
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 128 : 3
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 169 : 4
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 286 : 5
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 414 : 6
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 426 : 7
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 478 : 8
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 496 : 9
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 567 : 10
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 597 : 11
AG: ExtinctionMonitorFNAL/TruthAlgs/src/SimParticleMARSAssnsMaker_module.cc, line 114, func produce: Added assn: 609 : 12

Thanks,
Chris.

#5 Updated by Andrei Gaponenko about 7 years ago

All right, a simplified version is in /mu2e/app/users/gandr/FindOneLogicError/run2

Rerunning the same exact commands

/usr/bin/time mu2e -c ./g4s2_emfMARSBox.fcl > testlog-g4s2.log 2>&1

/usr/bin/time mu2e -c ./digiFilteredSim.fcl vdg4s2EMFMARSBoxFiltered.root > testlog-digi.log 2>&1

I get the same result as before, with the LogicError.

The range exception you got is related to a simplification I made after running the first jobs and before making the previous post. Writing out just outFiltered gives a range error in the reading job; writing out both outFiltered and outFull gives the LogicError.

Andrei

#6 Updated by Christopher Green about 7 years ago

  • Status changed from Assigned to Resolved
  • Target version set to 1.02.05
  • % Done changed from 0 to 100

The cause of this issue has been identified and fixed on the HEAD with 21b92603f86643f0125e393372068c46ddba8bd7 and on v1_00_11-fixes with 1d2fc6213042ba14a01f74e75a14eeefcf805192. A regression test has been put in place with 9e2c9245220c4746df7a53ef16a18dbc99cce263 and 8e6757c93b749f53f18dd2aec872eadf78d587d0 respectively.

The custom streamer for the Assns class was not protected against multiple calls to write, which happen when there are multiple output streams. If the configuration specifies multiple output modules, each Assns object will be written with as many copies of its data as its output module's place in line: one copy by the first output module, two by the second, and so on. In the absence of a release, a simple (although not necessarily convenient) workaround is to have only one output module in any job where producers create at least one Assns product.
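The failure mode can be sketched with a toy model. This is not the art::Assns streamer and not the actual fix: `ToyAssns` and its member functions are hypothetical, and the sketch only assumes that each write call fills a persistent buffer from the in-memory pairs.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical toy model of a product with a transient (in-memory) and a
// persistent (to-be-written) representation; not the art::Assns class.
class ToyAssns {
public:
  void addSingle(std::size_t left, std::size_t right) {
    transient_.emplace_back(left, right);
  }

  // Unguarded write: every call appends the transient pairs again, so
  // the Nth call leaves N copies in the persistent buffer.
  void writeUnguarded() {
    persistent_.insert(persistent_.end(), transient_.begin(), transient_.end());
  }

  // Guarded write: the persistent buffer is rebuilt from scratch on each
  // call, so repeated calls give the same result.
  void writeGuarded() {
    persistent_.assign(transient_.begin(), transient_.end());
  }

  std::size_t persistentSize() const { return persistent_.size(); }

private:
  std::vector<std::pair<std::size_t, std::size_t>> transient_;
  std::vector<std::pair<std::size_t, std::size_t>> persistent_;
};
```

Calling `writeUnguarded()` once per output module reproduces the pattern described above: the second module sees the data twice. With `writeGuarded()` the number of calls no longer matters, which is one possible way to make a write step safe against being invoked more than once.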

We apologize for the problems caused by this bug.

#7 Updated by Christopher Green about 7 years ago

  • Status changed from Resolved to Closed

