Bug #22716

LArG4/LArVoxelReadout (larsoft v06_61_00 / larsim v06_38_01) gets stuck in an endless loop until available RAM is exhausted

Added by Johnny Ho 3 months ago. Updated 3 months ago.

Status: Closed
Priority: Low
Assignee: -
Category: Simulation
Target version: -
Start date: 06/11/2019
Due date:
% Done: 0%
Estimated time:
Occurs In:
Experiment: LArIAT
Co-Assignees:
Duration:

Description

Hello!

We are trying to produce a sample of data-driven Monte Carlo in LArIAT as part of our MC production campaign, and we are running into an issue with the simulation in larsoft v06_61_00 / larsim v06_38_01. We are using the TextFileGen module to generate our MC events, which include beam halo pile-up muons that enter through the front of the TPC and through the cathode side of the TPC. We are able to generate events when we only have pile-up muons that enter through the front of the TPC. However, when we add pile-up muons that enter through the cathode side, our jobs get stuck in an endless loop until the available RAM is exhausted. Here is an excerpt from one of our log files (/lariat/data/users/johnnyho/larvoxelreadout_bug/log):

%MSG-w LArVoxelReadout:  LArG4:largeant@ 11-Jun-2019 09:08:28 CDT  run: 1 subRun: 0 event: 4
unable to drift electrons from point (7.7946,-18.8803,89.97) with exception ---- Geometry BEGIN
  Can't find nearest wire for position (-0.2,-18.8417,90.0442) in plane C:0 T:0 P:0 approx wire number # 239 (capped from 240)
---- Geometry END

%MSG
%MSG-w LArVoxelReadout:  LArG4:largeant@ 11-Jun-2019 09:08:28 CDT  run: 1 subRun: 0 event: 4
unable to drift electrons from point (7.82344,-18.8206,89.97) with exception ---- Geometry BEGIN
  Can't find nearest wire for position (-0.2,-18.8225,90.0841) in plane C:0 T:0 P:0 approx wire number # 239 (capped from 240)
---- Geometry END

%MSG
%MSG-w LArVoxelReadout:  LArG4:largeant@ 11-Jun-2019 09:08:28 CDT  run: 1 subRun: 0 event: 4
unable to drift electrons from point (7.82267,-18.8212,89.9925) with exception ---- Geometry BEGIN
  Can't find nearest wire for position (-0.2,-18.7994,90.0676) in plane C:0 T:0 P:0 approx wire number # 239 (capped from 240)
---- Geometry END

  ... 

%MSG
%MSG-w LArVoxelReadout:  LArG4:largeant@ 11-Jun-2019 11:07:38 CDT  run: 1 subRun: 0 event: 4
unable to drift electrons from point (3.66917,-19.575,89.9451) with exception ---- Geometry BEGIN
  Can't find nearest wire for position (-0.2,-19.5978,89.9533) in plane C:0 T:0 P:0 approx wire number # 239 (capped from 240)
---- Geometry END

It seems like the simulation is having trouble with charged particles that pass through the edges and/or corners of the active volume? (Active volume of the LArIAT TPC: 0 cm < x < 47.5 cm, −20 cm < y < 20 cm, 0 cm < z < 90 cm.)
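As a rough way to test the edge/corner hypothesis against the log, the coordinates in each warning can be compared with the active-volume bounds quoted above. The following is only a minimal sketch in Python: the helper names (`outside_axes`, `check_line`) are made up for illustration and are not part of LArSoft, and the bounds are copied from the numbers in this report rather than read from the GDML geometry.

```python
import re

# Active volume quoted in this report, in cm (assumed, not read from the GDML).
ACTIVE_VOLUME = {"x": (0.0, 47.5), "y": (-20.0, 20.0), "z": (0.0, 90.0)}

# Matches the first "(x,y,z)" triple of decimal numbers on a log line.
NUMBER = r"(-?\d+(?:\.\d+)?)"
POINT_RE = re.compile(r"\(\s*" + NUMBER + r"\s*,\s*" + NUMBER + r"\s*,\s*" + NUMBER + r"\s*\)")


def outside_axes(point, volume=ACTIVE_VOLUME):
    """Return the axes on which the point is not strictly inside the volume."""
    return [
        axis
        for axis, value in zip("xyz", point)
        if not (volume[axis][0] < value < volume[axis][1])
    ]


def check_line(line):
    """Return (point, axes outside the volume), or None if no point is found."""
    match = POINT_RE.search(line)
    if match is None:
        return None
    point = tuple(float(group) for group in match.groups())
    return point, outside_axes(point)


if __name__ == "__main__":
    # Two lines taken from the excerpt above.
    for line in (
        "unable to drift electrons from point (7.7946,-18.8803,89.97) with exception",
        "Can't find nearest wire for position (-0.2,-18.8417,90.0442) in plane C:0 T:0 P:0",
    ):
        print(check_line(line))
```

For the two sample lines, the starting point lies inside the quoted bounds, while the position in the "Can't find nearest wire" message falls outside them in x and z. This only flags where the reported coordinates sit relative to the quoted volume; it does not by itself identify the cause of the loop.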

We've submitted 500 jobs to the grid with 200 events per job, and every single job gets held after running out of memory from these endless loops. (All the jobs finish just fine when we don't include beam halo pile-up muons that enter through the cathode side of the TPC.)

Here are the steps to reproduce the problem:

source /grid/fermiapp/larsoft/products/setup
export PRODUCTS=/grid/fermiapp/products/lariat:${PRODUCTS}:/grid/fermiapp/products/common/db
setup lariatsoft v06_61_00_03 -q e14:prof

lar -c /lariat/data/users/johnnyho/larvoxelreadout_bug/mc_gen_run2_negative_100a.fcl

History

#1 Updated by Gianluca Petrillo 3 months ago

Not really my business, but I think it would help a bit if you could take the extra step of generating a ROOT file with TextGen (no LArG4), finding the event where it gets stuck, and reporting the location of the file, the number of the event in that file, and a simple LArG4 FHiCL configuration that reproduces the stuck event.

#2 Updated by Gianluca Petrillo 3 months ago

Also, can you swear on the correctness of the GDML description? What you describe might be caused by a particle trying to cross a region where volumes overlap. GEANT4 can verify that there are no overlaps (ROOT can too, but it uses different infrastructure and I would rather ask GEANT4 in this case).

#3 Updated by Johnny Ho 3 months ago

  • Priority changed from High to Low
  • Status changed from New to Closed

This was actually a stupid mistake on my part. Closing the issue.


