Support #17715
Large memory usage in SpacePointSolver for MicroBooNE
Description
Dear LArSoft experts,
Chris Backhouse's SpacePointSolver was recently added to larreco. This module produces 3D space points using hits as input. However, it uses a large amount of memory in a MicroBooNE job. Here is the command to reproduce the problem:
lar -c reco3djob_uboone.fcl /uboone/data/users/vmeddage/test_trajcluster.root --nskip 40 -n 2
One needs to use the fcl file from the develop HEAD.
Here are the details from the MemoryTrack db using Marc Paterno's R package 'artsupport':
> head(mods,16)
# A tibble: 16 x 9
   Step                Run SubRun Event Path     ModuleLabel    ModuleType               Vsize       RSS
   <chr>             <int>  <int> <int> <chr>    <chr>          <chr>                    <dbl>     <dbl>
 1 PreProcessModule      1     27  1341 reco     reco3d         SpacePointSolver      1687.863  617.4679
 2 PostProcessModule     1     27  1341 reco     reco3d         SpacePointSolver      2327.364 1255.7189
 3 PreProcessModule      1     27  1341 reco     TriggerResults TriggerResultInserter 2327.364 1255.7189
 4 PostProcessModule     1     27  1341 reco     TriggerResults TriggerResultInserter 2327.364 1255.7189
 5 PreProcessModule      1     27  1341 end_path out1           RootOutput            2327.364 1255.7189
 6 PostProcessModule     1     27  1341 end_path out1           RootOutput            2327.364 1255.7189
 7 PreWriteEvent         1     27  1341 end_path out1           RootOutput            2327.364 1257.1279
 8 PostWriteEvent        1     27  1341 end_path out1           RootOutput            4054.274 2972.6188
 9 PreProcessModule      1     27  1342 reco     reco3d         SpacePointSolver      4054.274 2973.4052
10 PostProcessModule     1     27  1342 reco     reco3d         SpacePointSolver      4547.080 3460.8374
11 PreProcessModule      1     27  1342 reco     TriggerResults TriggerResultInserter 4547.080 3460.8374
12 PostProcessModule     1     27  1342 reco     TriggerResults TriggerResultInserter 4547.080 3460.8374
13 PreProcessModule      1     27  1342 end_path out1           RootOutput            4547.080 3460.8374
14 PostProcessModule     1     27  1342 end_path out1           RootOutput            4547.080 3460.8374
15 PreWriteEvent         1     27  1342 end_path out1           RootOutput            4547.080 3460.8374
16 PostWriteEvent        1     27  1342 end_path out1           RootOutput            4547.080 3462.8731
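To see where the growth happens, one option is to difference consecutive RSS values and rank the steps. The following is only a rough sketch in Python: the RSS numbers are copied from the 16 rows above, while the tuple layout and variable names are illustrative and not part of artsupport or MemoryTracker.

```python
# (Step, Event, ModuleLabel, RSS) for the 16 rows above, in order.
# RSS is in the units reported by the MemoryTracker db.
rows = [
    ("PreProcessModule", 1341, "reco3d", 617.4679),
    ("PostProcessModule", 1341, "reco3d", 1255.7189),
    ("PreProcessModule", 1341, "TriggerResults", 1255.7189),
    ("PostProcessModule", 1341, "TriggerResults", 1255.7189),
    ("PreProcessModule", 1341, "out1", 1255.7189),
    ("PostProcessModule", 1341, "out1", 1255.7189),
    ("PreWriteEvent", 1341, "out1", 1257.1279),
    ("PostWriteEvent", 1341, "out1", 2972.6188),
    ("PreProcessModule", 1342, "reco3d", 2973.4052),
    ("PostProcessModule", 1342, "reco3d", 3460.8374),
    ("PreProcessModule", 1342, "TriggerResults", 3460.8374),
    ("PostProcessModule", 1342, "TriggerResults", 3460.8374),
    ("PreProcessModule", 1342, "out1", 3460.8374),
    ("PostProcessModule", 1342, "out1", 3460.8374),
    ("PreWriteEvent", 1342, "out1", 3460.8374),
    ("PostWriteEvent", 1342, "out1", 3462.8731),
]

# RSS change attributed to each step: current row minus the previous row.
jumps = [
    (cur[3] - prev[3], cur[0], cur[1], cur[2])
    for prev, cur in zip(rows, rows[1:])
]

# Print the three largest increases, biggest first.
top3 = sorted(jumps, reverse=True)[:3]
for delta, step, event, label in top3:
    print(f"{delta:9.3f}  {step:<17} event {event}  {label}")

largest = top3[0]
```

In this sample the largest single increase lands on a PostWriteEvent row for out1, with the two reco3d PostProcessModule rows next.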
We do not see such an increase in memory usage for a similar DUNE job. We were also able to rule out a memory leak using the Valgrind tool Massif, with help from Gianluca.
We appreciate any help in understanding and solving this issue.
Tingjun
History
#1 Updated by Tingjun Yang over 3 years ago
Here is the memory usage for 10 events without writing art output (--no-output):
   Step                Run SubRun Event Path     ModuleLabel    ModuleType               Vsize       RSS
   <chr>             <int>  <int> <int> <chr>    <chr>          <chr>                    <dbl>     <dbl>
 1 PreProcessModule      1     27  1341 reco     reco3d         SpacePointSolver      1685.459  621.7441
 2 PostProcessModule     1     27  1341 reco     reco3d         SpacePointSolver      2324.419 1259.0203
 3 PreProcessModule      1     27  1341 reco     TriggerResults TriggerResultInserter 2324.419 1259.0203
 4 PostProcessModule     1     27  1341 reco     TriggerResults TriggerResultInserter 2324.419 1259.0203
 5 PreProcessModule      1     27  1342 reco     reco3d         SpacePointSolver      1863.377  799.8628
 6 PostProcessModule     1     27  1342 reco     reco3d         SpacePointSolver      2824.487 1756.4959
 7 PreProcessModule      1     27  1342 reco     TriggerResults TriggerResultInserter 2824.487 1756.4959
 8 PostProcessModule     1     27  1342 reco     TriggerResults TriggerResultInserter 2824.487 1756.4959
 9 PreProcessModule      1     27  1343 reco     reco3d         SpacePointSolver      2154.320 1090.8099
10 PostProcessModule     1     27  1343 reco     reco3d         SpacePointSolver      2162.250 1097.3594
11 PreProcessModule      1     27  1343 reco     TriggerResults TriggerResultInserter 2162.250 1097.3594
12 PostProcessModule     1     27  1343 reco     TriggerResults TriggerResultInserter 2162.250 1097.3594
13 PreProcessModule      1     27  1344 reco     reco3d         SpacePointSolver      2162.250 1097.3594
14 PostProcessModule     1     27  1344 reco     reco3d         SpacePointSolver      2263.466 1198.1169
15 PreProcessModule      1     27  1344 reco     TriggerResults TriggerResultInserter 2263.466 1198.1169
16 PostProcessModule     1     27  1344 reco     TriggerResults TriggerResultInserter 2263.466 1198.1169
17 PreProcessModule      1     27  1345 reco     reco3d         SpacePointSolver      2154.320 1090.9409
18 PostProcessModule     1     27  1345 reco     reco3d         SpacePointSolver      2451.575 1387.4422
19 PreProcessModule      1     27  1345 reco     TriggerResults TriggerResultInserter 2451.575 1387.4422
20 PostProcessModule     1     27  1345 reco     TriggerResults TriggerResultInserter 2451.575 1387.4422
21 PreProcessModule      1     27  1346 reco     reco3d         SpacePointSolver      2192.134 1128.6241
22 PostProcessModule     1     27  1346 reco     reco3d         SpacePointSolver      3922.612 2856.8945
23 PreProcessModule      1     27  1346 reco     TriggerResults TriggerResultInserter 3922.612 2856.8945
24 PostProcessModule     1     27  1346 reco     TriggerResults TriggerResultInserter 3922.612 2856.8945
25 PreProcessModule      1     27  1347 reco     reco3d         SpacePointSolver      2291.192 1227.6818
26 PostProcessModule     1     27  1347 reco     reco3d         SpacePointSolver      2431.361 1366.0365
27 PreProcessModule      1     27  1347 reco     TriggerResults TriggerResultInserter 2431.361 1366.0365
28 PostProcessModule     1     27  1347 reco     TriggerResults TriggerResultInserter 2431.361 1366.0365
29 PreProcessModule      1     27  1348 reco     reco3d         SpacePointSolver      2338.619 1275.1094
30 PostProcessModule     1     27  1348 reco     reco3d         SpacePointSolver      2400.309 1333.7887
31 PreProcessModule      1     27  1348 reco     TriggerResults TriggerResultInserter 2400.309 1333.7887
32 PostProcessModule     1     27  1348 reco     TriggerResults TriggerResultInserter 2400.309 1333.7887
33 PreProcessModule      1     27  1349 reco     reco3d         SpacePointSolver      2333.266 1269.8870
34 PostProcessModule     1     27  1349 reco     reco3d         SpacePointSolver      2333.807 1270.2966
35 PreProcessModule      1     27  1349 reco     TriggerResults TriggerResultInserter 2333.807 1270.2966
36 PostProcessModule     1     27  1349 reco     TriggerResults TriggerResultInserter 2333.807 1270.2966
37 PreProcessModule      1     27  1350 reco     reco3d         SpacePointSolver      2333.807 1270.2966
38 PostProcessModule     1     27  1350 reco     reco3d         SpacePointSolver      2400.985 1335.3124
39 PreProcessModule      1     27  1350 reco     TriggerResults TriggerResultInserter 2400.985 1335.3124
40 PostProcessModule     1     27  1350 reco     TriggerResults TriggerResultInserter 2400.985 1335.3124
It seems the memory increase seen before was associated with RootOutput. However, reco3d itself sometimes uses a lot of memory (e.g. in event 1346, where RSS grows from about 1129 to 2857 while the module runs).
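The per-event cost of reco3d can be read off by subtracting the PreProcessModule RSS from the PostProcessModule RSS for each event. The snippet below is a minimal sketch of that arithmetic in Python: the numbers are copied from the SpacePointSolver rows of the --no-output table, and the dictionary layout and names are just for illustration.

```python
# Event -> (RSS before reco3d, RSS after reco3d), taken from the
# SpacePointSolver Pre/PostProcessModule rows of the --no-output run.
rss = {
    1341: (621.7441, 1259.0203),
    1342: (799.8628, 1756.4959),
    1343: (1090.8099, 1097.3594),
    1344: (1097.3594, 1198.1169),
    1345: (1090.9409, 1387.4422),
    1346: (1128.6241, 2856.8945),
    1347: (1227.6818, 1366.0365),
    1348: (1275.1094, 1333.7887),
    1349: (1269.8870, 1270.2966),
    1350: (1270.2966, 1335.3124),
}

# RSS growth while the module processes each event.
growth = {event: post - pre for event, (pre, post) in rss.items()}

# Event with the largest growth.
worst = max(growth, key=growth.get)

for event, delta in growth.items():
    flag = "  <-- largest" if event == worst else ""
    print(f"event {event}: +{delta:9.4f}{flag}")
```

The growth varies widely from event to event, from under 1 (event 1349) to over 1700 (event 1346). The pre-module RSS also drops back between events, for example from 1756.4959 after event 1342 to 1090.8099 before event 1343, which is consistent with the module releasing its memory rather than leaking it.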
#2 Updated by Lynn Garren over 3 years ago
- Status changed from New to Assigned
- Assignee set to Christopher Backhouse
Chris, would you take a look at this? Let us know if you need help.