Support #17715

Large memory usage in SpacePointSolver for MicroBooNE

Added by Tingjun Yang about 3 years ago. Updated about 3 years ago.

Status: Assigned
Priority: Normal
Category: -
Target version: -
Start date: 09/14/2017
Due date:
% Done: 0%
Estimated time:
Experiment: MicroBooNE
Co-Assignees:
Duration:

Description

Dear LArSoft experts,

Chris Backhouse's SpacePointSolver was recently added to larreco. This module produces 3D points using hits as input. However, it uses a large amount of memory for a MicroBooNE job. The following command reproduces the problem:

lar -c reco3djob_uboone.fcl /uboone/data/users/vmeddage/test_trajcluster.root --nskip 40 -n 2

This requires the fcl file from the HEAD of develop.
Here are the details from the MemoryTracker db, obtained with Marc Paterno's R package 'artsupport':
> head(mods,16)
# A tibble: 16 x 9
                Step   Run SubRun Event     Path    ModuleLabel            ModuleType    Vsize       RSS
               <chr> <int>  <int> <int>    <chr>          <chr>                 <chr>    <dbl>     <dbl>
 1  PreProcessModule     1     27  1341     reco         reco3d      SpacePointSolver 1687.863  617.4679
 2 PostProcessModule     1     27  1341     reco         reco3d      SpacePointSolver 2327.364 1255.7189
 3  PreProcessModule     1     27  1341     reco TriggerResults TriggerResultInserter 2327.364 1255.7189
 4 PostProcessModule     1     27  1341     reco TriggerResults TriggerResultInserter 2327.364 1255.7189
 5  PreProcessModule     1     27  1341 end_path           out1            RootOutput 2327.364 1255.7189
 6 PostProcessModule     1     27  1341 end_path           out1            RootOutput 2327.364 1255.7189
 7     PreWriteEvent     1     27  1341 end_path           out1            RootOutput 2327.364 1257.1279
 8    PostWriteEvent     1     27  1341 end_path           out1            RootOutput 4054.274 2972.6188
 9  PreProcessModule     1     27  1342     reco         reco3d      SpacePointSolver 4054.274 2973.4052
10 PostProcessModule     1     27  1342     reco         reco3d      SpacePointSolver 4547.080 3460.8374
11  PreProcessModule     1     27  1342     reco TriggerResults TriggerResultInserter 4547.080 3460.8374
12 PostProcessModule     1     27  1342     reco TriggerResults TriggerResultInserter 4547.080 3460.8374
13  PreProcessModule     1     27  1342 end_path           out1            RootOutput 4547.080 3460.8374
14 PostProcessModule     1     27  1342 end_path           out1            RootOutput 4547.080 3460.8374
15     PreWriteEvent     1     27  1342 end_path           out1            RootOutput 4547.080 3460.8374
16    PostWriteEvent     1     27  1342 end_path           out1            RootOutput 4547.080 3462.8731

We do not see such an increase in memory usage for a similar DUNE job. We were also able to rule out a memory leak using the massif tool, with help from Gianluca.
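To make the jumps in the table above easier to spot, here is a minimal Python sketch that walks the rows in order and reports every step where RSS grows by at least a threshold. The RSS values are copied from the table; the function name `rss_jumps` and the threshold of 100 (in the table's units, presumably MB) are choices made for this illustration only and are not part of artsupport or art.

```python
# (Step, Event, ModuleLabel, RSS) copied from the MemoryTracker table above.
rows = [
    ("PreProcessModule", 1341, "reco3d", 617.4679),
    ("PostProcessModule", 1341, "reco3d", 1255.7189),
    ("PreProcessModule", 1341, "TriggerResults", 1255.7189),
    ("PostProcessModule", 1341, "TriggerResults", 1255.7189),
    ("PreProcessModule", 1341, "out1", 1255.7189),
    ("PostProcessModule", 1341, "out1", 1255.7189),
    ("PreWriteEvent", 1341, "out1", 1257.1279),
    ("PostWriteEvent", 1341, "out1", 2972.6188),
    ("PreProcessModule", 1342, "reco3d", 2973.4052),
    ("PostProcessModule", 1342, "reco3d", 3460.8374),
    ("PreProcessModule", 1342, "TriggerResults", 3460.8374),
    ("PostProcessModule", 1342, "TriggerResults", 3460.8374),
    ("PreProcessModule", 1342, "out1", 3460.8374),
    ("PostProcessModule", 1342, "out1", 3460.8374),
    ("PreWriteEvent", 1342, "out1", 3460.8374),
    ("PostWriteEvent", 1342, "out1", 3462.8731),
]


def rss_jumps(rows, threshold=100.0):
    """Return (step, event, label, delta) for each row whose RSS exceeds
    the previous row's RSS by at least `threshold` (hypothetical helper)."""
    jumps = []
    for prev, cur in zip(rows, rows[1:]):
        delta = cur[3] - prev[3]
        if delta >= threshold:
            jumps.append((cur[0], cur[1], cur[2], round(delta, 4)))
    return jumps


for step, event, label, delta in rss_jumps(rows):
    print(f"event {event} {label:>7} {step:<18} +{delta:.1f}")
```

With these numbers, three steps exceed the threshold: reco3d in event 1341, the RootOutput write of event 1341, and reco3d in event 1342. The write step is the largest of the three.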

We appreciate any help in understanding and solving this issue.

Tingjun

History

#1 Updated by Tingjun Yang about 3 years ago

Here is the memory usage for 10 events without writing art output (--no-output):

                Step   Run SubRun Event  Path    ModuleLabel            ModuleType    Vsize       RSS
               <chr> <int>  <int> <int> <chr>          <chr>                 <chr>    <dbl>     <dbl>
 1  PreProcessModule     1     27  1341  reco         reco3d      SpacePointSolver 1685.459  621.7441
 2 PostProcessModule     1     27  1341  reco         reco3d      SpacePointSolver 2324.419 1259.0203
 3  PreProcessModule     1     27  1341  reco TriggerResults TriggerResultInserter 2324.419 1259.0203
 4 PostProcessModule     1     27  1341  reco TriggerResults TriggerResultInserter 2324.419 1259.0203
 5  PreProcessModule     1     27  1342  reco         reco3d      SpacePointSolver 1863.377  799.8628
 6 PostProcessModule     1     27  1342  reco         reco3d      SpacePointSolver 2824.487 1756.4959
 7  PreProcessModule     1     27  1342  reco TriggerResults TriggerResultInserter 2824.487 1756.4959
 8 PostProcessModule     1     27  1342  reco TriggerResults TriggerResultInserter 2824.487 1756.4959
 9  PreProcessModule     1     27  1343  reco         reco3d      SpacePointSolver 2154.320 1090.8099
10 PostProcessModule     1     27  1343  reco         reco3d      SpacePointSolver 2162.250 1097.3594
11  PreProcessModule     1     27  1343  reco TriggerResults TriggerResultInserter 2162.250 1097.3594
12 PostProcessModule     1     27  1343  reco TriggerResults TriggerResultInserter 2162.250 1097.3594
13  PreProcessModule     1     27  1344  reco         reco3d      SpacePointSolver 2162.250 1097.3594
14 PostProcessModule     1     27  1344  reco         reco3d      SpacePointSolver 2263.466 1198.1169
15  PreProcessModule     1     27  1344  reco TriggerResults TriggerResultInserter 2263.466 1198.1169
16 PostProcessModule     1     27  1344  reco TriggerResults TriggerResultInserter 2263.466 1198.1169
17  PreProcessModule     1     27  1345  reco         reco3d      SpacePointSolver 2154.320 1090.9409
18 PostProcessModule     1     27  1345  reco         reco3d      SpacePointSolver 2451.575 1387.4422
19  PreProcessModule     1     27  1345  reco TriggerResults TriggerResultInserter 2451.575 1387.4422
20 PostProcessModule     1     27  1345  reco TriggerResults TriggerResultInserter 2451.575 1387.4422
21  PreProcessModule     1     27  1346  reco         reco3d      SpacePointSolver 2192.134 1128.6241
22 PostProcessModule     1     27  1346  reco         reco3d      SpacePointSolver 3922.612 2856.8945
23  PreProcessModule     1     27  1346  reco TriggerResults TriggerResultInserter 3922.612 2856.8945
24 PostProcessModule     1     27  1346  reco TriggerResults TriggerResultInserter 3922.612 2856.8945
25  PreProcessModule     1     27  1347  reco         reco3d      SpacePointSolver 2291.192 1227.6818
26 PostProcessModule     1     27  1347  reco         reco3d      SpacePointSolver 2431.361 1366.0365
27  PreProcessModule     1     27  1347  reco TriggerResults TriggerResultInserter 2431.361 1366.0365
28 PostProcessModule     1     27  1347  reco TriggerResults TriggerResultInserter 2431.361 1366.0365
29  PreProcessModule     1     27  1348  reco         reco3d      SpacePointSolver 2338.619 1275.1094
30 PostProcessModule     1     27  1348  reco         reco3d      SpacePointSolver 2400.309 1333.7887
31  PreProcessModule     1     27  1348  reco TriggerResults TriggerResultInserter 2400.309 1333.7887
32 PostProcessModule     1     27  1348  reco TriggerResults TriggerResultInserter 2400.309 1333.7887
33  PreProcessModule     1     27  1349  reco         reco3d      SpacePointSolver 2333.266 1269.8870
34 PostProcessModule     1     27  1349  reco         reco3d      SpacePointSolver 2333.807 1270.2966
35  PreProcessModule     1     27  1349  reco TriggerResults TriggerResultInserter 2333.807 1270.2966
36 PostProcessModule     1     27  1349  reco TriggerResults TriggerResultInserter 2333.807 1270.2966
37  PreProcessModule     1     27  1350  reco         reco3d      SpacePointSolver 2333.807 1270.2966
38 PostProcessModule     1     27  1350  reco         reco3d      SpacePointSolver 2400.985 1335.3124
39  PreProcessModule     1     27  1350  reco TriggerResults TriggerResultInserter 2400.985 1335.3124
40 PostProcessModule     1     27  1350  reco TriggerResults TriggerResultInserter 2400.985 1335.3124

It seems the memory increase seen before was associated with RootOutput. However, reco3d sometimes does use a lot of memory (e.g. event 1346).
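The per-event cost of reco3d can be read off the table above by subtracting the PreProcessModule RSS from the PostProcessModule RSS for each event. The following Python sketch does that with the values copied from the table; the names `reco3d_rss` and `rss_growth` are made up for this illustration, and the units are those of the table (presumably MB).

```python
# Event -> (RSS before reco3d, RSS after reco3d), copied from the
# --no-output table above.
reco3d_rss = {
    1341: (621.7441, 1259.0203),
    1342: (799.8628, 1756.4959),
    1343: (1090.8099, 1097.3594),
    1344: (1097.3594, 1198.1169),
    1345: (1090.9409, 1387.4422),
    1346: (1128.6241, 2856.8945),
    1347: (1227.6818, 1366.0365),
    1348: (1275.1094, 1333.7887),
    1349: (1269.8870, 1270.2966),
    1350: (1270.2966, 1335.3124),
}


def rss_growth(pre_post):
    """Return event -> (post - pre) RSS growth (hypothetical helper)."""
    return {event: post - pre for event, (pre, post) in pre_post.items()}


growth = rss_growth(reco3d_rss)
worst = max(growth, key=growth.get)

# Print events from largest to smallest growth.
for event, delta in sorted(growth.items(), key=lambda kv: -kv[1]):
    print(f"event {event}: +{delta:.1f}")
print(f"largest growth: event {worst}")
```

Event 1346 stands out with a growth of roughly 1728, while most other events stay below 300. The table also shows RSS dropping between the end of one event and the start of the next (for example after events 1342 and 1346), which suggests that in this run the memory is given back rather than accumulated.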

#2 Updated by Lynn Garren about 3 years ago

  • Status changed from New to Assigned
  • Assignee set to Christopher Backhouse

Chris, would you take a look at this? Let us know if you need help.


