Support #25349

Memory leak in art::Ptr::Get()

Added by Thomas Junk 2 months ago. Updated 2 months ago.

Status: Assigned
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 12/23/2020
Due date:
% Done: 0%
Estimated time:
Scope: Internal
Experiment: -
SSI Package:
Duration:

Description

After chasing a memory leak in a ProtoDUNE analysis ntupling job, I narrowed it down to a function that looks through associations and picks one. It loops over particles, calling getValidHandle, FindManyP, and art::Ptr::get() many times in the same event. I constructed a gallery example that illustrates the problem: it reads in 15 events and uses 5 GB of memory. Adjust nevents to read in more events and use more memory. The gallery example allows quick experimentation. You'll need access to DUNE's data using xrootd -- let me know if you would like a copy of the input file deposited somewhere. I tested it with larsoft v9_10_02.

gtest.C (1.96 KB) Thomas Junk, 12/23/2020 10:26 AM
Screen Shot 2020-12-23 at 3.27.25 PM.png (328 KB) Kyle Knoepfel, 12/23/2020 03:43 PM

History

#1 Updated by Kyle Knoepfel 2 months ago

  • Status changed from New to Assigned

#2 Updated by Thomas Junk 2 months ago

Yes, you are right that repeated art::Ptr::get() calls on the same collection do not seem to use more memory, but as more events are read in, memory usage keeps going up. So the comment in the code about get() always rereading the same data is incorrect.
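
One way to check the per-event behaviour described above is to sample the process's peak resident set size after each event. The sketch below is only an illustration under stated assumptions: `readEvent` is a hypothetical stand-in for the body of the gallery event loop, and `getrusage` is the POSIX call (its `ru_maxrss` field is reported in kilobytes on Linux and in bytes on macOS). Because this is a peak value it never decreases, so a flat series suggests memory is being reused, while a steady climb from event to event suggests growth.

```cpp
#include <sys/resource.h>  // getrusage (POSIX)
#include <cstddef>
#include <vector>

// Peak resident set size of this process so far, or -1 on failure.
// Units are platform dependent: kilobytes on Linux, bytes on macOS.
long peakRss()
{
  rusage usage{};
  if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
  return usage.ru_maxrss;
}

// Hypothetical stand-in for reading one event's data products.
std::vector<double> readEvent(std::size_t nValues)
{
  return std::vector<double>(nValues, 1.0);
}

// Process nEvents and record the peak RSS after each one.
std::vector<long> peakRssPerEvent(int nEvents, std::size_t nValues)
{
  std::vector<long> samples;
  for (int i = 0; i < nEvents; ++i) {
    auto const data = readEvent(nValues);  // released at the end of this iteration
    volatile double sink = data.empty() ? 0. : data.back();  // touch the data
    (void)sink;
    samples.push_back(peakRss());
  }
  return samples;
}
```

In a real job the samples would be printed or histogrammed per event; here they are simply returned so the caller can compare consecutive values.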

#3 Updated by Kyle Knoepfel 2 months ago

At the moment, this does not appear to be a memory error directly related to art::Ptr::get. I will leave this issue in the art tracker for now, but I suspect it is a problem with how the I/O rules have been specified for recob::Track in lardataobj. My correspondence with Philippe Canal is below.

Hi Philippe,

I'm fairly certain I've detected a memory leak in an I/O rule, and I'm wondering if you can help me figure it out.

Tom (cc'd) has provided me with a ROOT file that includes a std::vector<recob::Track> object, where the recob::Track class version is 13. When the data product is read from disk, there is significant memory growth--which you would expect when reading something into memory. However, the same memory increase happens when a data product of the same type is read from disk for the next event. See the attached screen shot.

I would not expect the memory to continue to grow significantly after the first event. And this is a single-threaded job with one event in flight at a time.

The associated I/O rule is below--there is no '<version ClassVersion="13"...>' for recob::Track anywhere in the classes_def.xml file.

The relevant source code is here:

https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/RecoBase/Track.h
https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/RecoBase/TrackTrajectory.h
https://github.com/LArSoft/lardataobj/blob/develop/lardataobj/RecoBase/TrackingDicts/classes_def.xml
...

Do you see any obvious problems? Or will it require more complete instructions to investigate?

Sorry for the late support request!

Thanks,
Kyle

Relevant I/O rule for recob::Track

  <!-- class -->
  <class name="recob::Track" ClassVersion="17">
    <version ClassVersion="17" checksum="738708267"/>
    <version ClassVersion="16" checksum="1293628079"/>
    <version ClassVersion="15" checksum="2420564911"/>
    <version ClassVersion="14" checksum="2345363916"/>
  </class>

  <ioread
    version="[-14]" 
    sourceClass="recob::Track" 
    source="std::vector<TVector3> fXYZ; std::vector<TVector3> fDir; std::vector<double> fFitMomentum" 
    targetClass="recob::Track" 
    target="fTraj" 
    include="lardataobj/RecoBase/Track.h;TVector3.h;vector;stdexcept">
    <![CDATA[
      // also uses "larcoreobj/SimpleTypesAndConstants/PhysicalConstants.h" // util::kBogusD

      // trajectory
      if((onfile.fXYZ.size() != onfile.fDir.size()) || onfile.fXYZ.size()<2) {
         std::cerr
         << "\n *** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * " 
         << "\n **** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * " 
         << "\n *** " 
         << "\n ***  ERROR!! ROOT I/O failure. " 
         << "\n ***  Trying to read from file a recob::Track with only " << onfile.fXYZ.size() << " points. " 
         << "\n ***  The data product containing this track is UNUSABLE. " 
         << "\n ***  In case of questions contact cerati@fnal.gov and petrillo@fnal.gov. " 
         << "\n ***  (printed from: lardataobj/RecoBase/classes_def.xml) " 
         << "\n *** " 
         << "\n **** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * " 
         << "\n *** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * " 
         << "\n" << std::endl;
        throw std::runtime_error(
          "Direction and position vector size mismatch during import of Track v. <=14" 
          );
      }

      // prepare vector of positions
      recob::Track::Positions_t pos;
      pos.reserve(onfile.fXYZ.size());
      for (auto const& p: onfile.fXYZ)
        pos.emplace_back(p.X(), p.Y(), p.Z());

      // prepare vector of directions
      recob::Track::Momenta_t mom;
      mom.reserve(onfile.fDir.size());
      for (auto const& d: onfile.fDir)
        mom.emplace_back(d.X(), d.Y(), d.Z());

      // upgrade to momentum if the input track has exactly two momenta,
      // or if it has exactly the number of momenta required
      auto const size = mom.size();
      bool hasMom = false;
      if ((onfile.fFitMomentum.size() == 2)
        && (onfile.fFitMomentum[0] != util::kBogusD)
        && (onfile.fFitMomentum[1]!=util::kBogusD)
        )
      {
        hasMom = true;
        for (unsigned int i = 0; i < (size - 1); ++i)
          mom[i] *= onfile.fFitMomentum[0];
        mom.back() *= onfile.fFitMomentum[1];
      }
      else if (onfile.fFitMomentum.size() == size) {
        hasMom = true;
        for (unsigned int i = 0; i < size; ++i)
          mom[i] *= onfile.fFitMomentum[i];
      }

      // reinitialise the trajectory data member
      fTraj = recob::TrackTrajectory(
        std::move(pos), std::move(mom),
        recob::TrackTrajectory::Flags_t(size),
        hasMom
        );
    ]]>
  </ioread>
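
For readers who want to experiment with the momentum-upgrade step of the rule outside ROOT, it can be sketched as a standalone function. This is an illustration only, not the lardataobj code: `Vec3`, `upgradeToMomentum`, and the local `kBogusD` constant are hypothetical stand-ins for `recob::Track::Momenta_t` elements, the inline rule body, and `util::kBogusD` (whose value is assumed here). The sketch covers only the scaling logic and has nothing to say about where memory is retained.

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;  // stand-in for the direction/momentum vector type

constexpr double kBogusD = -999.;    // assumed placeholder for util::kBogusD

// Scales unit directions into momenta in place.
// Returns true if a momentum magnitude was applied (the rule's hasMom flag).
bool upgradeToMomentum(std::vector<Vec3>& mom, std::vector<double> const& fitMomentum)
{
  auto scale = [](Vec3& v, double f) {
    for (double& c : v) c *= f;
  };
  auto const size = mom.size();

  // Exactly two valid momenta: the first applies to every point but the last,
  // the second applies to the last point.
  if (fitMomentum.size() == 2 && fitMomentum[0] != kBogusD && fitMomentum[1] != kBogusD) {
    for (std::size_t i = 0; i + 1 < size; ++i) scale(mom[i], fitMomentum[0]);
    if (!mom.empty()) scale(mom.back(), fitMomentum[1]);
    return true;
  }

  // One momentum per trajectory point.
  if (fitMomentum.size() == size) {
    for (std::size_t i = 0; i < size; ++i) scale(mom[i], fitMomentum[i]);
    return true;
  }

  // Otherwise the directions are left untouched.
  return false;
}
```

The loop bound is written as `i + 1 < size` so the sketch is safe for an empty vector; the rule itself can use `size - 1` because it has already rejected tracks with fewer than two points.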
