Bug #13765

RangeSet merging is very slow for concatenated files

Added by Kyle Knoepfel almost 4 years ago. Updated almost 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Target version:
Start date: 09/02/2016
Due date:
% Done: 100%
Estimated time: 24.00 h
Spent time:
Scope: Internal
Experiment: DUNE
SSI Package: art
Duration:

Description

Tingjun from DUNE reports that reading in RangeSets from art/ROOT files produced with version 2.03.00 takes hours when the input file is the result of a concatenation job over 400 input files.

See attached file.

out.root (5.96 MB) Kyle Knoepfel, 09/02/2016 02:47 PM

Associated revisions

Revision 06be67f5 (diff)
Added by Kyle Knoepfel almost 4 years ago

Fix exorbitant time taken to merge/collapse RangeSet information (resolves issue #13765).

The previous implementation accommodated situations that were
not to be encouraged, e.g. the following set of EventRanges:

[1,4)
[1,11)
[4,11)

In the previous implementation, such a set of EventRanges was sorted
(by the weak-ordering criterion) and then collapsed to one
EventRange, namely [1,11).

However, the above situation is ill-defined in the art context since
the collection of events corresponding to a given product/auxiliary
must be unique. In other words, a RangeSet is not meant to contain
duplicate events. It is permissible for two separate RangeSets to
contain duplicate events since they are separate entities. But the
previous implementation was trying to support a situation where a
RangeSet was internally inconsistent.
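
The distinction drawn above can be illustrated with a minimal sketch. This is not the canvas implementation: the Range struct and the functions is_sorted_and_disjoint and collapse are hypothetical names introduced only for illustration. It assumes half-open ranges [begin, end) that are already sorted by their begin value. Under the stricter contract (no event appears twice within one set), collapsing only needs to join ranges that touch, which is a single linear pass; an overlapping set such as [1,4), [1,11), [4,11) is rejected as inconsistent rather than merged.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an event range: the half-open interval [begin, end).
struct Range {
  std::uint32_t begin;
  std::uint32_t end;
};

inline bool operator==(Range const& a, Range const& b)
{
  return a.begin == b.begin && a.end == b.end;
}

// True if the ranges are ordered by 'begin' and no two ranges share an event.
// For valid ranges (begin <= end) it suffices to compare each range with its
// predecessor.
bool is_sorted_and_disjoint(std::vector<Range> const& ranges)
{
  for (std::size_t i = 1; i < ranges.size(); ++i) {
    if (ranges[i].begin < ranges[i - 1].end) {
      return false;
    }
  }
  return true;
}

// Join ranges that touch end-to-begin, e.g. [1,4) followed by [4,11) becomes
// [1,11). Precondition (assumed here, checked in debug builds): the input is
// sorted and disjoint, so one linear pass is enough.
std::vector<Range> collapse(std::vector<Range> const& ranges)
{
  assert(is_sorted_and_disjoint(ranges));
  std::vector<Range> out;
  for (auto const& r : ranges) {
    if (!out.empty() && out.back().end == r.begin) {
      out.back().end = r.end;  // extend the previous range
    } else {
      out.push_back(r);        // gap before this range: start a new one
    }
  }
  return out;
}
```

In this sketch, collapsing [1,4), [4,11), [20,25) yields [1,11), [20,25), while is_sorted_and_disjoint returns false for the overlapping set quoted in the commit message.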

History

#1 Updated by Kyle Knoepfel almost 4 years ago

  • Status changed from New to Assigned
  • Assignee set to Kyle Knoepfel
  • Estimated time set to 24.00 h

We are profiling to discover the bottleneck.

#2 Updated by Kyle Knoepfel almost 4 years ago

  • Status changed from Assigned to Resolved
  • % Done changed from 0 to 100

The bottleneck was an implementation that tried to accommodate a broader concept than the RangeSet should have supported. By clarifying the intent of the RangeSet object, and with some function refactoring, reading in the file now takes on the order of a few seconds (1.8 sec. for us using the profile build, where we do nothing but read in the file).

Implemented with commits canvas:06be67f5 and art:6d43478.

#3 Updated by Kyle Knoepfel almost 4 years ago

  • Status changed from Resolved to Closed
  • Target version set to 2.04.00

