Support #10545

Returning false from EDFilter::beginSubRun() does not skip the events

Added by Christopher Backhouse almost 4 years ago. Updated almost 4 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: Infrastructure
Target version: -
Start date: 10/15/2015
Due date:
% Done: 100%
Estimated time:
Spent time:
Scope: Internal
Experiment: NOvA
SSI Package: art
Duration:

Description

I have a filter module that in some cases needs to inspect individual events, but in other cases can decide that a whole subrun isn't going to meet its requirements. I return false from beginSubRun, but we still have to spend the time to call my filter() function thousands of times. Am I missing a fcl setting somewhere that would enable this behaviour?

Even if it's not possible or desirable to make such a return remove the entire subrun from the output file (see the other issue I just commented on), can it at least automatically reject all the events in the subrun without having to consult the filters?

Same thing goes for runs.

debug_xrootd.png (34 KB), Kyle Knoepfel, 11/16/2015 09:26 AM
profile_xrootd.png (33.8 KB), Kyle Knoepfel, 11/16/2015 09:26 AM
profile_local.png (34.6 KB), Kyle Knoepfel, 11/16/2015 09:26 AM
nova_debug_xrootd_backhouse.map (1.73 MB), original .map file from Chris, Kyle Knoepfel, 11/16/2015 09:30 AM
nova_debug_xrootd.map (1.61 MB), Kyle Knoepfel, 11/16/2015 09:30 AM
nova_profile_xrootd.map (783 KB), Kyle Knoepfel, 11/16/2015 09:31 AM
nova_prof_local.map (711 KB), Kyle Knoepfel, 11/16/2015 09:31 AM

Related issues

Related to art - Feature #4916: Implement SelectRuns and SelectSubRuns (Feedback, 11/06/2013)

Related to art - Support #4895: No way to filter runs/subruns out of output file (Closed, 11/01/2013)

Related to art - Feature #4998: Control whether RootOutput stores any Events or Subruns (Closed, 11/21/2013)

History

#1 Updated by Kyle Knoepfel almost 4 years ago

We will discuss this at the next art triage meeting (next Monday). However, just so you know, the current implementation respects the boolean return value only for events. For runs and subruns, a returned value of false is overridden with true.
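To make the difference concrete, here is a minimal toy model of the two behaviours discussed in this issue. It is a sketch only and does not use art: `ToyEvent`, `ToySubRunFilter`, `runToyLoop` and the `honorSubRunResult` switch are all hypothetical names invented for illustration, and the even-numbered-event test is a placeholder for real per-event work. With `honorSubRunResult = false` the loop discards the begin-subrun result and calls `filter()` for every event, as described above; with `true` it models the requested behaviour. The filter also caches its own subrun decision, which is one way a module can keep `filter()` cheap when the return value is ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an event; this is not an art type.
struct ToyEvent {
  int subRun;
  int number;
};

// Hypothetical filter shaped like an EDFilter. It caches its per-subrun
// decision so that filter() can bail out cheaply for rejected subruns.
class ToySubRunFilter {
public:
  explicit ToySubRunFilter(std::vector<int> goodSubRuns)
    : goodSubRuns_(std::move(goodSubRuns)) {}

  // Decide once per subrun and remember the answer.
  bool beginSubRun(int subRun) {
    subRunAccepted_ = false;
    for (int good : goodSubRuns_) {
      if (good == subRun) subRunAccepted_ = true;
    }
    return subRunAccepted_;
  }

  bool filter(ToyEvent const& e) {
    ++filterCalls_;
    if (!subRunAccepted_) return false;  // cheap early exit
    ++expensiveChecks_;                  // placeholder for real per-event work
    return e.number % 2 == 0;
  }

  std::size_t filterCalls() const { return filterCalls_; }
  std::size_t expensiveChecks() const { return expensiveChecks_; }

private:
  std::vector<int> goodSubRuns_;
  bool subRunAccepted_ = false;
  std::size_t filterCalls_ = 0;
  std::size_t expensiveChecks_ = 0;
};

// Toy event loop. Returns the number of events that pass the filter.
// honorSubRunResult == false: the begin-subrun result is overridden with true.
// honorSubRunResult == true:  events of a rejected subrun skip filter() entirely.
std::size_t runToyLoop(ToySubRunFilter& f,
                       std::vector<ToyEvent> const& events,
                       bool honorSubRunResult) {
  std::size_t passed = 0;
  bool haveSubRun = false;
  int currentSubRun = 0;
  bool subRunKept = true;
  for (ToyEvent const& e : events) {
    if (!haveSubRun || e.subRun != currentSubRun) {
      haveSubRun = true;
      currentSubRun = e.subRun;
      bool const result = f.beginSubRun(currentSubRun);
      subRunKept = honorSubRunResult ? result : true;
    }
    if (!subRunKept) continue;
    if (f.filter(e)) ++passed;
  }
  return passed;
}
```

For two subruns of four events each, with only the second subrun accepted, both modes pass the same two events, but `filter()` is called 8 times in the first mode and 4 times in the second. In both modes the placeholder per-event check runs only 4 times, because of the cached flag.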

#2 Updated by Christopher Green almost 4 years ago

  • Related to Feature #4916: Implement SelectRuns and SelectSubRuns added

#3 Updated by Christopher Green almost 4 years ago

  • Related to Support #4895: No way to filter runs/subruns out of output file added

#4 Updated by Christopher Green almost 4 years ago

  • Related to Feature #4998: Control whether RootOutput stores any Events or Subruns added

#5 Updated by Christopher Green almost 4 years ago

  • Category set to Infrastructure
  • Status changed from New to Feedback
  • SSI Package art added
  • SSI Package deleted ()

There is a design bug here: the semantics of actually acting on a filter result in begin and end subrun (or run) are wide open to interpretation. For example, what does it mean to return false from endSubRun, and does the answer apply to all modules or just this one? The boolean return from these functions has never been used. Ideally, we would change these functions to return void, but that would be a major breaking change to the module interfaces.

Given the distinction between the design bug described above (and the consequences of fixing it by changing the interface) and what you actually appear to want from your description, we will solicit stakeholder feedback on the narrow issue of changing the module interface to remove the Boolean return from functions where it currently has no defined semantics.

If NOvA wishes to request the feature you allude to, we will need a much wider design discussion in order to be sure we are addressing your needs and not excluding reasonable related needs from other experiments.

#6 Updated by Christopher Backhouse almost 4 years ago

I think the semantics of a false return from begin[Sub]Run() are clear enough. I'd be happy to convert end[Sub]Run() to a void return. Can we discuss this at a stakeholders meeting?

#7 Updated by Rob Kutschke almost 4 years ago

I second (or third?) the request to discuss at a stakeholders meeting.

#8 Updated by Kyle Knoepfel almost 4 years ago

[Note: The motivation for this issue was that certain art processes that were intended to filter events from specific subruns were executing in a seemingly time-inefficient way. It was decided that the way to address this issue was to run a profiler on the job in question to determine more precisely where the job was spending its time.]

I have been able to run the jobs that you pointed me to earlier, running over the four pnfs files, using RunEventFilter.fcl as the configuration. I looked at three cases:

1) debug build with pnfs file access via xrootd
2) profile build with pnfs file access via xrootd
3) profile build with local file access

For case 3, I copied the pnfs files to the same directory in which I was running the nova exec. The general break-down is as follows:

   Mode              | Wind-up | Event loop |  Total
===================================================
1) Debug   + xrootd  | 10.5 s  |   33.5 s   |  44 s
2) Profile + xrootd  |  8.1 s  |   11.9 s   |  20 s
3) Profile + local   |  6.8 s  |   11.2 s   |  18 s

The results for mode 1 are quite a bit different than what I obtained reading the map file you produced for me with the debug build. However, the results I get are more in accord with what you originally reported for this job.

The profile modes are clearly superior in terms of efficiency (and memory consumption, see numbers in images below). The most important metric is the time it takes to execute the event loop: a profile build gives you an improvement of roughly a factor of 3 with respect to the debug build.
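As a quick check of that factor, the ratios can be worked out from the table above. The snippet below is only a sketch of the arithmetic; the `Timing` struct and the `total` and `speedup` helpers are hypothetical names introduced for illustration, and the numbers are the ones reported in the table.

```cpp
#include <cassert>

// Timings in seconds, as reported in the table (wind-up and event loop).
struct Timing {
  double windUp;
  double eventLoop;
};

// Total wall time of a job: wind-up plus event loop.
double total(Timing const& t) {
  return t.windUp + t.eventLoop;
}

// Speed-up of `improved` relative to `baseline` (both in seconds).
double speedup(double baseline, double improved) {
  assert(improved > 0.0);
  return baseline / improved;
}

// The three measured configurations.
Timing const debugXrootd{10.5, 33.5};
Timing const profileXrootd{8.1, 11.9};
Timing const profileLocal{6.8, 11.2};
```

With these numbers, `speedup(debugXrootd.eventLoop, profileXrootd.eventLoop)` is about 2.8 and `speedup(debugXrootd.eventLoop, profileLocal.eventLoop)` is about 3.0, while the total job time improves by a factor of 2.2 (44 s to 20 s) for the xrootd case.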

In my tests there was not a large difference between using xrootd and local files, but using xrootd does introduce the need for synchronization: in the map file you sent us, 40% of the event-loop time was spent waiting via pthread_cond_wait. The synchronization percentage was significantly smaller in my test jobs (~15%), but xrootd remains another potential source of time inefficiency.

The following images show the summary of the profiles. I have also attached the .map files in case others would like to analyze them with allinea.

[Image: Debug with xrootd (debug_xrootd.png)]

[Image: Profile with xrootd (profile_xrootd.png)]

[Image: Profile with local file access (profile_local.png)]

#9 Updated by Kyle Knoepfel almost 4 years ago

  • Status changed from Feedback to Closed
  • % Done changed from 0 to 100

