Bug #2695

support for filtered datasets

Added by Andrei Gaponenko over 7 years ago. Updated almost 4 years ago.

Status: Closed
Priority: Low
Assignee:
Category: I/O
Target version:
Start date:
Due date:
% Done: 100%
Estimated time:
Spent time:
Occurs In:
Scope: Internal
Experiment: Mu2e
SSI Package: art
Duration:

Description

A dataset produced by jobs that filter the outputs and write out only
selected events can legitimately include some files with zero events.
Such a dataset should be further analyzable in the framework. This is
not the case now, and an attempt to analyze a filtered dataset

mu2e -S filelist.txt -c ...

breaks with

cet::exception caught in art
---- MismatchedInputFiles BEGIN
RootInputFileSequence::mergeMPR() Branch 'mu2e::SimParticlemv_fvd36__allExtMonFNAL.' is in file '/data/mu2e/extmon/grid/20120502a-all-nominal/good/00005/gvd36.root'
but not in previous files.
Branch 'mu2e::StepPointMCs_fvd36__allExtMonFNAL.' is in file '/data/mu2e/extmon/grid/20120502a-all-nominal/good/00005/gvd36.root'
but not in previous files.
Branch 'mu2e::StepPointMCs_fvd36_extraHits_allExtMonFNAL.' is in file '/data/mu2e/extmon/grid/20120502a-all-nominal/good/00005/gvd36.root'
but not in previous files.
cet::exception caught in EventProcessor and rethrown
---- MismatchedInputFiles END

In this example files 0-4 were empty, and the job broke on the first non-empty file 5.
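
The failure mode can be illustrated with a toy model. The sketch below is not art's code. It assumes, purely for illustration, that the input sequence compares each new file against the branches seen so far, and that a file in which no events survived records no event-level branches. The file names are hypothetical; the two branch names are copied from the log above.

```python
# Toy model only: this is NOT art's RootInputFileSequence implementation.
# Assumption: a file that kept no events also lists no event-level branches.

def strict_merge(files):
    """Use the first file's branch list as the reference and reject any
    later file that introduces a branch not seen in it."""
    seen = set(files[0]["branches"])
    for f in files[1:]:
        extra = sorted(set(f["branches"]) - seen)
        if extra:
            raise RuntimeError(
                f"MismatchedInputFiles: {extra} in '{f['name']}' "
                "but not in previous files"
            )
    return seen


# Hypothetical file names: five empty files followed by one with events.
empty = [{"name": f"file{i}.root", "branches": []} for i in range(5)]
full = {
    "name": "file5.root",
    "branches": [
        "mu2e::SimParticlemv_fvd36__allExtMonFNAL.",
        "mu2e::StepPointMCs_fvd36__allExtMonFNAL.",
    ],
}

try:
    strict_merge(empty + [full])
    outcome = "ok"
except RuntimeError as err:
    outcome = str(err)

# Under this model the same check passes when the non-empty file comes first.
reordered = strict_merge([full] + empty)
```

In this model the check fails on the first non-empty file simply because the empty files set an empty reference, which matches the symptom reported here but says nothing definite about the actual cause inside art.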

Can this please be fixed?

Andrei

filtered_ds_demo.fcl (1.13 KB), Andrei Gaponenko, 10/05/2015 05:41 PM

Related issues

Has duplicate art - Feature #9803: Input files with no events (Rejected, 08/10/2015)

Associated revisions

Revision abcd68b2 (diff)
Added by Kyle Knoepfel about 4 years ago

Relax input-concatenation criteria; implement fix for issue #2695

History

#1 Updated by Marc Paterno about 7 years ago

  • Status changed from New to Accepted
  • Priority changed from Normal to Low

This will be solved when we have reorganized the metadata system.

#2 Updated by Christopher Green about 6 years ago

  • Category set to I/O
  • Status changed from Accepted to Feedback
  • Start date deleted (05/03/2012)
  • Scope set to Internal
  • Experiment Mu2e added
  • SSI Package art added

It appears we may have given a slightly glib answer before. Looking at this issue again, we are not sure why the behavior you describe is occurring. Can you please clarify for us whether the empty files were written by a job with the same configuration as the one with the events? If this is true, we do not understand the failure, and would need to look at some example files which (as a group) suffer from this problem.

Please let us know.

#3 Updated by Andrei Gaponenko about 6 years ago

Yes, the files were written by identically configured jobs (different random numbers). Example files from a year ago would not be readable with the present mu2e code. I don't think I still have them.

One way to reproduce the problem is to create a filter module that accepts or rejects an event based on sub-run number. You already have a test producer. Run a [producer, filter] chain twice with sub-run numbers set to (1) reject, and (2) accept all events. Then try to read the two produced files, in order, in a single job.
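
The shape of that recipe can be sketched in a few lines. This is a plain-Python stand-in, not art modules or FHiCL: `run_chain`, `accepted_subruns`, and the event counts are all hypothetical names and values chosen for illustration.

```python
# Toy stand-in for the [producer, filter] chain; nothing here is an art API.

def run_chain(subrun, accepted_subruns, n_events=3):
    """Produce n_events in one sub-run and keep them only if that
    sub-run is in the accepted set."""
    produced = [{"subrun": subrun, "event": i} for i in range(1, n_events + 1)]
    kept = [e for e in produced if e["subrun"] in accepted_subruns]
    return {"subrun": subrun, "events": kept}


accepted = {2}
first = run_chain(subrun=1, accepted_subruns=accepted)   # every event rejected
second = run_chain(subrun=2, accepted_subruns=accepted)  # every event accepted

# The two outputs are then read in this order by a single job.
inputs = [first, second]
event_counts = [len(f["events"]) for f in inputs]
```

The point of the ordering is that the empty output is read first, which is the situation described in this report.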

Andrei

#4 Updated by Christopher Backhouse about 6 years ago

If this report is accurate it sounds serious.

I imagine we'll be running filters on our FD data to reject cosmics, and keeping only candidate neutrino events. This could easily average fewer than one per subrun.
Then we'd want to do the next step of analysis (more detailed reco etc) over all of these files.

Leaving empty files out of the dataset is not an option, because POT accounting information is in them too.
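
A small worked example shows why the empty files matter for normalization. The numbers and field names are invented for illustration (`pot` is a hypothetical per-file exposure in arbitrary units); only the arithmetic is the point.

```python
# Illustrative numbers only; "pot" is a hypothetical per-file exposure record.
files = [
    {"name": "subrun0.root", "events": 0, "pot": 1},
    {"name": "subrun1.root", "events": 0, "pot": 1},
    {"name": "subrun2.root", "events": 1, "pot": 1},
    {"name": "subrun3.root", "events": 0, "pot": 1},
]

n_events = sum(f["events"] for f in files)

# Exposure summed over every file, including those with zero events.
total_pot = sum(f["pot"] for f in files)

# Exposure summed only over files that still contain events.
nonempty_pot = sum(f["pot"] for f in files if f["events"] > 0)

rate_all = n_events / total_pot          # normalized to the full exposure
rate_nonempty = n_events / nonempty_pot  # overestimates the rate
```

With these made-up inputs, dropping the empty files inflates the event rate per unit exposure by a factor of four.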

#5 Updated by Rob Kutschke about 6 years ago

My two cents. I agree that we must be able to read files that contain no events.

On the other hand, I am really writing this to comment on the use case mentioned by Christopher Backhouse. I recommend against keeping info like POT in data files. I strongly believe that this sort of info belongs in a separate database. For example, what will you do if AD recalibrates an instrument that was involved in the determination of POT and publishes a revised set of POT numbers? Are you going to rewrite every file just to update that information?

#6 Updated by Kyle Knoepfel over 4 years ago

  • Target version set to 521

#7 Updated by Andrei Gaponenko about 4 years ago

Hello,

I hit the same problem again: art cannot handle the case where the
first file in a list of input files has no events. I am attaching a fcl
file "filtered_ds_demo.fcl" that demonstrates the problem. To
reproduce:

$ ssh mu2egpvm01.fnal.gov
$ source /cvmfs/mu2e.opensciencegrid.org/setupmu2e-art.sh
$ source /cvmfs/mu2e.opensciencegrid.org/Offline/v5_4_7/SLF6/prof/Offline/setup.sh
$ mu2e -c filtered_ds_demo.fcl
....
....
%MSG
05-Oct-2015 17:29:57 CDT  Closed file /pnfs/mu2e/phy-sim/sim/mu2e/cd3-beam-g4s1-extmonbeam/0506a/006/434/sim.mu2e.cd3-beam-g4s1-extmonbeam.0506a.001002_00430142.art
%MSG-s ArtException:  PostCloseFile 05-Oct-2015 17:29:57 CDT PostEndRun
cet::exception caught in art
---- MismatchedInputFiles BEGIN
  RootInputFileSequence::initFile() Branch 'mu2e::GenParticles_generate__beamg4s1.' is in file '/pnfs/mu2e/phy-sim/sim/mu2e/cd3-beam-g4s1-extmonbeam/0506a/006/434/sim.mu2e.cd3-beam-g4s1-extmonbeam.0506a.001002_00430142.art'
      but not in previous files.
  Branch 'mu2e::SimParticlemv_extmonBeamFilter__beamg4s1.' is in file '/pnfs/mu2e/phy-sim/sim/mu2e/cd3-beam-g4s1-extmonbeam/0506a/006/434/sim.mu2e.cd3-beam-g4s1-extmonbeam.0506a.001002_00430142.art'
      but not in previous files.
  Branch 'mu2e::StatusG4_g4run__beamg4s1.' is in file '/pnfs/mu2e/phy-sim/sim/mu2e/cd3-beam-g4s1-extmonbeam/0506a/006/434/sim.mu2e.cd3-beam-g4s1-extmonbeam.0506a.001002_00430142.art'
      but not in previous files.
  Branch 'mu2e::StepPointMCs_extmonBeamFilter_extmonbeam_beamg4s1.' is in file '/pnfs/mu2e/phy-sim/sim/mu2e/cd3-beam-g4s1-extmonbeam/0506a/006/434/sim.mu2e.cd3-beam-g4s1-extmonbeam.0506a.001002_00430142.art'
      but not in previous files.
  Branch 'mu2e::StepPointMCs_extmonBeamFilter_virtualdetector_beamg4s1.' is in file '/pnfs/mu2e/phy-sim/sim/mu2e/cd3-beam-g4s1-extmonbeam/0506a/006/434/sim.mu2e.cd3-beam-g4s1-extmonbeam.0506a.001002_00430142.art'
      but not in previous files.
  cet::exception caught in EventProcessor and rethrown
---- MismatchedInputFiles END
%MSG

If you reverse the order of the two input files, the job will run
successfully.

We have datasets with 0.5M files; it is not practical to hand-tune the
order of inputs to different jobs. Please elevate the priority of this issue.

Andrei

#8 Updated by Kyle Knoepfel about 4 years ago

Andrei, is the error emitted during a process that creates and writes more products to a file? Or perhaps does additional event filtering?

#9 Updated by Kyle Knoepfel about 4 years ago

  • Status changed from Feedback to Assigned
  • Assignee set to Kyle Knoepfel
  • Target version changed from 521 to 1.17.00
  • % Done changed from 0 to 80

Coincidentally, the behavior for concatenating input files was investigated in depth over the last few weeks, before Andrei's update of this issue. Based on that analysis, it has been determined that implementing the fix is straightforward. The implementation for this fix is complete, and all art tests pass. What remains is to test the implementation for the specific Mu2e use case Andrei has provided for us.

#10 Updated by Kyle Knoepfel about 4 years ago

#11 Updated by Kyle Knoepfel about 4 years ago

  • Status changed from Assigned to Resolved
  • % Done changed from 80 to 100

In addition to permitting the first input file to have no events, we have relaxed the input-concatenation criteria so that Run and SubRun products that were not present in the first file can be present in subsequent input files. We are working on documentation to explain the requirements for concatenating input files. In general, art is now more flexible, and any jobs that completed successfully before will still do so.

Implemented with commit art:abcd68b.
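
One possible reading of the relaxed criteria, written as a toy model. This is not the code from the commit above, and the exact rules art applies are not stated in this thread. The sketch assumes that (a) a file with no events places no constraint on event-level branches, (b) files that do contain events must agree with each other on event-level branches, and (c) Run and SubRun products simply accumulate as files are opened. All names, including the `SubRunSummary` product, are hypothetical.

```python
# Toy model of one reading of the relaxed rule; NOT art's implementation.

def relaxed_merge(files):
    event_ref = None    # event-level branches, fixed by the first file with events
    run_subrun = set()  # Run/SubRun products may grow as later files are opened
    for f in files:
        run_subrun |= set(f["run_subrun"])
        if f["n_events"] == 0:
            continue  # assumption (a): empty files are always acceptable
        branches = set(f["event_branches"])
        if event_ref is None:
            event_ref = branches
        elif branches != event_ref:
            # assumption (b): non-empty files must still match each other
            raise RuntimeError(f"MismatchedInputFiles in '{f['name']}'")
    return (event_ref or set()), run_subrun


empty = {"name": "a.root", "n_events": 0,
         "event_branches": [], "run_subrun": []}
full = {"name": "b.root", "n_events": 2,
        "event_branches": ["StepPointMCs"], "run_subrun": ["SubRunSummary"]}

forward = relaxed_merge([empty, full])   # empty file first
backward = relaxed_merge([full, empty])  # non-empty file first
```

Under these assumptions the result no longer depends on whether the empty file is read first, which is the property requested in this issue.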

#12 Updated by Kyle Knoepfel almost 4 years ago

  • Status changed from Resolved to Closed

