Bug #14508

Simple flux based GENIE jobs cannot run on the FNAL grid system

Added by Dominic Brailsford about 3 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Category: Simulation
Target version:
Start date: 11/15/2016
Due date:
% Done: 0%
Estimated time:
First Occurred:
Occurs In:
Duration:

Description

The flux files for simple flux GENIE jobs are stored on bluearc: /sbnd/data/flux

A simple flux GENIE job interrogates this directory, picks a random subset of the flux files and copies them to the job location.
Unfortunately, I don't think the /sbnd/data directory is mounted on the grid nodes, which means a simple flux grid job cannot access the flux files it needs.
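
For illustration, the selection step described above (match the flux files, then pick a random subset) can be sketched in Python. This is a minimal sketch and not the GENIE implementation: the function name pick_flux_subset, the file names, and the subset size are all hypothetical, and the copy to the job location is left out.

```python
import fnmatch
import random


def pick_flux_subset(file_names, pattern, n_files, seed=None):
    """Return a random subset of the names that match the glob pattern.

    Hypothetical helper for illustration only.
    """
    matching = sorted(fnmatch.filter(file_names, pattern))
    rng = random.Random(seed)
    # Never ask for more files than are available.
    return rng.sample(matching, min(n_files, len(matching)))


# Invented directory listing: ten flux files plus one unrelated file.
names = [f"gsimple_example_{i:04d}.root" for i in range(10)] + ["README.txt"]
subset = pick_flux_subset(names, "gsimple_*.root", 3, seed=1)
print(subset)
```

The point of the sketch is that this step needs a directory listing, so it fails as soon as the directory is not visible from the worker node.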

I think that the only way to solve this issue is to move/copy the flux files to dCache (I guess persistent storage would be best).

History

#1 Updated by Dominic Brailsford about 3 years ago

As a check, I have copied 100 of the flux files to dCache:
/pnfs/sbnd/persistent/flux

and redirected the GENIE fcl file to use the above directory in the flux search path.

A job submitted to the batch using these changes finishes successfully.

#2 Updated by Gianluca Petrillo about 3 years ago

I would say: go for it.
Can you name the directory so that it's clear which flux we are talking about? Maybe BNB_FluxFiles (just to follow the convention established by Roxanne [1]), plus a subdirectory like v1 or 20141015, just in case we end up with multiple versions.
Also, please add a README file with information about where these files come from (actually, that should happen also in the original directory... who produced those files?).
In short: since we can't use revision control, we have to put together something similar, so that in the future our memory will be supported.

[1] That Cosmics_FluxFiles directory should actually be superseded, since we have a shared directory /pnfs/larsoft/persistent/physics/cosmics for that. But BNB flux is SBND specific, so it's different.

#3 Updated by Dominic Brailsford about 3 years ago

Gianluca Petrillo wrote:

I would say: go for it.
Can you name the directory so that it's clear which flux we are talking about? Maybe BNB_FluxFiles (just to follow the convention established by Roxanne [1]), plus a subdirectory like v1 or 20141015, just in case we end up with multiple versions.

I absolutely can. The only reason I chose 'flux' this time around was because that is the name currently used in /sbnd/data which I agree is uninformative.

Also, please add a README file with information about where these files come from (actually, that should happen also in the original directory... who produced those files?).

I raised this issue in a sim meeting not too long ago. I think that information has been lost to time. This is deviating from the issue at hand a bit but I think that in future flux iterations, all flux xml and request files should be committed to sbndutil (similar to how the mcc scripts will be). I can extract config information from the current flux files but I think that is all we have to go on in terms of where the files come from.

#4 Updated by Dominic Brailsford almost 3 years ago

  • Category changed from Environment to Simulation
  • Status changed from New to Assigned
  • Assignee set to Dominic Brailsford

I'm going to start tackling this today.

I've done some thinking about how to arrange the flux files and I think we could separate the files that we have according to the following categories:
  1. Which BNB files were used to make our flux files. We currently only have april07 files available.
  2. Which gsimple version was used to make our gsimple flux files from the BNB files. For example gsimple/v2_8_6d.
  3. The BNB running mode e.g. neutrino running
  4. The configuration used for the beam e.g. ConfigA (I'll explain this in more detail below)

So, for example, a hypothetical path to some flux files would look like this:

/pnfs/sbnd/persistent/BNB_FLuxFiles/april07/gsimple/v2_8_6d/neutrinoRunning/configA/MYFLUXFILES.root

The configuration (e.g. configA) encapsulates all of the stuff that goes into the xml file used to produce the gsimple files. There are way too many details that go into the xml files to have them all listed in the directory path so we should label each configuration with a unique identifier and have that documented somewhere. A possible approach is to have the xml file stored in the sbndutil repository (along with the request files) and also have a more qualitative description of each configuration on the sbndcode wiki.

One point to note is that we are going to end up with flux files with different baselines (100m and 110m) and the baseline features directly in the xml files used to generate the gsimple files. If we end up with a lot of configurations which use each baseline then I think we should specify the baseline separately. However, I imagine that besides the 100m flux files we already have, we probably won't generate any more, so it's not worth specifying the baseline separately in the flux directory paths.
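
The directory layout proposed in this comment can be sketched as a small path builder. This is only an illustration under stated assumptions: the function flux_dir and its argument names are hypothetical, and the component values are the ones from the example path above.

```python
from pathlib import PurePosixPath


def flux_dir(base, bnb_input, gsimple_version, running_mode, config):
    """Build a flux directory path from the four proposed categories.

    Hypothetical helper; the fixed "gsimple" component mirrors the example path.
    """
    parts = [bnb_input, "gsimple", gsimple_version, running_mode, config]
    for part in parts:
        # Each category must be a single, non-empty path component.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(base).joinpath(*parts)


path = flux_dir(
    "/pnfs/sbnd/persistent/BNB_FLuxFiles",
    "april07",
    "v2_8_6d",
    "neutrinoRunning",
    "configA",
)
print(path)
```

Keeping each category as its own component means a new gsimple version or configuration lands in a sibling directory and never inside an existing set of flux files.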

#5 Updated by Gianluca Petrillo almost 3 years ago

I like the plan. Of course, I being I, I will comment on little details, for your consideration:

  • unless we plan to have different gsimple versions produced at the same time, I feel the version of gsimple can be demoted to the documentation
  • I personally would have FluxFiles/BNB (and FluxFiles/NUMI) rather than BNB_FluxFiles (for Corsika too, but we might just use the ones in /pnfs/larsoft)
  • I strongly recommend that, in addition to the documentation you already outlined, the XML configuration files you mention be stored in the same directories as their output, too
  • the config label should include an informal "version" number (either a sbndutil tag, or a simple _v1, _v2) so that it can be distinguished from updates; or we can turn the "april07" into a more precise 20160407 and use that to identify the sbndutil tag
  • if there is some exceptional setting (like 100 metres baseline), that might be captured in the configuration name

One important note: ifdh ls, used to retrieve flux files, is apparently recursive. That means that we can't have directories with both flux files and subdirectories with incompatible flux files, because when we ask for the files in the former, the ones in the latter would also be included. Thanks to MicroBooNE for finding this (the hard way).

#6 Updated by Dominic Brailsford almost 3 years ago

Thanks for the feedback Gianluca. As a thank you here is a wall of text.

Gianluca Petrillo wrote:

  • unless we plan to have different gsimple versions produced at the same time, I feel the version of gsimple can be demoted to the documentation

My thinking here was at some point in the future we will settle on a beam configuration but will periodically want to reproduce the gsimple files due to updates to gsimple (bug fixes etc). If we specify the gsimple version in the directory path, then we can easily install the updated flux files on pnfs without impacting the old ones. My concern here is if someone is using the older flux files at the point we want to store the newer flux files. Without specifying the gsimple version, I'd imagine we would replace the old files with the new ones which could cause issues for said users.

  • I personally would have FluxFiles/BNB (and FluxFiles/NUMI) rather than BNB_FluxFiles (for Corsika too, but we might just use the ones in /pnfs/larsoft)

Very much agreed.

  • I strongly recommend that, in addition to the documentation you already outlined, the XML configuration files you mention be stored in the same directories as their output, too

Normally I'd strongly agree with this but it is a little bit at odds with how I was imagining we store the XML.
My thinking was that we have exactly one XML file that stores all of the beam configurations, so that one configuration can inherit from another. For example, our nominal flux uses a 100m baseline and we will want to update that to 110m without changing any other parameters. In the XML file, the second configuration can inherit from the first and change only the baseline distance.
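
The inheritance idea can be sketched with plain dictionaries standing in for the XML. This is a sketch under assumptions, not the actual flux XML schema: the configuration names, the parameter names (baseline_m, running_mode), and the "inherits" key are all hypothetical; only the 100m and 110m baselines come from this thread.

```python
def resolve(configs, name, _seen=()):
    """Return the full parameter set for a configuration, following "inherits"."""
    if name in _seen:
        raise ValueError(f"circular inheritance at {name!r}")
    entry = dict(configs[name])  # copy, so the stored definition is untouched
    parent = entry.pop("inherits", None)
    if parent is None:
        return entry
    merged = resolve(configs, parent, _seen + (name,))
    merged.update(entry)  # the child's settings override the parent's
    return merged


# Hypothetical configurations: the second changes only the baseline.
configs = {
    "nominal": {"baseline_m": 100, "running_mode": "neutrino"},
    "nominal_110m": {"inherits": "nominal", "baseline_m": 110},
}
resolved = resolve(configs, "nominal_110m")
print(resolved)
```

Only the overridden parameter is written down for the second configuration, which is what makes a single shared file attractive.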

Actually... we could still store something in the output area. After producing a new set of flux files, we could copy the XML file as it was to the output area and never touch that copy again. It will obviously diverge from the contents of the XML stored in sbndutil but that isn't necessarily a bad thing.

  • the config label should include an informal "version" number (either a sbndutil tag, or a simple _v1, _v2) so that it can be distinguished from updates; or we can turn the "april07" into a more precise 20160407 and use that to identify the sbndutil tag

What do you mean by 'updates' here? Updates to the input flux files or updates to the configuration/XML file itself? If the former, then this would be reflected by changing 'april07' to whatever label the new input flux files have. If you are referring to the latter then I was planning on that being named a different configuration i.e. ConfigA -> ConfigB rather than ConfigA_v1 -> ConfigA_v2

  • if there is some exceptional setting (like 100 metres baseline), that might be captured in the configuration name

I agree, though it does ruin file naming symmetry a little bit.

One important note: ifdh ls, used to retrieve flux files, is apparently recursive. That means that we can't have directories with both flux files and subdirectories with incompatible flux files, because when we ask for the files in the former, the ones in the latter would also be included. Thanks to MicroBooNE for finding this (the hard way).

If I'm understanding you correctly, I think we can partially avoid this issue by having sensible file naming as GENIEHelper requires both a flux file path and a flux file pattern:

 physics.producers.generator.FluxSearchPaths: "/pnfs/sbnd/persistent/flux" 
 physics.producers.generator.FluxFiles: [ "gsimple_microboone-100-onaxis_numintp_*.root" ]

Provided the file names for the incompatible flux files look nothing like the compatible ones and we suitably ask for the compatible names then we should be OK right?
Also, the storage scheme outlined above would mean there are no subdirectories in a directory which contains gsimple flux files e.g. this wouldn't happen:

/pnfs/sbnd/persistent/BNB_FLuxFiles/april07/gsimple/v2_8_6d/neutrinoRunning/configA/MYFLUXFILES.root
/pnfs/sbnd/persistent/BNB_FLuxFiles/april07/gsimple/v2_8_6d/neutrinoRunning/configA/incompatible_flux_files/MYINCOMPATIBLEFLUXFILES.root
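
The interaction between a recursive listing and a file name pattern can be illustrated with a small, self-contained sketch. It does not call ifdh: os.walk stands in for the recursive listing as an assumption, fnmatch stands in for the flux file pattern, and all file and directory names are invented.

```python
import fnmatch
import os
import tempfile


def list_recursive(top):
    """Return every file path under top, subdirectories included."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            found.append(os.path.join(dirpath, name))
    return sorted(found)


def select_by_pattern(paths, pattern):
    """Keep only the paths whose file name matches the glob pattern."""
    return [p for p in paths if fnmatch.fnmatch(os.path.basename(p), pattern)]


with tempfile.TemporaryDirectory() as top:
    sub = os.path.join(top, "incompatible_flux_files")
    os.makedirs(sub)
    for path in (
        os.path.join(top, "gsimple_good_0001.root"),
        os.path.join(sub, "other_flux_0001.root"),     # name differs: filtered out
        os.path.join(sub, "gsimple_clash_0001.root"),  # name matches: leaks through
    ):
        open(path, "w").close()

    listing = list_recursive(top)
    everything = [os.path.relpath(p, top) for p in listing]
    selected = [
        os.path.relpath(p, top) for p in select_by_pattern(listing, "gsimple_*.root")
    ]

print(everything)
print(selected)
```

In this sketch the pattern protects against the subdirectory only while the incompatible files are named differently, which is why keeping flux directories free of subdirectories is the safer half of the plan.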

#7 Updated by Dominic Brailsford almost 3 years ago

Done.

Very rough wiki documentation can be found here:
https://cdcvs.fnal.gov/redmine/projects/sbndcode/wiki/The_SBND_flux_files

The actual flux files can now be found here:

/pnfs/sbnd/persistent/fluxFiles/bnb/gsimple/unknown/configA-100m-v1/april07/neutrinoMode

and brand new shiny flux files with a 110m baseline can be found here:

/pnfs/sbnd/persistent/fluxFiles/bnb/gsimple/v2_8_6d/configB-v1/april07/neutrinoMode

I'll leave the bug open until we decide which set of flux files should be used by default i.e. what the fcl files get updated with. I imagine we will know after the next sim/soft meeting.

#8 Updated by Dominic Brailsford over 2 years ago

  • Status changed from Assigned to Closed

The production-level fcl files now use the 110m baseline gsimple files (configB).
I think we are probably done here so I'm closing this issue.


