Flux File Handling

The following controls come into play when the FileType is one of ntuple (via GNuMIFlux), simple_flux (via GSimpleNtpFlux) or dk2nu (via GDk2NuFlux). Each of these drivers requires a large number of TTree entries, which are generally spread over a set of files.

Basic principles

The general strategy is to take a list of paths and a list of "patterns" (wildcarded globs that might include some sub-directory paths), combine them, and expand the result to form a list of potential files to use for flux information.
Each job then randomizes that list and selects a subset representing no more than 2 GB of input (user adjustable), from which it reads.
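
The expansion step described above can be sketched roughly as follows. This is a minimal Python illustration of the idea, not the GENIEHelper code: the function name candidate_files is hypothetical, and it only models local globbing (the direct-access style), not the IFDH lookup.

```python
import glob
import os


def candidate_files(search_paths, patterns):
    """Combine each search path with each pattern and expand the globs.

    search_paths: colon-separated string, may contain ${VAR} references.
    patterns:     list of glob patterns, possibly with sub-directory parts.
    Returns existing files, without duplicates, in a stable order.
    """
    found = []
    seen = set()
    for base in search_paths.split(":"):
        base = os.path.expandvars(base)  # expand ${VAR} from the environment
        for pattern in patterns:
            for match in sorted(glob.glob(os.path.join(base, pattern))):
                if os.path.isfile(match) and match not in seen:
                    seen.add(match)
                    found.append(match)
    return found
```

The resulting list is what would then be randomized and trimmed to the size limit.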

Depending on the search path environment sent to the job, the selected files could be read directly from BlueArc disk or from files in CVMFS. NOvA's prior experience was that either of these worked well with no apparent I/O problems: read rates are limited because ROOT buffers internally and subsequent steps in the job (e.g. Geant4) take long enough that the job is not I/O bound. There are concerns that off-site squid caches in the CVMFS system might be inadequately sized and that file thrashing could occur. There have also been issues with "publishing" larger file sets into the CVMFS catalogue. These concerns have led to an alternative implementation for file handling that makes local copies of the files (from BlueArc or PNFS).

Direct access vs. local copies

Whether files are read directly (BlueArc or CVMFS) or a local copy is made is determined by the FluxCopyMethod fcl parameter. Use "DIRECT" to continue with the old approach. Use "IFDH" to make local copies. If the setting is neither of these, the IFDH interface is used, but the default method of fetching files is overridden with the given string.
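
As a rough illustration of the three cases, the setting could be interpreted as in the Python sketch below. The function name and the returned tuple are hypothetical, and exact, case-sensitive string matching is an assumption; the real parsing inside GENIEHelper may differ.

```python
def interpret_copy_method(value):
    """Return (use_ifdh, fetch_method_override) for a FluxCopyMethod value.

    Assumption: values are compared exactly and case-sensitively.
    """
    if value == "DIRECT":
        return (False, None)   # read files in place, no local copy
    if value == "IFDH":
        return (True, None)    # local copy using the default fetch method
    # Anything else: still IFDH, but the string names the fetch method to use.
    return (True, value)
```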

Search paths

The FluxSearchPaths fcl parameter is a colon-separated list of paths to search. The parameter can include environment variables, which will be expanded. With IFDH it is important to make each path relatively specific, because a list of all files 10 directories deep will be generated for each given path. Thus /nova/data/flux/gsimple would be acceptable, but /nova/data is problematic. With DIRECT access, if this parameter is not set (i.e. a blank string), then for backward compatibility it falls back to the environment variable ${FW_SEARCH_PATH}.
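
The expansion and fallback rules can be sketched as below. This is an illustrative Python fragment under stated assumptions: resolve_search_paths is a hypothetical name, the environment is passed in as a dictionary so the behavior is easy to inspect, and the special meaning of a trailing colon for DIRECT access is not modeled (empty components are simply dropped).

```python
import os
from string import Template


def resolve_search_paths(flux_search_paths, copy_method, env=None):
    """Split FluxSearchPaths into expanded components.

    For "DIRECT" with a blank setting, fall back to FW_SEARCH_PATH.
    """
    env = os.environ if env is None else env
    raw = flux_search_paths
    if copy_method == "DIRECT" and raw.strip() == "":
        raw = env.get("FW_SEARCH_PATH", "")  # backward-compatible fallback
    # safe_substitute leaves unknown ${VAR} references untouched
    expanded = Template(raw).safe_substitute(env)
    # Assumption: empty components are ignored in this sketch
    return [p for p in expanded.split(":") if p]
```

For example, with NOVA_FLUX_VERSION set to nova_v07, the string "/nova/data/flux/gsimple/${NOVA_FLUX_VERSION}:/alternative/path" yields two components, the first ending in nova_v07.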

When using the IFDH interface no special prefix should be necessary; just specify normal BlueArc or PNFS paths.

Important note:

If, while using "DIRECT", you wish to give the full file path in the FluxFiles parameter, then the FluxSearchPaths list should contain a trailing colon (:) in addition to some path specification; a colon by itself might lead to the same files being duplicated in the list.

There is currently no way to specify a full path in FluxFiles when using the IFDH interface (with or without wildcards). The file specification must be broken into an indexable list of paths and sub-components.

Limits of selected files

The MaxFluxFileMB parameter controls the size of the data (given in MB) to be read by the driver. Once the complete list of files matching the specification is made, it is randomized and files are added to the selected list until the next file would force the size limit to be exceeded. In all cases at least one file will be selected.
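
A minimal Python sketch of this selection rule follows. The function name select_files and the (name, size) input format are hypothetical; the point is only to show the shuffle, the stop-before-exceeding rule, and the guarantee of at least one file.

```python
import random


def select_files(files_with_sizes_mb, max_mb, rng=None):
    """Shuffle the candidates and keep adding files until the next one
    would push the total over max_mb. The first file is always kept."""
    if rng is None:
        rng = random.Random()
    shuffled = list(files_with_sizes_mb)
    rng.shuffle(shuffled)

    selected = []
    total_mb = 0.0
    for name, size_mb in shuffled:
        if selected and total_mb + size_mb > max_mb:
            break  # the next file would exceed the limit, so stop here
        selected.append(name)
        total_mb += size_mb
    return selected, total_mb
```

Note that a single file larger than the limit is still selected, so the limit is a target rather than a hard cap in that case.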

Flux file specifics

The FluxFiles parameter is a list of files, which may include subdirectory paths and wildcards. These, in conjunction with the components of FluxSearchPaths, should identify the desired files.

Local copies and end of job cleanup

In the IFDH case local copies are made. Where they end up depends on a variety of environment variables according to the priority cascade used internally in the ifdh library:

if set ${IFDH_DATA_DIR} use ${IFDH_DATA_DIR}
else if set ${_CONDOR_SCRATCH_DIR} use ${_CONDOR_SCRATCH_DIR}/ifdh_<uid>_<pgrp>
else if set ${TMPDIR} use ${TMPDIR}/ifdh_<uid>_<pgrp>
else use /var/tmp/ifdh_<uid>_<pgrp>
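
The cascade above can be written out as a small Python function for checking where copies would land in a given environment. This is a sketch, not the ifdh library's code: ifdh_local_dir is a hypothetical name, and treating an empty variable the same as an unset one is an assumption.

```python
import os


def ifdh_local_dir(env=None, uid=None, pgrp=None):
    """Mimic the documented priority cascade for the local copy directory."""
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid
    pgrp = os.getpgrp() if pgrp is None else pgrp
    subdir = f"ifdh_{uid}_{pgrp}"

    # Assumption: an empty value is treated like an unset variable.
    if env.get("IFDH_DATA_DIR"):
        return env["IFDH_DATA_DIR"]          # used as-is, no ifdh_ suffix
    if env.get("_CONDOR_SCRATCH_DIR"):
        return os.path.join(env["_CONDOR_SCRATCH_DIR"], subdir)
    if env.get("TMPDIR"):
        return os.path.join(env["TMPDIR"], subdir)
    return os.path.join("/var/tmp", subdir)
```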

When the job is complete and the GENIEHelper destructor is called, it will attempt to clean up files based on the FluxCleanup fcl parameter. This can take the values "NEVER", "ALWAYS", or "/var/tmp". The first leaves everything in place; the second will try to rm -rf the directory used for the copy; and the last will remove only the individual flux files that this job moved to /var/tmp.
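
The three policies can be summarized by the following Python sketch, which only plans the actions and deletes nothing. The name plan_cleanup, the action tuples, and the error on an unrecognized value are all hypothetical choices made for illustration; how GENIEHelper treats other values is not stated here.

```python
def plan_cleanup(policy, copy_dir, copied_files):
    """Return the list of removal actions implied by a FluxCleanup value.

    Nothing is deleted; each action is ("rmtree", dir) or ("remove", file).
    """
    if policy == "NEVER":
        return []
    if policy == "ALWAYS":
        # Corresponds to rm -rf on the whole copy directory.
        return [("rmtree", copy_dir)]
    if policy == "/var/tmp":
        # Only this job's own files that landed under /var/tmp.
        return [("remove", f) for f in copied_files
                if f.startswith("/var/tmp/")]
    # Assumption: unknown values are rejected in this sketch.
    raise ValueError(f"unrecognized FluxCleanup value: {policy!r}")
```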

The system-wide automatic cleanup policy varies by location. The ${_CONDOR_SCRATCH_DIR} area will get removed with the condor job, so there is no need to do anything. Who sets the ${IFDH_DATA_DIR} environment variable, and what the cleanup policy for it is, are left to the experiments. Unfortunately GENIEHelper cannot directly query which of these areas was used, so be wary of combining ${IFDH_DATA_DIR} with "ALWAYS" unless you are very sure of the policy for sharing that area between jobs.

Important note:

Interactive jobs are likely to default to using /var/tmp. This space on interactive nodes is large-ish, but not huge; often on the scale of 7.5 GB. That volume can fill up if more than 3 jobs are run simultaneously with a 2 GB limit. Additionally, since cleanup happens during GENIEHelper destruction, an early termination of the job might cause the files to be orphaned and left behind.
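
The "more than 3 jobs" figure is simple division, shown here as a tiny Python helper (the function name is hypothetical). It assumes each job uses its full limit and that nothing else occupies the volume, so in practice the safe number may be lower.

```python
def max_concurrent_jobs(volume_mb, per_job_limit_mb):
    """Number of jobs whose flux copies fit on the volume at the same time."""
    return int(volume_mb // per_job_limit_mb)


# 7.5 GB volume with a 2 GB per-job limit: 7500 // 2000 = 3 jobs fit,
# and a fourth would overflow the volume.
jobs = max_concurrent_jobs(7500, 2000)
```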

Examples

physics.producers.generator.FluxSearchPaths: "/nova/data/flux/gsimple/${NOVA_FLUX_VERSION}:/alternative/path" 
physics.producers.generator.FluxFiles: [ "ndrock/mn/fhc/gsimple_NOvA-NDRock*.root" ]
physics.producers.generator.FluxCopyMethod:  "IFDH"  # or "DIRECT" 
physics.producers.generator.MaxFluxFileMB:  2000     # 2 GB limit per job

This assumes that the environment variable ${NOVA_FLUX_VERSION} is set to something like nova_v07.

Miscellaneous

IFDHC = Intensity Frontier Data Handling Client tools

Some wiki pages referencing the ifdh library:

https://cdcvs.fnal.gov/redmine/projects/ifdhc/wiki/
https://cdcvs.fnal.gov/redmine/projects/ifdhc/wiki/Shared_File_Access
https://cdcvs.fnal.gov/redmine/projects/ifdhc/wiki/C++_Usage
https://cdcvs.fnal.gov/redmine/projects/ifdhc/wiki/Ifdh_commands
https://cdcvs.fnal.gov/redmine/projects/ifdhc/wiki/V1_7_1

If the environment variable ${IFDH_DEBUG_LEVEL} is set to a value from "1" through "9", this instance of ifdh will emit additional, possibly enormous, amounts of information.