Production Dataset Naming Conventions

We expect to add to the initial samples generated for an analysis campaign as time and computing resources allow. To accommodate this, we define a dataset naming scheme that handles such additions naturally. We will use this convention starting with the second analysis.

For data, we define epoch- or period-specific samples:

prod_DATATIER_RELEASE_DETECTOR_STREAM_TIMESPEC_SUBVERSION_SPECIAL

The STREAM field indicates the trigger stream (e.g. numi or cosmics). The TIMESPEC field indicates the data-taking period or epoch involved, for example period2 or epoch3b. Standard NOvA data-taking periods are documented here. We will also define a TIMESPEC value full, which removes any constraint on the data-taking period. The optional SUBVERSION field distinguishes multiple passes through the processing chain and is inherited by child datasets. If no subversion is specified, the dataset definition should constrain the subversion to '1'. The optional SPECIAL field indicates any special parameters, for example a sample with the calibration shifted according to its systematic uncertainty. It can be as long as required and may even contain additional underscores, but it should reflect any other constraints which give the definition meaning.
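As an illustration of the data convention above, the following minimal Python sketch assembles a dataset name from its fields. The function name `data_dataset_name` and the example values (`reco`, `R1`, `fd`, `v2`) are hypothetical placeholders, not part of any NOvA tool, and the sketch assumes that optional fields are simply omitted from the name when not given.

```python
def data_dataset_name(datatier, release, detector, stream,
                      timespec="full", subversion=None, special=None):
    """Build a data dataset name following
    prod_DATATIER_RELEASE_DETECTOR_STREAM_TIMESPEC_SUBVERSION_SPECIAL.

    Hypothetical helper for illustration only.
    """
    fields = ["prod", datatier, release, detector, stream, timespec]
    # SUBVERSION is optional; assumption: it is left out of the name when absent.
    if subversion is not None:
        fields.append(str(subversion))
    # SPECIAL is optional and may itself contain underscores.
    if special:
        fields.append(special)
    return "_".join(fields)


# Example with placeholder values for DATATIER, RELEASE and DETECTOR.
print(data_dataset_name("reco", "R1", "fd", "numi", timespec="period2"))
print(data_dataset_name("reco", "R1", "fd", "numi", subversion="v2",
                        special="calib_shift_up"))
```

Leaving `timespec` at its default yields the full sample, matching the TIMESPEC value full described above.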

For Monte Carlo, the convention is:

prod_DATATIER_RELEASE_DETECTOR_GEN_FLAVORSET_HORN_TIMESPEC_TIMESTAMP_SUBVERSION_SPECIAL

We replace STREAM with GEN, which indicates the generator (e.g. CRY or genie). For neutrino MC, we also add the FLAVORSET field (most commonly nonswap, fluxswap or tau) and the HORN field, which indicates the horn polarity (fhc, rhc or 0hc). We also add the TIMESTAMP field, which corresponds to the timestamp in the fcl file names and is passed down to all descendant files. This naturally allows us to track multiple generation rounds that produce extra statistics. As with TIMESPEC, we will also define a global set, with this parameter set to full, that includes all statistics.
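The Monte Carlo convention above can be sketched the same way. In this minimal Python example, `mc_dataset_name` and the example values are hypothetical; the sketch assumes that FLAVORSET and HORN are omitted for non-neutrino MC (e.g. CRY) and that optional fields are left out of the name when not given.

```python
def mc_dataset_name(datatier, release, detector, gen,
                    flavorset=None, horn=None,
                    timespec="full", timestamp="full",
                    subversion=None, special=None):
    """Build an MC dataset name following
    prod_DATATIER_RELEASE_DETECTOR_GEN_FLAVORSET_HORN_TIMESPEC_TIMESTAMP_SUBVERSION_SPECIAL.

    Hypothetical helper for illustration only.
    """
    fields = ["prod", datatier, release, detector, gen]
    # FLAVORSET and HORN are added for neutrino MC only
    # (assumption: they are dropped entirely otherwise).
    if flavorset is not None:
        fields.append(flavorset)
    if horn is not None:
        fields.append(horn)
    # TIMESPEC and TIMESTAMP default to "full", i.e. all periods and
    # all generation rounds.
    fields += [timespec, timestamp]
    if subversion is not None:
        fields.append(str(subversion))
    if special:
        fields.append(special)
    return "_".join(fields)


# Neutrino MC with placeholder DATATIER, RELEASE and DETECTOR values.
print(mc_dataset_name("reco", "R1", "fd", "genie",
                      flavorset="nonswap", horn="fhc", timespec="period2"))
# Cosmic MC: no FLAVORSET or HORN under the assumption stated above.
print(mc_dataset_name("reco", "R1", "fd", "cry"))
```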

The Old Convention

Historically, production datasets were named according to this convention:

prod_DATATIER_RELEASE_DETECTOR_FLAVORSET_SPECIAL

For MC, FLAVORSET indicates the generator (GENIE or CRY). For data, it is replaced with the trigger stream. The SPECIAL field can be as long as required and may even contain additional underscores, but it should reflect any other constraints which give the definition meaning.
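Because SPECIAL may contain underscores, a name in the old convention cannot be split blindly on every underscore. The following minimal Python sketch limits the number of splits so that everything after FLAVORSET is kept together as SPECIAL. The function `parse_old_dataset_name` is hypothetical, and the sketch assumes that none of the first four fields contains an underscore.

```python
OLD_FIELDS = ("datatier", "release", "detector", "flavorset", "special")


def parse_old_dataset_name(name):
    """Split prod_DATATIER_RELEASE_DETECTOR_FLAVORSET_SPECIAL into fields.

    Hypothetical helper; assumes only SPECIAL may contain underscores.
    """
    # At most 5 splits: "prod", four fixed fields, then the remainder.
    parts = name.split("_", 5)
    if len(parts) < 5 or parts[0] != "prod":
        raise ValueError(f"not an old-convention dataset name: {name!r}")
    values = parts[1:]
    # SPECIAL is treated as absent when the name has no trailing remainder.
    if len(values) == 4:
        values.append(None)
    return dict(zip(OLD_FIELDS, values))


# Placeholder values; SPECIAL keeps its internal underscores.
print(parse_old_dataset_name("prod_reco_R1_fd_genie_calib_shift_up"))
```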