11/15/2014

add 14db ideal conditions MC datasets for daq/reco/pid/caf stages

# Official Datasets

The vast majority of NOvA's data and MC reside within a data handling system called Sequential Access via Metadata (SAM). The files are stored on a tape-backed disk array called dCache. File locations are tracked in a database with a web interface, accessible through the `samweb` command line client utility. The SAM_web_cookbook provides an introduction to using the system.
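As a rough illustration of how a dataset name from this page might be handed to `samweb`, the sketch below only assembles the command lines; it does not contact SAM. It assumes the `count-files` and `list-files` subcommands and the `defname:` dimension keyword; the helper name `samweb_command` is hypothetical. Check the SAM_web_cookbook for the options available in your installed client.

```python
import shlex

def samweb_command(subcommand, dataset):
    """Build the argv list for a samweb query on a dataset definition."""
    # "defname: <name>" is assumed to select every file in the named definition.
    return ["samweb", subcommand, f"defname: {dataset}"]

dataset = "prod_caf_FA14-10-28_fd_genie_fhc_nonswap"
for sub in ("count-files", "list-files"):
    # shlex.join quotes the dimension string so it can be pasted into a shell.
    print(shlex.join(samweb_command(sub, dataset)))
```

Counting files before listing them is a cheap way to check how large a dataset is before starting anything heavier.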

As a throwback to the old days, some of the data is still stored on bluearc. CAFs are a notable example: the most recent ones are almost always stored on bluearc. There are also small, selected samples of ART files available on bluearc in <some location, documented where?>

**Table of contents**
- Official Datasets
- Monte Carlo
- Latest datasets for detector-like conditions MC
- Latest datasets for ideal conditions (14db) MC
- Latest datasets for regular configurations
- GENIE
- CRY
- DATA

# Monte Carlo

# Latest datasets for detector-like conditions MC

| FarDet | FHC swap | FHC nonswap | RHC swap | RHC nonswap | FHC tau | Cosmics |
|---|---|---|---|---|---|---|
| artdaq | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap | prod_daq_FA14-10-03_fd_genie_fhc_nonswap | | | prod_daq_FA14-10-03_fd_genie_fhc_tau | prod_daq_FA14-10-03_fd_cry_all |
| reco | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap | prod_reco_FA14-10-28_fd_genie_fhc_nonswap | | | prod_reco_FA14-10-28_fd_genie_fhc_tau | prod_reco_FA14-10-28_fd_cry_all |
| pid | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap | prod_pid_FA14-10-28_fd_genie_fhc_nonswap | | | prod_pid_FA14-10-28_fd_genie_fhc_tau | prod_pid_FA14-10-28_fd_cry_all |
| caf | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap | prod_caf_FA14-10-28_fd_genie_fhc_nonswap | | | prod_caf_FA14-10-28_fd_genie_fhc_tau | prod_caf_FA14-10-28_fd_cry_all |

# Latest datasets for ideal conditions (14db) MC

| FarDet | FHC swap | FHC nonswap | RHC swap | RHC nonswap | FHC tau | Cosmics |
|---|---|---|---|---|---|---|
| artdaq | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_14db | prod_daq_FA14-10-03_fd_genie_fhc_nonswap_14db | | | prod_daq_FA14-10-03_fd_genie_fhc_tau_14db | |
| reco | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_reco_FA14-10-28_fd_genie_fhc_nonswap_14db | | | prod_reco_FA14-10-28_fd_genie_fhc_tau_14db | |
| pid | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_pid_FA14-10-28_fd_genie_fhc_nonswap_14db | | | prod_pid_FA14-10-28_fd_genie_fhc_tau_14db | |
| caf | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_caf_FA14-10-28_fd_genie_fhc_nonswap_14db | | | prod_caf_FA14-10-28_fd_genie_fhc_tau_14db | |

# Latest datasets for regular configurations

| FarDet | FHC swap | FHC nonswap | RHC swap | RHC nonswap | Cosmics |
|---|---|---|---|---|---|
| artdaq | prod_artdaq_S14-03-25_fd_genie_fhc_swap | prod_artdaq_S14-03-25_fd_genie_fhc_nonswap | prod_artdaq_S14-03-25_fd_genie_rhc_swap | prod_artdaq_S14-03-25_fd_genie_rhc_nonswap | prodartdaq_S14-02-05_fdcry |
| reco | prodreco_S14-03-25_FDGENIE_fhc_swap | prodreco_S14-03-25_FDGENIE_fhc_nonswap | prodreco_S14-03-25_FDGENIE_rhc_swap | prodreco_S14-03-25_FDGENIE_rhc_nonswap | prodreco_S14-03-25_FDCRY |
| pid | prod_pid_S14-05-12_fd_genie_neutrino2014 | | | | |
| caf | prod_caf_S14-07-11_fd_genie_neutrino2014 | | | | |

| NearDet | FHC | RHC | Cosmics |
|---|---|---|---|
| artdaq | prodartdaq_S14-05-12_ndgenie_fix | - | prodartdaq_S14-05-12_ndcry |
| reco | prodreco_S14-08-01_ndgenie_fix | - | - |
| pid | prodpid_S14-08-19_ndgenie_fix | - | - |
| caf | prodcaf_S14-08-19_ndgenie_fix | - | - |

# GENIE

## ARTDAQ

### FD

Real detector-like conditions MC
- **prod_daq_FA14-10-03_fd_genie_fhc_fluxswap**
- **prod_daq_FA14-10-03_fd_genie_fhc_nonswap**
- **prod_daq_FA14-10-03_fd_genie_fhc_tau**

Ideal conditions 14db MC
- **prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_14db**
- **prod_daq_FA14-10-03_fd_genie_fhc_nonswap_14db**
- **prod_daq_FA14-10-03_fd_genie_fhc_tau_14db**

Nominal

FHC SWAP/NONSWAP
- **prod_artdaq_S14-03-25_fd_genie_fhc_swap** (709)
- **prod_artdaq_S14-03-25_fd_genie_fhc_nonswap** (709)

RHC SWAP/NONSWAP
- **prod_artdaq_S14-03-25_fd_genie_rhc_swap** (709)
- **prod_artdaq_S14-03-25_fd_genie_rhc_nonswap** (709)

### ND

Fixed version of overlay files: **prodartdaq_S14-05-12_ndgenie_fix** (4359)

**prodartdaq_S14-05-12_ndgenie** (20087)

### Special

Nominal geometry
- **prod_artdaq_S14-08-19_fd_genie_fhc_nonswap** (34)
- **prod_artdaq_S14-08-19_fd_genie_fhc_swap** (34)

Jittered geometry
- **prod_artdaq_S14-08-19_fd_fhc_nonswap_geojittered** (37)
- **prod_artdaq_S14-08-19_fd_fhc_swap_geojittered** (39)

CRY+GENIE
- **prod_artdaq_S14-08-19_fd_fhc_nonswap_crygenie** (510)
- **prod_artdaq_S14-08-19_fd_fhc_swap_crygenie** (511)

## RECO

Official production datasets containing all reconstructed files.

### FD

Real detector-like conditions MC
- **prod_reco_FA14-10-28_fd_genie_fhc_fluxswap**
- **prod_reco_FA14-10-28_fd_genie_fhc_nonswap**
- **prod_reco_FA14-10-28_fd_genie_fhc_tau**

Ideal conditions 14db MC
- **prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_14db**
- **prod_reco_FA14-10-28_fd_genie_fhc_nonswap_14db**
- **prod_reco_FA14-10-28_fd_genie_fhc_tau_14db**

Nominal

ALL FD GENIE RECO: **prodreco_S14-03-25_FDGENIE** (2807 files)

FHC SWAP/NONSWAP
- **prodreco_S14-03-25_FDGENIE_fhc_swap** (703)
- **prodreco_S14-03-25_FDGENIE_fhc_nonswap** (703)

RHC SWAP/NONSWAP
- **prodreco_S14-03-25_FDGENIE_rhc_swap** (698)
- **prodreco_S14-03-25_FDGENIE_rhc_nonswap** (703)

### ND

ND (FHC) GENIE (rock+detector) OVERLAY RECO: **prodreco_S14-08-01_ndgenie_fix** (4295)

**prodreco_S14-07-03_ndgenie** (20057 and increasing)

### Special

Nominal geometry
- **prod_reco_S14-08-01_fd_genie_fhc_nonswap** (34)
- **prod_reco_S14-08-01_fd_genie_fhc_swap** (34)

Jittered geometry

Simulated with jitter and reco+ with standard geom
- **prod_reco_S14-08-01_fd_fhc_nonswap_geojittered_v5** (37)
- **prod_reco_S14-08-01_fd_fhc_swap_geojittered_v5** (39)

All tiers with geojittered geometry
- **prod_reco_S14-08-01_fd_fhc_nonswap_geojittered_v1** (34)
- **prod_reco_S14-08-01_fd_fhc_swap_geojittered_v1** (34)

CRY+GENIE
- **prod_reco_S14-08-01_fd_fhc_nonswap_crygenie** (510)
- **prod_reco_S14-08-01_fd_fhc_swap_crygenie** (511)

NO MASK

These files have bad channel masks turned off: **prod_reco_S14-10-15_genie_nonswap_nomask**

Note: these files carry the POT counting error that propagates from the generation stage, so the POT must be rescaled correctly, i.e. divide totalpot by 2.
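The POT correction described above amounts to a one-line rescaling. A minimal sketch, where `reported_pot` is a hypothetical variable holding the totalpot value read from the files and the number used is made up purely for illustration:

```python
# From the note above: totalpot in these files must be divided by 2.
POT_OVERCOUNT_FACTOR = 2.0

def corrected_pot(reported_pot):
    """Return the POT after removing the generation-stage counting error."""
    return reported_pot / POT_OVERCOUNT_FACTOR

reported_pot = 6.0e20  # illustrative value only, not taken from any dataset
print(corrected_pot(reported_pot))
```

The same rescaling applies to the other "NO MASK" samples on this page that carry the same note.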

## PID

### FD

Real detector-like conditions MC
- **prod_pid_FA14-10-28_fd_genie_fhc_fluxswap**
- **prod_pid_FA14-10-28_fd_genie_fhc_nonswap**
- **prod_pid_FA14-10-28_fd_genie_fhc_tau**

Ideal conditions 14db MC
- **prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_14db**
- **prod_pid_FA14-10-28_fd_genie_fhc_nonswap_14db**
- **prod_pid_FA14-10-28_fd_genie_fhc_tau_14db**

OLD: **prod_pid_S14-05-12_fd_genie_neutrino2014**

### ND

ND (FHC) GENIE (rock+detector) OVERLAY reco

**prodpid_S14-08-19_ndgenie_fix** (4072)

**prodpid_S14-08-01_ndgenie** (2715, will grow)

### Special

Nominal
- **prod_pid_S14-08-19_fd_genie_fhc_nonswap** (34)
- **prod_pid_S14-08-19_fd_genie_fhc_swap** (34)

Jittered geometry

Simulated with jitter and reco+ with standard geom
- **prod_pid_S14-08-19_fd_fhc_nonswap_geojittered_v5** (37)
- **prod_pid_S14-08-19_fd_fhc_swap_geojittered_v5** (39)

All tiers with geojittered geometry
- **prod_pid_S14-08-19_fd_fhc_nonswap_geojittered_v1** (34)
- **prod_pid_S14-08-19_fd_fhc_swap_geojittered_v1** (34)

CRY+GENIE
- **prod_pid_S14-08-19_fd_fhc_nonswap_crygenie** (486)
- **prod_pid_S14-08-19_fd_fhc_swap_crygenie** (492)

NO MASK

These files have bad channel masks turned off: **prod_pid_S14-10-28_nd_genie_nonswap_nomask**

Note: these files carry the POT counting error that propagates from the generation stage, so the POT must be rescaled correctly, i.e. divide totalpot by 2.

The module "nuecosrej" is empty in the PID files. It is, however, filled at the CAF stage for this set.

## CAF

### FD

Real detector-like conditions MC
- **prod_caf_FA14-10-28_fd_genie_fhc_fluxswap**
- **prod_caf_FA14-10-28_fd_genie_fhc_nonswap**
- **prod_caf_FA14-10-28_fd_genie_fhc_tau**

Bluearc: /nova/prod/mc/FA14-10-28/genie/fd/caf/000XXX/XXXYY/

where XXX are the first three digits of the run number and YY are the last two.

You'll need to use appropriate wildcard flags.
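The directory layout above can be sketched as a small helper. This assumes five-digit run numbers, so that the "first three digits" and the "last two" together make up the whole run number; the function name `caf_run_dir` and the example run number are hypothetical.

```python
MC_CAF_BASE = "/nova/prod/mc/FA14-10-28/genie/fd/caf"

def caf_run_dir(run, base=MC_CAF_BASE):
    """Return base/000XXX/XXXYY/ for a five-digit run number XXXYY."""
    run_str = str(run)
    if len(run_str) != 5 or not run_str.isdigit():
        raise ValueError(f"expected a five-digit run number, got {run!r}")
    first_three = run_str[:3]  # XXX, reused in both path components
    return f"{base}/000{first_three}/{run_str}/"

print(caf_run_dir(12345))  # placeholder run number
```

Appending a wildcard such as `*` to the returned directory gives a pattern suitable for shell globbing or Python's `glob` module.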

Ideal conditions 14db MC
- **prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_14db**
- **prod_caf_FA14-10-28_fd_genie_fhc_nonswap_14db**
- **prod_caf_FA14-10-28_fd_genie_fhc_tau_14db**

Bluearc: /nova/prod/mc/FA14-10-28/genie/fd/caf/010000/100000X

where X is an integer from 0-5.

OLD: **prod_caf_S14-07-11_fd_genie_neutrino2014**

### ND

SAM: **prodcaf_S14-08-19_ndgenie_fix** (4072)

Bluearc: **/nova/prod/mc/S14-08-19/genie/nd/caf** (4072)

First pass post-pid: **/nova/prod/mc/S14-08-15/genie/nd/caf/** (2441 files)

(find . -name "neardet*")

All failures were CAFMaker floating point exceptions.

Caveat: Two significant issues have been found in this sample in the last day or two. An old version of PhotonTransport was used in the overlay production, making the light levels low by a factor of two, and the rock overlay scheme re-used rock events at a much higher rate than intended. Both issues have been fixed, and replacements for these ND MC CAFs should tentatively be ready in about a week. We'll keep you posted.

The current CAFs should be useful for code development and certain physics studies in the meantime.

### Special

Nominal
- **prod_caf_S14-08-19_fd_genie_fhc_nonswap** (34)
- **prod_caf_S14-08-19_fd_genie_fhc_swap** (34)

Jittered geometry

Simulated with jitter and reco+ with standard geom
- **prod_caf_S14-08-19_fd_fhc_nonswap_geojittered_v5** (37)
- **prod_caf_S14-08-19_fd_fhc_swap_geojittered_v5** (39)

All tiers with geojittered geometry
- **prod_caf_S14-08-19_fd_fhc_nonswap_geojittered_v1** (32)
- **prod_caf_S14-08-19_fd_fhc_swap_geojittered_v1** (33)

CRY+GENIE
- **prod_caf_S14-08-19_fd_fhc_nonswap_crygenie** (484)
- **prod_caf_S14-08-19_fd_fhc_swap_crygenie** (490)

Bluearc: **/nova/prod/mc/S14-08-19/genie/fd/caf/**

(need to grab **fhc_nonswap** or **fhc_swap** plus either **crygenie** or **geojittered*v5**)

NO MASK

These files have bad channel masks turned off: **/nova/ana/nu_e_ana/S14-10-28/nd_nomask/mc**

Note: these files carry the POT counting error that propagates from the generation stage, so the POT must be rescaled correctly, i.e. divide totalpot by 2.

These CAFs have "nuecosrej" filled.

# CRY

## ARTDAQ

Real detector-like conditions MC: **prod_daq_FA14-10-03_fd_cry_all**

Nominal

ND CRY COSMICS: **prodartdaq_S14-05-12_ndcry**

FD CRY COSMICS: **prodartdaq_S14-02-05_fdcry**

## RECO

Real detector-like conditions MC: **prod_reco_FA14-10-28_fd_cry_all**

Nominal

FD CRY COSMICS: **prodreco_S14-03-25_FDCRY** (10069)

## CALIBRATION (pchitlists)

ND CRY COSMICS
- **prodcalib_S14-05-12_ndcry_pclist**
- **prodcalib_S14-05-12_ndcry_pcliststop**

FD CRY COSMICS
- **prodcalib_S14-03-05_FDCry_pclist**
- **prodcalib_S14-03-05_FDCry_pcliststop**

## PID

Real detector-like conditions MC: **prod_pid_FA14-10-28_fd_cry_all**

## CAF

Real detector-like conditions MC: **prod_caf_FA14-10-28_fd_cry_all**

Bluearc: /nova/prod/mc/FA14-10-28/cry/fd/caf/000XXX/XXXYY/

where XXX are the first three digits of the run number and YY are the last two.

You'll need to use appropriate wildcard flags.

# DATA

## First-Analysis -- First Pass FD Processing (FA14-10-28)

In advance of the final calibration intended for first-analysis, a first pass with old calibration constants will be produced. The inputs to these samples are the official first-analysis run lists generated by Ryan Patterson and stored in this directory:

/nova/app/users/rbpatter/runlists/

Those lists have been adapted into very specific datasets.

| Data Tier | Cosmic Trigger | NuMI Trigger |
|---|---|---|
| artdaq | `prod_artdaq_fd_cosmic_fa_goodruns` | `prod_artdaq_fd_numi_fa_goodruns` |
| reco | `prod_reco_FA14-10-28_fd_cosmic_fa_goodruns` | `prod_reco_FA14-10-28_fd_numi_fa_goodruns` |

## S14-09-29 Keep-up Reconstruction

Most of the caveats from the S14-09-09 sample still apply and are repeated below for completeness. The big change in S14-09-29 is the addition of new information to facilitate basic neutrino searches. There is a new sel.containment branch in the CAFs, and the numu CosRej has been included for the FD stream.

Keep-up reconstruction provides reconstructed art root files and CAFs. For quick testing, the last three files from each of the datasets below are stored on bluearc and updated nightly. They can be found here:

/nova/prod/data/keepup/samples/

### ND with MC Calibration

In lieu of official ND calibration, the old MC calibration can be applied. The output of any module which relies on calibration should be taken with a heavy grain of salt, but these files will contain non-zero values for calibrated energies rather than zeroes.

The database has now been populated with bad channel masks, which have been applied in this round of processing.

The following SAM datasets are available.

Reco art rootfiles: *prod_reco_S14-09-29_neardet_numi_keepup_mccalib*

CAF: *prod_caf_S14-09-29_neardet_numi_keepup_mccalib*

The CAFs can also be found on bluearc:

/nova/prod/data/keepup/S14-09-29/numi/nd/000XXX/XXXYY/

where XXX are the first three digits of the run number and YY are the last two. Note, the MC calibrated files have "mccalib" in their name. The standard ones do not.
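Because the MC-calibrated and standard CAFs share this directory tree, a filename check is one way to separate them. A minimal sketch, assuming only that the substring "mccalib" appears in the MC-calibrated file names as noted above; the file names in the example are invented placeholders, not real files.

```python
def split_by_calibration(filenames):
    """Split file names into (mccalib, standard) lists by substring match."""
    mccalib = [f for f in filenames if "mccalib" in f]
    standard = [f for f in filenames if "mccalib" not in f]
    return mccalib, standard

# Hypothetical names for illustration only.
names = [
    "example_r00010377_keepup_mccalib.caf.root",
    "example_r00010377_keepup.caf.root",
]
mc, std = split_by_calibration(names)
print(len(mc), len(std))
```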

As of October 16, there is an FTS issue keeping these files from being transferred to bluearc. The CD experts have been notified and are working on a solution.

### ND (no calibration)

Near detector reconstruction is currently lacking calibration, which will affect the output of any module that depends on it. The most notable example is FuzzyKVertex, but the MichelE filters could also be affected. Slicer, CosmicTrack and KalmanTrack produce output independent of calibration.

Near detector reconstruction is also currently lacking channel masks and data quality information.

The following SAM datasets are available.

Reco art rootfiles: *prod_reco_S14-09-29_neardet_numi_keepup*

CAF: *prod_caf_S14-09-29_neardet_numi_keepup*

The CAFs can also be found on bluearc:

/nova/prod/data/keepup/S14-09-29/numi/nd/000XXX/XXXYY/

where XXX are the first three digits of the run number and YY are the last two. Note, the MC calibrated files mentioned above have "mccalib" in their name. The standard ones do not.

### FD

The initial target for this processing is all of the data before the shutdown and after the end of the neutrino hunt. Unless problems are found, back-processing will extend the sample to before the neutrino hunt.

FD calibration constants are currently only available through diblock 7, but averaged constants are used beyond that point.

**NB: The datasets listed below are very large.** Using SAM projects which include the entire dataset is discouraged. For assistance in breaking up the sample, reference the Sam Web Cookbook or email nova_sam@fnal.gov.
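One way to avoid a project over the full dataset is to restrict the definition to a limited run range at a time. The sketch below only builds SAM dimension strings and does not contact SAM. It assumes the `defname:` keyword and numeric comparisons on the `run_number` dimension; the helper names and the run numbers are placeholders. The Sam Web Cookbook remains the reference for the recommended approach.

```python
def run_ranges(first_run, last_run, step):
    """Yield inclusive (lo, hi) run ranges covering first_run..last_run."""
    lo = first_run
    while lo <= last_run:
        hi = min(lo + step - 1, last_run)
        yield lo, hi
        lo = hi + 1

def dims_for_run_range(dataset, lo, hi):
    """Return a dimension string restricting a dataset to runs lo..hi."""
    if lo > hi:
        raise ValueError("lo must not exceed hi")
    # Assumed syntax: 'defname:' plus comparisons on 'run_number'.
    return f"defname: {dataset} and run_number >= {lo} and run_number <= {hi}"

dataset = "prod_caf_S14-09-29_fardet_numi_keepup"
for lo, hi in run_ranges(17000, 17009, 4):  # placeholder run numbers
    print(dims_for_run_range(dataset, lo, hi))
```

Each resulting string could then be passed to a `samweb` query or used to create a smaller definition.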

The following SAM datasets are available:

Reco art rootfiles: *prod_reco_S14-09-29_fardet_numi_keepup*

CAF, blinded information removed: *prod_caf_S14-09-29_fardet_numi_keepup*

The CAFs can be found here:

/nova/prod/data/keepup/S14-09-29/numi/fd/000XXX/XXXYY/

where XXX are the first three digits of the run number and YY are the last two.

## S14-09-09 Keep-up Reconstruction

Keep-up reconstruction provides reconstructed art root files and CAFs. For quick testing, the last three files from each of the datasets below are stored on bluearc and updated nightly. They can be found here:

/nova/prod/data/keepup/samples/

### ND

Near detector reconstruction is currently lacking calibration, which will affect the output of any module that depends on it. The most notable example is FuzzyKVertex, but the MichelE filters could also be affected. Slicer, CosmicTrack and KalmanTrack produce output independent of calibration.

Near detector reconstruction is also currently lacking channel masks and data quality information.

The following SAM datasets are available.

Reco art rootfiles: *prod_reco_keepup_S14-09-09_neardet_numi_preshutdown_fullgain_cold*

CAF: *prod_caf_keepup_S14-09-09_neardet_numi_preshutdown_fullgain_cold*

The CAFs can also be found on bluearc:

/nova/prod/data/keepup/S14-09-09/numi/nd/000XXX/YYYYY/

where XXX are the first three digits of the run number and YYYYY is the complete run number.

### FD

Far detector keep-up processing is still underway. When this effort is complete, the goal is to have processed all of the data taken after the end of the neutrino hunt but before the shutdown. Since the processing is still underway, these datasets will grow.

FD calibration constants are currently only available through diblock 7, but averaged constants are used beyond that point.

The following SAM datasets are available:

Reco art rootfiles: *prod_reco_keepup_S14-09-09_fardet_numi_preshutdown_posthunt*

CAF, blinded information removed: *prod_caf_keepup_S14-09-09_fardet_numi_preshutdown_posthunt*

The CAFs can be found here:

/nova/prod/data/keepup/S14-09-09/numi/fd/000XXX/YYYYY/

where XXX are the first three digits of the run number and YYYYY is the complete run number.

## Special

### No Mask ND

ND datasets with bad channel masks turned off. Runs are included from the full-gain, pre-shutdown era (runs 10377 to 10407).

RECO: **prod_reco_S14-10-15_neardet_numi_fullgain_cold_preshutdown_mccalib-nomask**

PID: **prod_pid_S14-10-28_neardet_numi_fullgain_cold_preshutdown_mccalib-nomask**

CAF: /nova/ana/nu_e_ana/S14-10-28/nd_nomask/data (Bluearc location only)

Only the CAF files contain filled "nuecosrej" variables.

## FD Cosmic Trigger, Summer 2014, Post-Neutrino

### Datasets

The datasets for these files can be found in SAM. For PID, there is also a small sample living on Bluearc. For CAFs, the entire sample lives on Bluearc. All of the official datasets specify the run range for the 4db and 7db configurations based on the Stable Partition Milestones wiki page. The datasets also require subruns (files) to pass the data quality cuts.

The SAM datasets and Bluearc locations are as follows:

PID - 4db:

prod_pid_S14-05-12_fardet_cosmics_p1_4db_good

/nova/prod/data/S14-05-12/fd/pid/ (Run numbers less than 14702)

PID - 7db:

prod_pid_S14-05-12_fardet_cosmics_p1_7db_good

/nova/prod/data/S14-05-12/fd/pid/ (Run numbers greater than or equal to 14702)

CAF - 4db:

prod_caf_S14-07-11_fardet_cosmics_p1_4db_good

/nova/prod/data/S14-07-11/cosmic/fd/caf/ (Run numbers less than 14702, excluding runs 14563-14594 which were 9 diblock stress tests.)

CAF - 7db:

prod_caf_S14-07-11_fardet_cosmics_p1_7db_good

/nova/prod/data/S14-07-11/cosmic/fd/caf/ (Run numbers greater than or equal to 14702)
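The run-number boundaries quoted above for the CAF directory can be expressed as a simple classifier. This is a sketch only: the function name is hypothetical, it applies the stress-test exclusion stated for the 4db CAF sample, and it does not enforce the lower edge of the run range or the data quality cuts, which are defined elsewhere.

```python
FIRST_7DB_RUN = 14702
# Runs 14563-14594 inclusive were 9 diblock stress tests (excluded from the 4db CAFs).
STRESS_TEST_RUNS = range(14563, 14595)

def cosmic_caf_config(run):
    """Return '4db' or '7db' for a FD cosmic run, or None if it is excluded."""
    if run in STRESS_TEST_RUNS:
        return None
    return "7db" if run >= FIRST_7DB_RUN else "4db"

for run in (14562, 14570, 14701, 14702):
    print(run, cosmic_caf_config(run))
```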

The underlying reco and pidpart (sans LEM) datasets exist only in SAM:

pidpart - 4db:

prod_pidpart_S14-05-12_fardet_cosmics_p1_4db_good

pidpart - 7db:

prod_pidpart_S14-05-08_fardet_cosmics_p1_7db_good

reco - 4db:

prod_reco_S14-05-08_fardet_cosmics_p1_4db_good

reco - 7db:

prod_reco_S14-05-08_fardet_cosmics_p1_7db_good

### Comments on Livetime

In terms of livetime, the 4db sample is much larger than the 7db sample. The 4db PID sample is roughly ten times larger than that used for Neutrino 2014. The 4db CAF sample is currently twice the size of the Neutrino 2014 sample, although it is growing. The CAFMaking process has been tedious due to CAFMaker crashes, and as a result the CAF sample will never be as large as the PID sample. In terms of livetime, the 7db sample is roughly the same size as the Neutrino 2014 sample; however, that detector configuration has 75 percent more detector mass. The same caveat applies to the size of the CAF sample.

#### Processing history

The processing history for these files is as follows:

Reco: S14-05-08 - Notably, this tag resolves the KalmanTrack association bug

PID: S14-05-12 - Loose nue preselection

CAF: S14-07-11 - Uses LiveGeometry, but gets the CAF metadata right.