Matthew Tamsett, 11/24/2014 09:19 AM
h1. Official Datasets
This page documents the most up-to-date datasets produced by the production group. Information on older legacy files is kept on the [[NOvA-SAM:LegacyDatasets]] page.
All modern NOvA data and MC reside within a data handling system called Sequential Access via Metadata (SAM). The files are stored on a tape-backed disk array called dCache. File locations are tracked in a database with a web interface, accessible through the @samweb@ command line client utility. The [[NOvA-SAM:SAM_web_cookbook]] provides an introduction to using the system. All of the datasets listed below are SAM "definitions". A definition is a collection of logical queries on file metadata which together define a set of files within the SAM system.
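As a minimal sketch of how a definition from this page might be handed to @samweb@ from a script, the snippet below only builds the command line and does not run it. The @count-definition-files@ and @list-definition-files@ subcommands belong to the @samweb@ client; the experiment name @nova@ passed to @-e@ and the helper names are assumptions made for this illustration.

```python
import shlex

# Assumption: the experiment name given to samweb's -e option is "nova".
EXPERIMENT = "nova"


def samweb_argv(subcommand, definition, experiment=EXPERIMENT):
    """Build the argument list for one samweb call on one definition."""
    return ["samweb", "-e", experiment, subcommand, definition]


def samweb_command_line(subcommand, definition, experiment=EXPERIMENT):
    """Return the same call as a shell-quoted string, e.g. for logging."""
    argv = samweb_argv(subcommand, definition, experiment)
    return " ".join(shlex.quote(arg) for arg in argv)


# Count the files in a definition before deciding how to process it.
count_cmd = samweb_argv(
    "count-definition-files", "prod_reco_FA14-10-28_fd_numi_fa_goodruns"
)
# List the file names that the definition currently resolves to.
list_cmd = samweb_argv(
    "list-definition-files", "prod_reco_FA14-10-28_fd_numi_fa_goodruns"
)
print(samweb_command_line(
    "count-definition-files", "prod_reco_FA14-10-28_fd_numi_fa_goodruns"
))
```

On a machine where @samweb@ is set up, either list could be passed to @subprocess.run(..., capture_output=True, text=True)@ to execute the query.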
As a throwback to the old days, some of the data is also stored on bluearc. CAFs are a notable case: the most recent ones are almost always stored on bluearc. There are also small, selected samples of the most recent keep-up (see below) files available on bluearc in:
Data and MC files are currently provided in two forms: datasets for the *first analysis* (*FA*) and *keep-up* datasets. First analysis datasets are those designed to be used in the first analysis and as such are processed in stable, tested versions of the software, including the latest calibration and state-of-the-art reconstruction and PID algorithms. Keep-up datasets, on the other hand, contain data processed as it comes off the detectors, in as close to real time as possible. These datasets are designed to offer users an early look at data and, as such, may not always include the most up-to-date calibration or reconstruction. All keep-up datasets include "keepup" in their dataset names.
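The naming rule above lends itself to a quick programmatic check. The sketch below flags keep-up datasets by the "keepup" substring and also splits a name into a data tier and release tag. The @prod_<tier>_<release>_<rest>@ pattern is an assumption inferred from the names listed on this page, not an official specification, and @describe@ is a hypothetical helper.

```python
import re

# Assumed pattern, inferred from the names on this page:
#   prod_<tier>_<release>_<rest>
# Some artdaq definitions carry no release tag, so the release is optional.
_NAME = re.compile(
    r"^prod_(?P<tier>[a-z0-9]+)_"
    r"(?:(?P<release>(?:FA|S)\d{2}-\d{2}-\d{2})_)?"
    r"(?P<rest>.+)$"
)


def describe(definition):
    """Split a dataset name into tier, release tag, remainder and keep-up flag."""
    match = _NAME.match(definition)
    if match is None:
        raise ValueError(f"unrecognised dataset name: {definition!r}")
    return {
        "tier": match.group("tier"),
        "release": match.group("release"),  # None when no tag is present
        "rest": match.group("rest"),
        # Documented rule: keep-up datasets include "keepup" in their names.
        "keepup": "keepup" in definition,
    }


info = describe("prod_reco_S14-09-29_fardet_numi_keepup")
print(info["tier"], info["release"], info["keepup"])
```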
First analysis production proceeds in a stepwise manner, starting with simulation and progressing through reconstruction to PID. At each stage the files from the previous stage are processed to produce a final dataset of simulation, reconstruction or PID files. At the same time a set of files for the next stage is also produced for validation use. At the current time production has *mostly completed the simulation and reconstruction stages* and is producing *PID validation files*.
First pass data processing for the first analysis is currently complete through the reconstruction stage, with PID validation files being produced now. The inputs to these samples are the official first-analysis run lists generated by Ryan Patterson and stored in this directory:
| *Data tier* | *Cosmic trigger* | *NuMI trigger* |
| *artdaq* | prod_artdaq_fd_cosmic_fa_goodruns | prod_artdaq_fd_numi_fa_goodruns |
| *pclist* | prod_pclist_S14-09-29_fd_cosmic_keepup | n/a |
| *pcliststop* | prod_pcliststop_S14-09-29_fd_cosmic_keepup | n/a |
| *timecal* | prod_timecal_S14-09-29_fd_cosmic_keepup | n/a |
| *reco - for analysis* | prod_reco_FA14-10-28_fd_cosmic_fa_goodruns | prod_reco_FA14-10-28_fd_numi_fa_goodruns |
| *reco - keep up* | | prod_reco_S14-09-29_fardet_numi_keepup |
| *CAF - keep up* | | prod_caf_S14-09-29_fardet_numi_keepup |
| *Data tier* | *Cosmic trigger* | *DD activity trigger* | *DD tricell trigger* | *NuMI trigger* |
| *artdaq* | | | | prod_artdaq_FA14-10-03_nd_numi_fullgain_preshutdown_goodruns |
| *pclist* | prod_pclist_S14-09-29_nd_cosmic_keepup | prod_pclist_S14-09-29_nd_DDActivity1_keepup | prod_pclist_S14-09-29_nd_DDCalMu_keepup | n/a |
| *pcliststop* | prod_pcliststop_S14-09-29_nd_cosmic_keepup | prod_pcliststop_S14-09-29_nd_DDActivity1_keepup | prod_pcliststop_S14-09-29_nd_DDCalMu_keepup | n/a |
| *timecal* | prod_timecal_S14-09-29_nd_cosmic_keepup | prod_timecal_S14-09-29_nd_DDActivity1_keepup | prod_timecal_S14-09-29_nd_DDCalMu_keepup | n/a |
| *reco - for analysis* | | | | prod_reco_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *reco - keep up* | | | | prod_reco_S14-09-29_neardet_numi_keepup |
| *pid - validation* | | | | prod_pid_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *caf - validation* | | | | prod_caf_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *caf - keep up* | | | | prod_caf_S14-09-29_neardet_numi_keepup |
h1. Monte Carlo
As with data, first pass MC processing for the first analysis is currently complete through the reconstruction stage, with PID validation files produced. The only exception is the RHC files, which are still being simulated.
To facilitate both first analysis studies and future sensitivity studies, two types of MC have been produced. The first type has real-detector-like configurations: these files are simulated with the run numbers of the matching data runs (and hence their di-block and active channel configurations) and with a PoT weighting which replicates that of the first analysis data datasets. The second type has "ideal" 14-DB configurations, to be used in future sensitivity studies.
h2. FD & ND MC - Real detector-like conditions
| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap | prod_daq_FA14-10-03_fd_genie_fhc_nonswap | prod_daq_FA14-10-03_fd_genie_fhc_tau | prod_daq_FA14-10-03_fd_cry_all |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap | prod_reco_FA14-10-28_fd_genie_fhc_nonswap | prod_reco_FA14-10-28_fd_genie_fhc_tau | prod_reco_FA14-10-28_fd_cry_all |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap | prod_pid_FA14-10-28_fd_genie_fhc_nonswap | prod_pid_FA14-10-28_fd_genie_fhc_tau | prod_pid_FA14-10-28_fd_cry_all |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap | prod_caf_FA14-10-28_fd_genie_fhc_nonswap | prod_caf_FA14-10-28_fd_genie_fhc_tau | prod_caf_FA14-10-28_fd_cry_all |
| *NearDet* | *FHC nonswap* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_nd_genie_fhc_nonswap | prod_daq_FA14-10-03_nd_cry_all |
| *pclist* | n/a | prod_pclist_S14-09-29_nd_cry |
| *pcliststop* | n/a | prod_pcliststop_S14-09-29_nd_cry |
| *timecal* | n/a | prod_timecal_S14-09-29_nd_cry |
| *reco - validation* | prod_reco_FA14-11-11_nd_genie_nonswap_smallsample | |
| *pid - validation* | prod_pid_FA14-11-11_nd_genie_nonswap_smallsample | |
| *caf - validation* | prod_caf_FA14-11-11_nd_genie_nonswap_smallsample | |
h2. FD MC - Ideal conditions (14db)
| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* | *RHC swap* | *RHC nonswap* | *RHC tau* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_14db | prod_daq_FA14-10-03_fd_genie_fhc_nonswap_14db | prod_daq_FA14-10-03_fd_genie_fhc_tau_14db | prod_daq_FA14-10-03_fd_genie_rhc_fluxswap_14db | prod_daq_FA14-10-03_fd_genie_rhc_nonswap_14db | prod_daq_FA14-10-03_fd_genie_rhc_tau_14db | |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_reco_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_reco_FA14-10-28_fd_genie_fhc_tau_14db | | | | |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_pid_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_pid_FA14-10-28_fd_genie_fhc_tau_14db | | | | |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_caf_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_caf_FA14-10-28_fd_genie_fhc_tau_14db | | | | |
h1. Supporting sample MC
In addition to the core samples discussed above, some dedicated samples have been produced to study particular effects.
h2. FD MC - Real detector-like conditions, geojittered.
| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_geojittered | prod_daq_FA14-10-03_fd_genie_fhc_nonswap_geojittered | prod_daq_FA14-10-03_fd_genie_fhc_tau_geojittered |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_reco_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_reco_FA14-10-28_fd_genie_fhc_tau_geojittered |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_pid_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_pid_FA14-10-28_fd_genie_fhc_tau_geojittered |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_caf_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_caf_FA14-10-28_fd_genie_fhc_tau_geojittered |
h1. Processing notes
This section attempts to briefly summarise issues that users should be aware of when using the above datasets. Full details on the releases used can be found on the [[NOvA-ART:History_of_Tagged_Releases]] page.
h2. FA14-10-03 Data raw2root
This version of the software includes the ND geometry version used in the FA MC simulation.
*No known issues*.
h2. FA14-10-03 MC Simulation
This is the official simulation for the first analysis datasets.
*No known issues*.
h2. FA14-10-28 FD Reconstruction
This is the official reconstruction for the first analysis datasets. It was processed with v04 FD calibrations.
*No known issues*.
h2. FA14-10-28 FD PID and CAF
These are the PID and CAF validation samples. They are designed so that the physics groups can tune PIDs before the official first analysis production.
*No known issues*.
h2. FA14-11-11 ND reconstruction, PID and CAF
This release includes the most modern (v05) ND calibrations and represents the official ND reconstruction for the first analysis. The PID and CAF samples are validation samples designed so that the physics groups can tune PIDs before the official first analysis production.
* There is a bug in the calibrated energy for cells in the muon catcher with unphysical W-values whereby these cells receive infinite energies.
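Until the bug is fixed, analysis code may want to guard against the affected cells. The following is only a sketch of such a guard: the list of @(cell_id, energy)@ pairs is a hypothetical stand-in for however cell energies are actually accessed, and @finite_energy_cells@ is not part of any NOvA package.

```python
import math


def finite_energy_cells(cells):
    """Keep only (cell_id, energy) pairs whose energy is a finite number.

    `cells` is a hypothetical list of (cell_id, energy) tuples; infinite
    and NaN energies are dropped.
    """
    return [
        (cell_id, energy)
        for cell_id, energy in cells
        if math.isfinite(energy)
    ]


# Illustrative values only, not real detector data.
example = [(1, 0.12), (2, float("inf")), (3, float("nan")), (4, 0.5)]
kept = finite_energy_cells(example)
print(kept)
```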
h2. S14-09-29 Data keep-up & MC calibration
FD & ND calibration files are currently being produced for the FD cosmic stream as well as the ND DD activity, DD cal mu (tri-cell) and cosmic streams. These samples are constantly topped up using cron jobs. These files were used to produce the v05 ND calibration used in the latest ND reconstruction files.
* The ND data files have been reconstructed with an old ND geometry.
h2. S14-09-29 Keep-up reconstruction
Most of the same caveats apply as for the S14-09-09 reco sample detailed on the [[NOvA-SAM:LegacyDatasets]] page, but they are repeated below for completeness. The big change in S14-09-29 is the addition of new information to facilitate basic neutrino searches. There is a new sel.containment branch in the CAFs and the numu CosRej has been included for the FD stream.
Some additional notes:
* The initial target for this processing is all of the data before the shutdown and after the end of the neutrino hunt. Unless problems are found, back-processing will extend the sample to before the neutrino hunt.
* Reco keep-up will soon move to a modern release and will include post-shutdown (October 2014) data.
* The FD reco keep-up is *blinded*.
* FD calibration constants are currently only available through diblock 7, but averaged constants are used beyond that point.
* These datasets are very large. Using SAM projects which include the entire dataset is discouraged. For assistance in breaking up the sample, reference the [[Sam Web Cookbook]] or email email@example.com.
* Near detector reconstruction is currently lacking calibration, which will affect the output of any module that depends on it. The most notable example is FuzzyKVertex, but the MichelE filters could also be affected. Slicer, CosmicTrack and KalmanTrack produce output independent of calibration.
* Near detector reconstruction is also currently lacking channel masks and data quality information.
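One way to act on the note about dataset size is to split a definition into blocks of runs and process each block separately. The sketch below only builds the query strings. The @defname:@ and @run_number@ terms follow SAM dimension syntax as understood here, but the exact form of the range comparison is an assumption that should be checked against the cookbook, and the run numbers are made up for illustration.

```python
def run_range_queries(definition, first_run, last_run, runs_per_chunk):
    """Return one SAM dimension string per block of runs in [first_run, last_run]."""
    if runs_per_chunk < 1:
        raise ValueError("runs_per_chunk must be at least 1")
    queries = []
    for start in range(first_run, last_run + 1, runs_per_chunk):
        # The final block is clipped so that it never runs past last_run.
        stop = min(start + runs_per_chunk - 1, last_run)
        queries.append(
            f"defname: {definition} "
            f"and run_number >= {start} and run_number <= {stop}"
        )
    return queries


# Hypothetical run range, chosen only to show the chunking.
queries = run_range_queries(
    "prod_reco_S14-09-29_fardet_numi_keepup", 17000, 17249, 100
)
for query in queries:
    print(query)
```

Each string could then be used to list files or to create a smaller definition for a single SAM project.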
The CAFs can be found here:
where XXX are the first three digits of the run number and YY are the last two.
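The XXX and YY fields can be derived from a run number as sketched below. This assumes five-digit run numbers, with shorter values zero-padded on the left; the padding rule and the helper name @caf_run_fields@ are assumptions for this illustration, and the directory path itself is left out.

```python
def caf_run_fields(run_number):
    """Split a run number into (XXX, YY): first three digits and last two.

    Assumes five-digit run numbers; shorter values are zero-padded on the
    left, which is an assumption rather than a documented rule.
    """
    if run_number < 0:
        raise ValueError("run number must be non-negative")
    digits = str(run_number).zfill(5)
    if len(digits) != 5:
        raise ValueError(f"expected at most five digits, got {run_number}")
    return digits[:3], digits[3:]


# Hypothetical run number, for illustration only.
xxx, yy = caf_run_fields(17946)
print(xxx, yy)
```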