Official Datasets » History » Version 60

Matthew Tamsett, 11/24/2014 09:19 AM

h1. Official Datasets

This page documents the most up-to-date datasets produced by the production group. Information on older legacy files is kept on the [[NOvA-SAM:LegacyDatasets]] page.

h2. Introduction

All modern NOvA data and MC reside within a data handling system called Sequential Access via Metadata (SAM). The files are stored on a tape-backed disk array called dCache. File locations are tracked in a database with a web interface, which is accessible through the @samweb@ command line client. The [[NOvA-SAM:SAM_web_cookbook]] provides an introduction to using the system. All of the datasets listed below are SAM "definitions": logical queries on file metadata that together define a set of files within the SAM system.
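
As a minimal sketch of how a definition name from the tables on this page might be used, the following Python helper assembles @samweb@ command lines for counting or listing the files in a definition. The helper name is hypothetical, and the subcommands (@count-definition-files@, @list-definition-files@) are assumed to be available in the installed client; the SAM web cookbook remains the authoritative reference.

```python
import shlex


def samweb_command(action, definition):
    """Return the argv list for a samweb query on a dataset definition.

    `action` is "count" or "list". The subcommand names are assumptions
    that should be checked against the installed samweb client.
    """
    subcommands = {
        "count": "count-definition-files",
        "list": "list-definition-files",
    }
    if action not in subcommands:
        raise ValueError(f"unknown action: {action!r}")
    return ["samweb", subcommands[action], definition]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch works
    # on machines where samweb is not set up.
    argv = samweb_command("count", "prod_reco_FA14-10-28_fd_numi_fa_goodruns")
    print(shlex.join(argv))
```

On a machine where @samweb@ is set up, the returned list could be passed directly to @subprocess.run@.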

For historical reasons, some of the data are also stored on BlueArc. CAFs are a notable example: the most recent are almost always stored on BlueArc. Small, selected samples of the most recent keep-up files (see below) are also available on BlueArc in:

<pre>
/nova/prod/data/keepup/samples/
</pre>

Data and MC files are currently provided in two forms: datasets for the *first analysis* (*FA*) and *keep-up* datasets. First analysis datasets are designed for use in the first analysis and are therefore processed with stable, tested versions of the software, including the latest calibration and state-of-the-art reconstruction and PID algorithms. Keep-up datasets, on the other hand, contain data processed as they come off the detectors, in as close to real time as possible. They are designed to offer users an early look at the data and, as such, may not always include the most up-to-date calibration or reconstruction. All keep-up datasets include "keepup" in their dataset names.
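
The naming convention can be checked programmatically. The sketch below is only an illustration: the pattern @prod_<tier>_[<release>_]<remainder>@ is inferred from the dataset names listed on this page rather than taken from an official specification, and the function name is hypothetical.

```python
import re

# Assumed pattern, inferred from the names on this page:
#   prod_<tier>_[<release>_]<remainder>
# where the optional release tag looks like FA14-10-28 or S14-09-29.
_NAME = re.compile(
    r"^prod_(?P<tier>[A-Za-z0-9]+)_"
    r"(?:(?P<release>(?:FA|S)\d{2}-\d{2}-\d{2})_)?"
    r"(?P<remainder>.+)$"
)


def describe_dataset(name):
    """Split a dataset name into tier, release and remainder.

    Also reports whether the name contains "keepup", which is the only
    part of the convention stated explicitly in the text above.
    """
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised dataset name: {name!r}")
    info = match.groupdict()
    info["keepup"] = "keepup" in name
    return info


if __name__ == "__main__":
    print(describe_dataset("prod_reco_S14-09-29_fardet_numi_keepup"))
    print(describe_dataset("prod_artdaq_fd_cosmic_fa_goodruns"))
```

Names without a release tag, such as the FD @artdaq@ definitions, yield @None@ for the release.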

First analysis production proceeds in a stepwise manner, starting with simulation and progressing through reconstruction to PID. At each stage, the files from the previous stage are processed to produce a final dataset of simulation, reconstruction or PID files. At the same time, a set of files for the next stage is produced for validation. Production has currently *mostly completed the simulation and reconstruction stages* and is producing *PID validation files*.

h1. Contents

{{toc}}

h1. Data

First-pass data processing for the first analysis is currently complete through the reconstruction stage, with PID validation files being produced now. The input to these samples is the set of official first-analysis run lists generated by Ryan Patterson and stored in this directory:

<pre>
  /nova/app/users/rbpatter/runlists/
</pre>

h2. FD

| *Data tier* | *Cosmic trigger* | *NuMI trigger* |
| *artdaq* | prod_artdaq_fd_cosmic_fa_goodruns | prod_artdaq_fd_numi_fa_goodruns |
| *pclist* | prod_pclist_S14-09-29_fd_cosmic_keepup | n/a |
| *pcliststop* | prod_pcliststop_S14-09-29_fd_cosmic_keepup | n/a |
| *timecal* | prod_timecal_S14-09-29_fd_cosmic_keepup | n/a |
| *reco - for analysis* | prod_reco_FA14-10-28_fd_cosmic_fa_goodruns | prod_reco_FA14-10-28_fd_numi_fa_goodruns |
| *reco - keep up* | | prod_reco_S14-09-29_fardet_numi_keepup |
| *caf - keep up* | | prod_caf_S14-09-29_fardet_numi_keepup |

h2. ND

| *Data tier* | *Cosmic trigger* | *DD activity trigger* | *DD tricell trigger* | *NuMI trigger* |
| *artdaq* | | | | prod_artdaq_FA14-10-03_nd_numi_fullgain_preshutdown_goodruns |
| *pclist* | prod_pclist_S14-09-29_nd_cosmic_keepup | prod_pclist_S14-09-29_nd_DDActivity1_keepup | prod_pclist_S14-09-29_nd_DDCalMu_keepup | n/a |
| *pcliststop* | prod_pcliststop_S14-09-29_nd_cosmic_keepup | prod_pcliststop_S14-09-29_nd_DDActivity1_keepup | prod_pcliststop_S14-09-29_nd_DDCalMu_keepup | n/a |
| *timecal* | prod_timecal_S14-09-29_nd_cosmic_keepup | prod_timecal_S14-09-29_nd_DDActivity1_keepup | prod_timecal_S14-09-29_nd_DDCalMu_keepup | n/a |
| *reco - for analysis* | | | | prod_reco_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *reco - keep up* | | | | prod_reco_S14-09-29_neardet_numi_keepup |
| *pid - validation* | | | | prod_pid_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *caf - validation* | | | | prod_caf_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *caf - keep up* | | | | prod_caf_S14-09-29_neardet_numi_keepup |

h1. Monte Carlo

As with data, first-pass MC processing for the first analysis is currently complete through the reconstruction stage, with PID validation files produced. The only exception is the RHC files, which are still being simulated.

To facilitate both first analysis studies and future sensitivity studies, two types of MC have been produced. The first has real-detector-like configurations: these files are simulated with the run numbers of the matching data runs (and hence their di-block and active-channel configurations) and with a PoT weighting that replicates that of the first analysis data datasets. The second has "ideal" 14-DB configurations, for use in future sensitivity studies.

h2. FD & ND MC - Real detector-like conditions

| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap | prod_daq_FA14-10-03_fd_genie_fhc_nonswap | prod_daq_FA14-10-03_fd_genie_fhc_tau | prod_daq_FA14-10-03_fd_cry_all |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap | prod_reco_FA14-10-28_fd_genie_fhc_nonswap | prod_reco_FA14-10-28_fd_genie_fhc_tau | prod_reco_FA14-10-28_fd_cry_all |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap | prod_pid_FA14-10-28_fd_genie_fhc_nonswap | prod_pid_FA14-10-28_fd_genie_fhc_tau | prod_pid_FA14-10-28_fd_cry_all |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap | prod_caf_FA14-10-28_fd_genie_fhc_nonswap | prod_caf_FA14-10-28_fd_genie_fhc_tau | prod_caf_FA14-10-28_fd_cry_all |

| *NearDet* | *FHC nonswap* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_nd_genie_fhc_nonswap | prod_daq_FA14-10-03_nd_cry_all |
| *pclist* | n/a | prod_pclist_S14-09-29_nd_cry |
| *pcliststop* | n/a | prod_pcliststop_S14-09-29_nd_cry |
| *timecal* | n/a | prod_timecal_S14-09-29_nd_cry |
| *reco - validation* | prod_reco_FA14-11-11_nd_genie_nonswap_smallsample | |
| *pid - validation* | prod_pid_FA14-11-11_nd_genie_nonswap_smallsample | |
| *caf - validation* | prod_caf_FA14-11-11_nd_genie_nonswap_smallsample | |

h2. FD MC - Ideal conditions (14db)

| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* | *RHC swap* | *RHC nonswap* | *RHC tau* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_14db | prod_daq_FA14-10-03_fd_genie_fhc_nonswap_14db | prod_daq_FA14-10-03_fd_genie_fhc_tau_14db | prod_daq_FA14-10-03_fd_genie_rhc_fluxswap_14db | prod_daq_FA14-10-03_fd_genie_rhc_nonswap_14db | prod_daq_FA14-10-03_fd_genie_rhc_tau_14db | |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_reco_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_reco_FA14-10-28_fd_genie_fhc_tau_14db | | | | |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_pid_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_pid_FA14-10-28_fd_genie_fhc_tau_14db | | | | |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_caf_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_caf_FA14-10-28_fd_genie_fhc_tau_14db | | | | |

h1. Supporting sample MC

In addition to the core samples discussed above, some dedicated samples have been produced to study particular effects.

h2. FD MC - Real detector-like conditions, geojittered

| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_geojittered | prod_daq_FA14-10-03_fd_genie_fhc_nonswap_geojittered | prod_daq_FA14-10-03_fd_genie_fhc_tau_geojittered |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_reco_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_reco_FA14-10-28_fd_genie_fhc_tau_geojittered |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_pid_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_pid_FA14-10-28_fd_genie_fhc_tau_geojittered |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_caf_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_caf_FA14-10-28_fd_genie_fhc_tau_geojittered |

h1. Processing notes

This section briefly summarises issues that users should be aware of when using the above datasets. Full details of the releases used can be found on the [[NOvA-ART:History_of_Tagged_Releases]] page.

h2. FA14-10-03 Data raw2root

This version of the software includes the ND geometry version used in the FA MC simulation.

*No known issues*.

h2. FA14-10-03 MC Simulation

This is the official simulation for the first analysis datasets.

*No known issues*.

h2. FA14-10-28 FD Reconstruction

This is the official reconstruction for the first analysis datasets. It was processed with v04 FD calibrations.

*No known issues*.

h2. FA14-10-28 FD PID and CAF

These are the PID and CAF validation samples. They are designed so that the physics groups can tune PIDs before the official first analysis production.

*No known issues*.

h2. FA14-11-11 ND reconstruction, PID and CAF

This release includes the most recent (v05) ND calibrations and represents the official ND reconstruction for the first analysis. The PID and CAF samples are validation samples designed so that the physics groups can tune PIDs before the official first analysis production.

*Known issues*:

 * There is a bug in the calibrated energy: cells in the muon catcher with unphysical W-values receive infinite energies.
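
Until this is fixed, downstream code may wish to guard against non-finite energies. The following is a minimal sketch in Python, in which the function name and the plain list of cell energies are hypothetical stand-ins for whatever the analysis actually reads from the files:

```python
import math


def finite_energies(energies):
    """Return only the finite values, dropping inf and NaN entries."""
    return [e for e in energies if math.isfinite(e)]


if __name__ == "__main__":
    # Hypothetical calibrated energies, one of which is infinite.
    cells = [0.12, float("inf"), 0.47]
    print(finite_energies(cells))
```

Whether to drop such cells or to treat the whole event as suspect is an analysis choice that this sketch does not address.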

h2. S14-09-29 Data keep-up & MC calibration

FD & ND calibration files are currently being produced for the FD cosmic stream as well as the ND DD activity, DD cal mu (tri-cell) and cosmic streams. These samples are constantly topped up using cron jobs. These files were used to produce the v05 ND calibration used in the latest ND reconstruction files.

*Known issues*:

 * The ND data files have been reconstructed with an old ND geometry.

h2. S14-09-29 Keep-up reconstruction

Most of the caveats that applied to the S14-09-09 reco sample, detailed on the [[NOvA-SAM:LegacyDatasets]] page, still apply; they are repeated below for completeness. The main change in S14-09-29 is the addition of new information to facilitate basic neutrino searches: there is a new sel.containment branch in the CAFs, and the numu CosRej has been included for the FD stream.

Some additional notes:

 * The initial target for this processing is all of the data before the shutdown and after the end of the neutrino hunt. Unless problems are found, back-processing will extend the sample to before the neutrino hunt.
 * Reco keep-up will soon proceed in a more recent release and will include post-shutdown (October 2014) data.
 * The FD reco keep-up is *blinded*.
 * FD calibration constants are currently only available through diblock 7; averaged constants are used beyond that point.
 * These datasets are very large. Using SAM projects which include the entire dataset is discouraged. For assistance in breaking up the sample, refer to the [[Sam Web Cookbook]] or email nova_sam@fnal.gov.
 * Near detector reconstruction is currently lacking calibration, which will affect the output of any module that depends on it. The most notable example is FuzzyKVertex, but the MichelE filters could also be affected. Slicer, CosmicTrack and KalmanTrack produce output independent of calibration.
 * Near detector reconstruction is also currently lacking channel masks and data quality information.
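
One possible way to break a large definition into smaller pieces is sketched below. It assumes that the SAM dimension language accepts a @with stride N offset M@ clause after @defname:@; this is an assumption that should be checked against the cookbook before use, and the helper name is hypothetical.

```python
def strided_queries(definition, n_chunks):
    """Build n_chunks dimension strings that together cover `definition`.

    Each string is meant to select every n_chunks-th file, starting at a
    different offset. The "with stride ... offset ..." syntax is assumed,
    not confirmed by this page.
    """
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    return [
        f"defname: {definition} with stride {n_chunks} offset {offset}"
        for offset in range(n_chunks)
    ]


if __name__ == "__main__":
    for query in strided_queries("prod_reco_S14-09-29_fardet_numi_keepup", 4):
        print(query)
```

Each resulting string could then be used as the query for a smaller, separate SAM project rather than running over the full definition at once.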

The CAFs can be found here:

<pre>
/nova/prod/data/keepup/S14-09-29/numi/fd/000XXX/XXXYY/
/nova/prod/data/keepup/S14-09-09/numi/nd/000XXX/YYYYY/
</pre>

where XXX are the first three digits of the run number and YY are the last two.
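
The FD path pattern can be sketched in Python as follows. This assumes a five-digit run number, so that XXXYY is the full run number; the function name and the example run number are hypothetical, and only the FD pattern shown above is covered.

```python
def keepup_caf_dir(run, base="/nova/prod/data/keepup/S14-09-29/numi/fd"):
    """Return the keep-up CAF directory for a five-digit run number.

    Follows the 000XXX/XXXYY layout, where XXX is the first three digits
    of the run number and YY the last two.
    """
    run_str = str(run)
    if len(run_str) != 5 or not run_str.isdigit():
        raise ValueError("expected a five-digit run number")
    return f"{base}/000{run_str[:3]}/{run_str}/"


if __name__ == "__main__":
    # 17123 is a made-up run number used purely for illustration.
    print(keepup_caf_dir(17123))
```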