Official Datasets » History » Version 63

Matthew Tamsett, 11/26/2014 03:23 AM
Break FD MC tables with both RHC and FHC into two tables.

h1. Official Datasets

This page documents the most up-to-date datasets produced by the production group. Information on older legacy files is kept on the [[NOvA-SAM:LegacyDatasets]] page.

h2. Introduction

All modern NOvA data and MC reside within a data handling system called Sequential Access via Metadata (SAM). The files are stored on a tape-backed disk array called dCache. File locations are tracked in a database with a web interface, which is accessible through the @samweb@ command line client. The [[NOvA-SAM:SAM_web_cookbook]] provides an introduction to using the system. All of the datasets listed below are SAM "definitions": collections of logical queries on file metadata which together define a set of files within the SAM system.
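
As a rough illustration of how a definition name from the tables on this page is used, the Python sketch below assembles @samweb@ command lines for one dataset. It assumes that the @samweb@ client is installed and set up for NOvA, and that its @describe-definition@, @count-files@ and @list-files@ sub-commands and the @defname:@ dimension behave as described in the cookbook. The helper name @samweb_commands@ is invented for this example, and nothing is actually executed.

<pre><code class="python">
import shlex


def samweb_commands(definition):
    """Return samweb command lines (as argument lists) for one SAM definition."""
    # Dimension string selecting every file in the definition.
    dims = "defname: " + definition
    return [
        ["samweb", "describe-definition", definition],
        ["samweb", "count-files", dims],
        ["samweb", "list-files", dims],
    ]


if __name__ == "__main__":
    for cmd in samweb_commands("prod_reco_FA14-10-28_fd_numi_fa_goodruns"):
        # shlex.quote keeps the dimension string as a single shell argument.
        print(" ".join(shlex.quote(arg) for arg in cmd))
</code></pre>

Counting the files before listing them is a cheap way to check how large a definition is before doing anything heavier with it.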

For historical reasons, some of the data is also stored on bluearc. Notably, the most recent CAFs are almost always stored there. Small, selected samples of the most recent keep-up (see below) files are also available on bluearc in:

<pre>
/nova/prod/data/keepup/samples/
</pre>

Data and MC files are currently provided in two forms: datasets for the *first analysis* (*FA*) and *keep-up* datasets. First analysis datasets are designed to be used in the first analysis and as such are processed with stable, tested versions of the software, including the latest calibration and state-of-the-art reconstruction and PID algorithms. Keep-up datasets, on the other hand, contain data processed as it comes off the detectors, in as close to real time as possible. These datasets are designed to offer users an early look at data and, as such, may not always include the most up-to-date calibration or reconstruction. All keep-up datasets include "keepup" in their dataset names.
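
The naming rule above can be checked programmatically. The sketch below splits a dataset name into its data tier, its release tag (if any) and a keep-up flag. The layout @prod_<tier>_<release>_...@ and the release pattern are inferred from the names listed on this page rather than taken from an official specification, and the function name @describe_dataset@ is invented for this example.

<pre><code class="python">
import re

# Release tags in the tables on this page look like FA14-10-28 or S14-09-29.
RELEASE_RE = re.compile(r"^(FA|S)\d{2}-\d{2}-\d{2}$")


def describe_dataset(name):
    """Split a dataset name into tier, release tag (if any) and a keep-up flag."""
    parts = name.split("_")
    if len(parts) < 3 or parts[0] != "prod":
        raise ValueError("unexpected dataset name: " + name)
    # Some names (e.g. the FD artdaq datasets) carry no release tag.
    release = parts[2] if RELEASE_RE.match(parts[2]) else None
    return {
        "tier": parts[1],
        "release": release,
        "keepup": "keepup" in parts,
    }
</code></pre>

For example, @describe_dataset("prod_reco_S14-09-29_fardet_numi_keepup")@ reports the @reco@ tier, the @S14-09-29@ release and a keep-up flag of @True@.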

First analysis production proceeds in a stepwise manner, starting with simulation and progressing through reconstruction to PID. At each stage the files from the previous stage are processed to produce a final dataset of simulation, reconstruction or PID files. At the same time, a set of files for the next stage is produced for validation use. At the current time production has *mostly completed the simulation and reconstruction stages* and is producing *PID validation files*.

h1. Contents

{{toc}}

h1. Data

First-pass data processing for the first analysis is currently complete through the reconstruction stage, with PID validation files being produced now. The input to these samples is the set of official first-analysis run lists generated by Ryan Patterson and stored in this directory:

<pre>
/nova/app/users/rbpatter/runlists/
</pre>

h2. FD

| *Data tier* | *Cosmic trigger* | *NuMI trigger* |
| *artdaq* | prod_artdaq_fd_cosmic_fa_goodruns | prod_artdaq_fd_numi_fa_goodruns |
| *pclist* | prod_pclist_S14-09-29_fd_cosmic_keepup | n/a |
| *pcliststop* | prod_pcliststop_S14-09-29_fd_cosmic_keepup | n/a |
| *timecal* | prod_timecal_S14-09-29_fd_cosmic_keepup | n/a |
| *reco - for analysis* | prod_reco_FA14-10-28_fd_cosmic_fa_goodruns | prod_reco_FA14-10-28_fd_numi_fa_goodruns |
| *reco - keep up* | | prod_reco_S14-09-29_fardet_numi_keepup |
| *caf - keep up* | | prod_caf_S14-09-29_fardet_numi_keepup |

h2. ND

| *Data tier* | *Cosmic trigger* | *DD activity trigger* | *DD tricell trigger* | *NuMI trigger* |
| *artdaq* | | | | prod_artdaq_FA14-10-03_nd_numi_fullgain_preshutdown_goodruns |
| *pclist* | prod_pclist_S14-09-29_nd_cosmic_keepup | prod_pclist_S14-09-29_nd_DDActivity1_keepup | prod_pclist_S14-09-29_nd_DDCalMu_keepup | n/a |
| *pcliststop* | prod_pcliststop_S14-09-29_nd_cosmic_keepup | prod_pcliststop_S14-09-29_nd_DDActivity1_keepup | prod_pcliststop_S14-09-29_nd_DDCalMu_keepup | n/a |
| *timecal* | prod_timecal_S14-09-29_nd_cosmic_keepup | prod_timecal_S14-09-29_nd_DDActivity1_keepup | prod_timecal_S14-09-29_nd_DDCalMu_keepup | n/a |
| *reco - for analysis* | | | | prod_reco_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *reco - keep up* | | | | prod_reco_S14-09-29_neardet_numi_keepup |
| *pid - validation* | | | | prod_pid_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *caf - validation* | | | | prod_caf_FA14-11-11_nd_numi_fullgain_preshutdown_goodruns |
| *caf - keep up* | | | | prod_caf_S14-09-29_neardet_numi_keepup |

h1. Monte Carlo

As with data, first-pass MC processing for the first analysis is currently complete through the reconstruction stage, with PID validation files produced. The only exception is the RHC files, which are still being simulated.

To facilitate both first analysis studies and future sensitivity studies, two types of MC have been produced. The first type has real-detector-like configurations: these files are simulated with the run numbers of matching data runs (and hence their di-block and active-channel configurations) and a PoT weighting which replicates that of the first analysis data datasets. The second type has "ideal" 14-DB configurations and is intended for future sensitivity studies.

h2. FD & ND MC - Real detector-like conditions

| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap | prod_daq_FA14-10-03_fd_genie_fhc_nonswap | prod_daq_FA14-10-03_fd_genie_fhc_tau | prod_daq_FA14-10-03_fd_cry_all |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap | prod_reco_FA14-10-28_fd_genie_fhc_nonswap | prod_reco_FA14-10-28_fd_genie_fhc_tau | prod_reco_FA14-10-28_fd_cry_all |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap | prod_pid_FA14-10-28_fd_genie_fhc_nonswap | prod_pid_FA14-10-28_fd_genie_fhc_tau | prod_pid_FA14-10-28_fd_cry_all |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap | prod_caf_FA14-10-28_fd_genie_fhc_nonswap | prod_caf_FA14-10-28_fd_genie_fhc_tau | prod_caf_FA14-10-28_fd_cry_all |

| *NearDet* | *FHC nonswap* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_nd_genie_fhc_nonswap | prod_daq_FA14-10-03_nd_cry_all |
| *pclist* | n/a | prod_pclist_S14-09-29_nd_cry |
| *pcliststop* | n/a | prod_pcliststop_S14-09-29_nd_cry |
| *timecal* | n/a | prod_timecal_S14-09-29_nd_cry |
| *reco - validation* | prod_reco_FA14-11-11_nd_genie_nonswap_smallsample | |
| *pid - validation* | prod_pid_FA14-11-11_nd_genie_nonswap_smallsample | |
| *caf - validation* | prod_caf_FA14-11-11_nd_genie_nonswap_smallsample | |

h2. FD MC - Ideal conditions (14db)

| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_14db | prod_daq_FA14-10-03_fd_genie_fhc_nonswap_14db | prod_daq_FA14-10-03_fd_genie_fhc_tau_14db |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_reco_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_reco_FA14-10-28_fd_genie_fhc_tau_14db |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_pid_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_pid_FA14-10-28_fd_genie_fhc_tau_14db |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_14db | prod_caf_FA14-10-28_fd_genie_fhc_nonswap_14db | prod_caf_FA14-10-28_fd_genie_fhc_tau_14db |

| *FarDet continued* | *RHC swap* | *RHC nonswap* | *RHC tau* | *Cosmics* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_rhc_fluxswap_14db | prod_daq_FA14-10-03_fd_genie_rhc_nonswap_14db | prod_daq_FA14-10-03_fd_genie_rhc_tau_14db | |

h1. Supporting sample MC

In addition to the core samples discussed above, some dedicated samples have been produced to study particular effects.

h2. FD MC - Real detector-like conditions, geojittered

| *FarDet* | *FHC swap* | *FHC nonswap* | *FHC tau* |
| *artdaq* | prod_daq_FA14-10-03_fd_genie_fhc_fluxswap_geojittered | prod_daq_FA14-10-03_fd_genie_fhc_nonswap_geojittered | prod_daq_FA14-10-03_fd_genie_fhc_tau_geojittered |
| *reco* | prod_reco_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_reco_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_reco_FA14-10-28_fd_genie_fhc_tau_geojittered |
| *pid - validation* | prod_pid_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_pid_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_pid_FA14-10-28_fd_genie_fhc_tau_geojittered |
| *caf - validation* | prod_caf_FA14-10-28_fd_genie_fhc_fluxswap_geojittered | prod_caf_FA14-10-28_fd_genie_fhc_nonswap_geojittered | prod_caf_FA14-10-28_fd_genie_fhc_tau_geojittered |

h1. Processing notes

This section briefly summarises issues that users should be aware of when using the above datasets. Full details on the releases used can be found on the [[NOvA-ART:History_of_Tagged_Releases]] page.

h2. FA14-10-03 Data raw2root

This version of the software includes the ND geometry version used in the FA MC simulation.

*No known issues*.

h2. FA14-10-03 MC Simulation

This is the official simulation for the first analysis datasets.

*No known issues*.

h2. FA14-10-28 FD Reconstruction

This is the official reconstruction for the first analysis datasets. It was processed with v04 FD calibrations.

*No known issues*.

h2. FA14-10-28 FD PID and CAF

These are the PID and CAF validation samples. They are designed so that the physics groups can tune PIDs before the official first analysis production.

*No known issues*.

The CAFs can be found here:

<pre>
/nova/prod/mc/FA14-10-28/genie/fd/caf/
/nova/prod/mc/FA14-10-28/cry/fd/caf/
</pre>

The real-detector-configuration CAFs live in sub-directories numbered 000129-000170, and the "ideal" MC in sub-directory 010000. Note that the real-detector-configuration genie folders contain both the baseline MC and the geojittered MC. Anyone taking files from this location should therefore require that the CAF file name contains the string "genie_fhc" to study the baseline sample, or "geojittered" to study the geojittered sample.
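
The file-name selection described above could be sketched in Python as follows. The function name @select_cafs@ is invented for this example. One assumption goes beyond the text: judging from the dataset names, geojittered file names may also contain "genie_fhc", so the baseline selection here additionally excludes anything containing "geojittered". Check this against a real directory listing before relying on it.

<pre><code class="python">
def select_cafs(filenames, sample="baseline"):
    """Pick baseline or geojittered CAF files from a mixed directory listing."""
    if sample == "geojittered":
        return [f for f in filenames if "geojittered" in f]
    if sample == "baseline":
        # Assumption: geojittered names may also contain "genie_fhc",
        # so they are excluded explicitly from the baseline sample.
        return [f for f in filenames
                if "genie_fhc" in f and "geojittered" not in f]
    raise ValueError("sample must be 'baseline' or 'geojittered'")
</code></pre>

The input list would typically come from @os.listdir@ on one of the numbered sub-directories.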

h2. FA14-11-11 ND reconstruction, PID and CAF

This release includes the most modern (v05) ND calibrations and represents the official ND reconstruction for the first analysis. The PID and CAF samples are validation samples designed so that the physics groups can tune PIDs before the official first analysis production. At the moment only a subsample of the MC events (~7M of 34M) has been processed through reconstruction, PID and CAF. This sample will not be topped up because of the issue described below; however, it will be superseded in the near future.

*Known issues*:

 * There is a bug in the calibrated energy for muon catcher cells with unphysical W-values, whereby these cells receive infinite energies.

The CAFs can be found here:

<pre>
/nova/prod/mc/FA14-11-11/genie/nd/caf/
</pre>

h2. S14-09-29 Data keep-up & MC calibration

FD & ND calibration files are currently being produced for the FD cosmic stream as well as the ND DD activity, DD cal mu (tri-cell) and cosmic streams. These samples are constantly topped up using cron jobs. These files were used to produce the v05 ND calibration used in the latest ND reconstruction files.

*Known issues*:

 * The ND data files have been reconstructed with an old ND geometry.

h2. S14-09-29 Keep-up reconstruction

Most of the caveats that applied to the S14-09-09 reco sample, detailed on the [[NOvA-SAM:LegacyDatasets]] page, still apply and are repeated below for completeness. The big change in S14-09-29 is the addition of new information to facilitate basic neutrino searches: there is a new @sel.containment@ branch in the CAFs, and the numu CosRej has been included for the FD stream.

Some additional notes:

 * The initial target for this processing is all of the data taken before the shutdown and after the end of the neutrino hunt. Unless problems are found, back-processing will extend the sample to before the neutrino hunt.
 * Reco keep-up will move to a modern release in the near future and will then include post-shutdown (October 2014) data.
 * The FD reco keep-up is *blinded*.
 * FD calibration constants are currently only available through diblock 7; averaged constants are used beyond that point.
 * These datasets are very large. Using SAM projects which include the entire dataset is discouraged. For assistance in breaking up the sample, see the [[Sam Web Cookbook]] or email nova_sam@fnal.gov.
 * Near detector reconstruction is currently lacking calibration, which will affect the output of any module that depends on it. The most notable example is FuzzyKVertex, but the MichelE filters could also be affected. Slicer, CosmicTrack and KalmanTrack produce output independent of calibration.
 * Near detector reconstruction is also currently lacking channel masks and data quality information.  
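
One possible way to break a large keep-up dataset into smaller pieces, as recommended in the notes above, is to build several dimension strings that each select a slice of the definition. The Python sketch below only constructs the strings. It assumes that SAM dimensions accept a @with stride N offset M@ clause; please confirm this against the cookbook before use. The function name @split_definition@ is invented for this example.

<pre><code class="python">
def split_definition(definition, n_chunks):
    """Return dimension strings that split one definition into n_chunks slices."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    # Assumed SAM syntax: slice i takes every n_chunks-th file, starting at file i.
    return [
        "defname: {} with stride {} offset {}".format(definition, n_chunks, i)
        for i in range(n_chunks)
    ]
</code></pre>

Each returned string could then be passed to @samweb@ in place of the bare definition name, so that no single project has to cover the whole dataset.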

The CAFs can be found here:

<pre>
/nova/prod/data/keepup/S14-09-29/numi/fd/000XXX/XXXYY/
/nova/prod/data/keepup/S14-09-09/numi/nd/000XXX/YYYYY/
</pre>

where XXX are the first three digits of the run number and YY are the last two.
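
The directory rule for the FD path can be written out as a small Python helper. This is a sketch that assumes five-digit run numbers, as implied by the XXXYY pattern. It covers only the FD pattern, the function name @fd_keepup_caf_dir@ is invented for this example, and the run number used in the usage note is arbitrary.

<pre><code class="python">
def fd_keepup_caf_dir(run, release="S14-09-29"):
    """Build the FD NuMI keep-up CAF directory for a five-digit run number."""
    run_str = str(run)
    if len(run_str) != 5 or not run_str.isdigit():
        raise ValueError("this sketch assumes a five-digit run number")
    first_three = run_str[:3]  # XXX in the pattern above
    return "/nova/prod/data/keepup/{}/numi/fd/000{}/{}/".format(
        release, first_three, run_str)
</code></pre>

For instance, @fd_keepup_caf_dir(17891)@ gives @/nova/prod/data/keepup/S14-09-29/numi/fd/000178/17891/@.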