Official Datasets » History » Version 53
Version 53/81
Gavin Davies, 11/05/2014 03:06 PM
add ND Data/MC "nomask" sets
Official Datasets¶
The vast majority of NOvA's data and MC reside within a data handling system called Sequential Access via Metadata (SAM). The files are stored on a tape-backed disk array called dCache. File locations are tracked in a database with a web interface, accessible through the samweb command line client utility. The SAM_web_cookbook provides an introduction to using the system.
As a throwback to the old days, some of the data is still stored on bluearc. CAFs are a notable example: the most recent ones are almost always stored on bluearc. There are also small, selected samples of ART files available on bluearc in <some location, documented where?>
- Table of contents
- Official Datasets
- Monte Carlo
- Latest datasets for regular configurations
- GENIE
- CRY
- DATA
Monte Carlo¶
Latest datasets for regular configurations¶
FarDet | FHC swap | FHC nonswap | RHC swap | RHC nonswap | Cosmics |
artdaq | prod_artdaq_S14-03-25_fd_genie_fhc_swap | prod_artdaq_S14-03-25_fd_genie_fhc_nonswap | prod_artdaq_S14-03-25_fd_genie_rhc_swap | prod_artdaq_S14-03-25_fd_genie_rhc_nonswap | prodartdaq_S14-02-05_fdcry |
reco | prodreco_S14-03-25_FDGENIE_fhc_swap | prodreco_S14-03-25_FDGENIE_fhc_nonswap | prodreco_S14-03-25_FDGENIE_rhc_swap | prodreco_S14-03-25_FDGENIE_rhc_nonswap | prodreco_S14-03-25_FDCRY |
pid | prod_pid_S14-05-12_fd_genie_neutrino2014 |
caf | prod_caf_S14-07-11_fd_genie_neutrino2014 |
NearDet | FHC | RHC | Cosmics |
artdaq | prodartdaq_S14-05-12_ndgenie_fix | - | prodartdaq_S14-05-12_ndcry |
reco | prodreco_S14-08-01_ndgenie_fix | - | - |
pid | prodpid_S14-08-19_ndgenie_fix | - | - |
caf | prodcaf_S14-08-19_ndgenie_fix | - | - |
GENIE¶
ARTDAQ¶
FD¶
FHC SWAP/NONSWAP
prod_artdaq_S14-03-25_fd_genie_fhc_swap (709)
prod_artdaq_S14-03-25_fd_genie_fhc_nonswap (709)
RHC SWAP/NONSWAP
prod_artdaq_S14-03-25_fd_genie_rhc_swap (709)
prod_artdaq_S14-03-25_fd_genie_rhc_nonswap (709)
ND¶
Fixed version of overlay files
prodartdaq_S14-05-12_ndgenie_fix (4359)
prodartdaq_S14-05-12_ndgenie (20087)
Special¶
Nominal geometry
prod_artdaq_S14-08-19_fd_genie_fhc_nonswap (34)
prod_artdaq_S14-08-19_fd_genie_fhc_swap (34)
Jittered geometry
prod_artdaq_S14-08-19_fd_fhc_nonswap_geojittered (37)
prod_artdaq_S14-08-19_fd_fhc_swap_geojittered (39)
CRY+GENIE
prod_artdaq_S14-08-19_fd_fhc_nonswap_crygenie (510)
prod_artdaq_S14-08-19_fd_fhc_swap_crygenie (511)
RECO¶
Official production datasets containing all reconstructed files
FD¶
ALL FD GENIE RECO:
prodreco_S14-03-25_FDGENIE (2807 files)
FHC SWAP/NONSWAP:
prodreco_S14-03-25_FDGENIE_fhc_swap (703)
prodreco_S14-03-25_FDGENIE_fhc_nonswap (703)
RHC SWAP/NONSWAP
prodreco_S14-03-25_FDGENIE_rhc_swap (698)
prodreco_S14-03-25_FDGENIE_rhc_nonswap (703)
ND¶
ND (FHC) GENIE (rock+detector) OVERLAY RECO
prodreco_S14-08-01_ndgenie_fix (4295)
prodreco_S14-07-03_ndgenie (20057 and increasing)
Special¶
Nominal geometry
prod_reco_S14-08-01_fd_genie_fhc_nonswap (34)
prod_reco_S14-08-01_fd_genie_fhc_swap (34)
Jittered geometry
Simulated with jittered geometry; reco and later tiers use the standard geometry
prod_reco_S14-08-01_fd_fhc_nonswap_geojittered_v5 (37)
prod_reco_S14-08-01_fd_fhc_swap_geojittered_v5 (39)
All tiers with geojittered geometry
prod_reco_S14-08-01_fd_fhc_nonswap_geojittered_v1 (34)
prod_reco_S14-08-01_fd_fhc_swap_geojittered_v1 (34)
CRY+GENIE
prod_reco_S14-08-01_fd_fhc_nonswap_crygenie (510)
prod_reco_S14-08-01_fd_fhc_swap_crygenie (511)
NO MASK
These files have bad channel masks turned off.
prod_reco_S14-10-15_genie_nonswap_nomask
Note: these files carry the POT counting error that propagates from the generation stage, so the POT must be rescaled: divide totalpot by 2.
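The POT correction in the note above can be sketched as follows. This is only an illustration, assuming the summed totalpot of the sample has already been obtained by other means; the names `corrected_pot` and `POT_OVERCOUNT_FACTOR` are hypothetical and not part of any NOvA package.

```python
# Hypothetical helper: undo the factor-of-two POT overcount noted above.
POT_OVERCOUNT_FACTOR = 2.0  # from the note: divide totalpot by 2


def corrected_pot(totalpot):
    """Return the POT to use for normalisation, given the recorded totalpot."""
    if totalpot < 0:
        raise ValueError("totalpot must be non-negative")
    return totalpot / POT_OVERCOUNT_FACTOR
```

The same correction applies to the PID and CAF "no mask" MC sets, which carry the same note.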
PID¶
FD¶
prod_pid_S14-05-12_fd_genie_neutrino2014
ND¶
ND (FHC) GENIE (rock+detector) OVERLAY reco
prodpid_S14-08-19_ndgenie_fix (4072)
prodpid_S14-08-01_ndgenie (2715, will grow)
Special¶
Nominal
prod_pid_S14-08-19_fd_genie_fhc_nonswap (34)
prod_pid_S14-08-19_fd_genie_fhc_swap (34)
Jittered geometry
Simulated with jittered geometry; reco and later tiers use the standard geometry
prod_pid_S14-08-19_fd_fhc_nonswap_geojittered_v5 (37)
prod_pid_S14-08-19_fd_fhc_swap_geojittered_v5 (39)
All tiers with geojittered geometry
prod_pid_S14-08-19_fd_fhc_nonswap_geojittered_v1 (34)
prod_pid_S14-08-19_fd_fhc_swap_geojittered_v1 (34)
CRY+GENIE
prod_pid_S14-08-19_fd_fhc_nonswap_crygenie (486)
prod_pid_S14-08-19_fd_fhc_swap_crygenie (492)
NO MASK
These files have bad channel masks turned off.
prod_pid_S14-10-28_nd_genie_nonswap_nomask
Note: these files carry the POT counting error that propagates from the generation stage, so the POT must be rescaled: divide totalpot by 2.
The "nuecosrej" module is empty in the PID files; for this set it is filled at the CAF stage.
CAF¶
FD¶
prod_caf_S14-07-11_fd_genie_neutrino2014
ND¶
SAM: prodcaf_S14-08-19_ndgenie_fix (4072)
Bluearc: /nova/prod/mc/S14-08-19/genie/nd/caf (4072)
First pass post-pid: /nova/prod/mc/S14-08-15/genie/nd/caf/ (2441 files)
(find . -name "neardet*")
All failures were CAFMaker floating point exceptions.
Caveat: Two significant issues have been found in this sample in the last day or two. First, an old version of PhotonTransport was used in the overlay production, making the light levels low by a factor of two. Second, the rock overlay scheme re-used rock events at a much higher rate than intended. Both issues are fixed, and replacements for these ND MC CAFs should tentatively be ready in about a week. We'll keep you posted.
The current CAFs should be useful for code development and certain physics studies in the meantime.
Special¶
Nominal
prod_caf_S14-08-19_fd_genie_fhc_nonswap (34)
prod_caf_S14-08-19_fd_genie_fhc_swap (34)
Jittered geometry
Simulated with jittered geometry; reco and later tiers use the standard geometry
prod_caf_S14-08-19_fd_fhc_nonswap_geojittered_v5 (37)
prod_caf_S14-08-19_fd_fhc_swap_geojittered_v5 (39)
All tiers with geojittered geometry
prod_caf_S14-08-19_fd_fhc_nonswap_geojittered_v1 (32)
prod_caf_S14-08-19_fd_fhc_swap_geojittered_v1 (33)
CRY+GENIE
prod_caf_S14-08-19_fd_fhc_nonswap_crygenie (484)
prod_caf_S14-08-19_fd_fhc_swap_crygenie (490)
Bluearc:
/nova/prod/mc/S14-08-19/genie/fd/caf/
(select files whose names contain fhc_nonswap or fhc_swap, plus either crygenie or geojittered*v5)
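The file selection described for the Bluearc directory can be sketched as a name filter. This is a minimal sketch, assuming the tags appear literally in the file names; the helper names and the example file names in the usage note are hypothetical.

```python
from fnmatch import fnmatchcase

# Tags taken from the note above; how they are embedded in real file names
# is an assumption.
BEAM_TAGS = ("fhc_nonswap", "fhc_swap")
SAMPLE_PATTERNS = ("*crygenie*", "*geojittered*v5*")


def is_special_caf(filename):
    """True if the name carries a beam tag and one of the sample patterns."""
    has_beam = any(tag in filename for tag in BEAM_TAGS)
    has_sample = any(fnmatchcase(filename, pat) for pat in SAMPLE_PATTERNS)
    return has_beam and has_sample


def select_special_cafs(filenames):
    """Keep only the names that pass is_special_caf, preserving order."""
    return [name for name in filenames if is_special_caf(name)]
```

For example, `select_special_cafs(["a_fhc_swap_crygenie.root", "a_fhc_swap_geojittered_v1.root"])` keeps only the first name, since the second is a v1 geojittered file.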
NO MASK
These files have bad channel masks turned off.
/nova/ana/nu_e_ana/S14-10-28/nd_nomask/mc
Note: these files carry the POT counting error that propagates from the generation stage, so the POT must be rescaled: divide totalpot by 2.
These CAFs have "nuecosrej" filled.
CRY¶
ARTDAQ¶
ND CRY COSMICS:
prodartdaq_S14-05-12_ndcry
FD CRY COSMICS:
prodartdaq_S14-02-05_fdcry
RECO¶
FD CRY COSMICS:
prodreco_S14-03-25_FDCRY (10069)
CALIBRATION (pchitlists)¶
ND CRY COSMICS:
prodcalib_S14-05-12_ndcry_pclist
prodcalib_S14-05-12_ndcry_pcliststop
FD CRY COSMICS:
prodcalib_S14-03-05_FDCry_pclist
prodcalib_S14-03-05_FDCry_pcliststop
DATA¶
First-Analysis -- First Pass FD Processing (FA14-10-28)¶
In advance of the final calibration intended for first-analysis, a first pass with old calibration constants will be produced. The inputs to these samples are the official first-analysis run lists generated by Ryan Patterson and stored in this directory:
/nova/app/users/rbpatter/runlists/
Those lists have been adapted into very specific datasets.
Data Tier | Cosmic Trigger | NuMI Trigger |
artdaq | prod_artdaq_fd_cosmic_fa_goodruns | prod_artdaq_fd_numi_fa_goodruns |
reco | prod_reco_FA14-10-28_fd_cosmic_fa_goodruns | prod_reco_FA14-10-28_fd_numi_fa_goodruns |
S14-09-29 Keep-up Reconstruction¶
Most of the caveats for the S14-09-09 sample still apply and are repeated below for completeness. The big change in S14-09-29 is the addition of new information to facilitate basic neutrino searches: there is a new sel.containment branch in the CAFs, and the numu CosRej has been included for the FD stream.
Keep-up reconstruction provides reconstructed art root files and CAFs. For quick testing, the last three files from each of the datasets below are stored on bluearc and updated nightly. They can be found here:
/nova/prod/data/keepup/samples/
ND with MC Calibration¶
In lieu of official ND calibration, the old MC calibration can be applied. The output of any module which relies on calibration should be taken with a heavy grain of salt, but these files will contain non-zero values for calibrated energies rather than zeroes.
The database has now been populated with bad channel masks, which have been applied in this round of processing.
The following SAM datasets are available.
Reco art rootfiles: prod_reco_S14-09-29_neardet_numi_keepup_mccalib
CAF: prod_caf_S14-09-29_neardet_numi_keepup_mccalib
The CAFs can also be found on bluearc:
/nova/prod/data/keepup/S14-09-29/numi/nd/000XXX/XXXYY/
where XXX are the first three digits of the run number and YY are the last two. Note, the MC calibrated files have "mccalib" in their name. The standard ones do not.
As of October 16, there is an FTS issue keeping these files from being transferred to bluearc. The CD experts have been notified and are working on a solution.
ND (no calibration)¶
Near detector reconstruction currently lacks calibration, which will affect the output of any module that depends on it. The most notable example is FuzzyKVertex, but the MichelE filters could also be affected. Slicer, CosmicTrack and KalmanTrack produce output independent of calibration.
Near detector reconstruction also currently lacks channel masks and data quality information.
The following SAM datasets are available.
Reco art rootfiles: prod_reco_S14-09-29_neardet_numi_keepup
CAF: prod_caf_S14-09-29_neardet_numi_keepup
The CAFs can also be found on bluearc:
/nova/prod/data/keepup/S14-09-29/numi/nd/000XXX/XXXYY/
where XXX are the first three digits of the run number and YY are the last two. Note, the MC calibrated files mentioned above have "mccalib" in their name. The standard ones do not.
FD¶
The initial target for this processing is all of the data before the shutdown and after the end of the neutrino hunt. Unless problems are found, back-processing will extend the sample to before the neutrino hunt.
FD calibration constants are currently only available through diblock 7, but averaged constants are used beyond that point.
NB: The datasets listed below are very large. Running SAM projects over an entire dataset is discouraged. For assistance in breaking up the sample, see the SAM_web_cookbook or email nova_sam@fnal.gov.
The following SAM datasets are available:
Reco art rootfiles: prod_reco_S14-09-29_fardet_numi_keepup
CAF, blinded information removed: prod_caf_S14-09-29_fardet_numi_keepup
The CAFs can be found here:
/nova/prod/data/keepup/S14-09-29/numi/fd/000XXX/XXXYY/
where XXX are the first three digits of the run number and YY are the last two.
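The directory convention used for the keep-up CAFs in this section can be sketched as a small path builder. This is an illustration only, assuming five-digit run numbers (as for runs such as 10377 or 14702 quoted elsewhere on this page); the function name `keepup_caf_dir` and its arguments are hypothetical.

```python
def keepup_caf_dir(release, detector, run, base="/nova/prod/data/keepup"):
    """Build <base>/<release>/numi/<detector>/000XXX/XXXYY/ for a run number.

    XXX is the first three digits of the run number and YY the last two,
    so XXXYY is the full run number when the run number has five digits.
    """
    run_str = str(run)
    if len(run_str) != 5 or not run_str.isdigit():
        raise ValueError("this sketch assumes a five-digit run number")
    prefix = run_str[:3]  # XXX
    return f"{base}/{release}/numi/{detector}/000{prefix}/{run_str}/"
```

For instance, `keepup_caf_dir("S14-09-29", "fd", 14702)` gives `/nova/prod/data/keepup/S14-09-29/numi/fd/000147/14702/`.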
S14-09-09 Keep-up Reconstruction¶
Keep-up reconstruction provides reconstructed art root files and CAFs. For quick testing, the last three files from each of the datasets below are stored on bluearc and updated nightly. They can be found here:
/nova/prod/data/keepup/samples/
ND¶
Near detector reconstruction currently lacks calibration, which will affect the output of any module that depends on it. The most notable example is FuzzyKVertex, but the MichelE filters could also be affected. Slicer, CosmicTrack and KalmanTrack produce output independent of calibration.
Near detector reconstruction also currently lacks channel masks and data quality information.
The following SAM datasets are available.
Reco art rootfiles: prod_reco_keepup_S14-09-09_neardet_numi_preshutdown_fullgain_cold
CAF: prod_caf_keepup_S14-09-09_neardet_numi_preshutdown_fullgain_cold
The CAFs can also be found on bluearc:
/nova/prod/data/keepup/S14-09-09/numi/nd/000XXX/YYYYY/
where XXX are the first three digits of the run number and YYYYY is the complete run number.
FD¶
Far detector keep-up processing is still underway. The goal is to process all of the data taken after the end of the neutrino hunt but before the shutdown. Until that is complete, these datasets will grow.
FD calibration constants are currently only available through diblock 7, but averaged constants are used beyond that point.
The following SAM datasets are available:
Reco art rootfiles: prod_reco_keepup_S14-09-09_fardet_numi_preshutdown_posthunt
CAF, blinded information removed: prod_caf_keepup_S14-09-09_fardet_numi_preshutdown_posthunt
The CAFs can be found here:
/nova/prod/data/keepup/S14-09-09/numi/fd/000XXX/YYYYY/
where XXX are the first three digits of the run number and YYYYY is the complete run number.
Special¶
No Mask ND¶
ND datasets with bad channel masks turned off. The included runs are from the full-gain, pre-shutdown era (runs 10377 to 10407).
RECO: prod_reco_S14-10-15_neardet_numi_fullgain_cold_preshutdown_mccalib-nomask
PID: prod_pid_S14-10-28_neardet_numi_fullgain_cold_preshutdown_mccalib-nomask
CAF: /nova/ana/nu_e_ana/S14-10-28/nd_nomask/data (Bluearc location only)
Only the CAF files contain filled "nuecosrej" variables.
FD Cosmic Trigger, Summer 2014, Post-Neutrino¶
Datasets¶
The datasets for these files can be found in SAM. For PID, there is also a small sample living on Bluearc. For CAFs, the entire sample lives on Bluearc. All of the official datasets specify the run range for the 4db and 7db configurations based on the Stable Partition Milestones wiki page. The datasets also require subruns (files) to pass the data quality cuts.
The SAM datasets and Bluearc locations are as follows:
PID - 4db:
prod_pid_S14-05-12_fardet_cosmics_p1_4db_good
/nova/prod/data/S14-05-12/fd/pid/ (Run numbers less than 14702)
PID - 7db
prod_pid_S14-05-12_fardet_cosmics_p1_7db_good
/nova/prod/data/S14-05-12/fd/pid/ (Run numbers greater than or equal to 14702)
CAF - 4db:
prod_caf_S14-07-11_fardet_cosmics_p1_4db_good
/nova/prod/data/S14-07-11/cosmic/fd/caf/ (Run numbers less than 14702, excluding runs 14563-14594 which were 9 diblock stress tests.)
CAF - 7db
prod_caf_S14-07-11_fardet_cosmics_p1_7db_good
/nova/prod/data/S14-07-11/cosmic/fd/caf/ (Run numbers greater than or equal to 14702)
The underlying reco and pidpart (sans LEM) datasets exist only in SAM:
pidpart - 4db:
prod_pidpart_S14-05-12_fardet_cosmics_p1_4db_good
pidpart - 7db:
prod_pidpart_S14-05-08_fardet_cosmics_p1_7db_good
reco - 4db:
prod_reco_S14-05-08_fardet_cosmics_p1_4db_good
reco - 7db:
prod_reco_S14-05-08_fardet_cosmics_p1_7db_good
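The run-number boundaries quoted for the Bluearc locations above can be sketched as a simple classifier. This is a rough sketch only: the function name and the `tier` argument are hypothetical, the stress-test exclusion is applied only to CAFs because that is the only place it is stated above, and the data quality cuts required by the official datasets are not modelled.

```python
SEVEN_DB_FIRST_RUN = 14702                 # runs >= 14702 are listed as 7db
STRESS_TEST_RUNS = range(14563, 14595)     # 14563-14594 inclusive (9 diblock stress tests)


def fd_cosmic_config(run, tier="pid"):
    """Return "4db", "7db", or None (excluded) for an FD cosmic run number."""
    if run >= SEVEN_DB_FIRST_RUN:
        return "7db"
    if tier == "caf" and run in STRESS_TEST_RUNS:
        return None  # excluded from the 4db CAF sample
    return "4db"
```

A run below the start of the 4db era would still be labelled "4db" here, since no lower bound is quoted on this page; the Stable Partition Milestones wiki page remains the reference for the actual ranges.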
Comments on Livetime¶
In terms of livetime, the 4db sample is much larger than the 7db sample. The 4db PID sample is roughly ten times larger than that used for Neutrino 2014. The 4db CAF sample is currently twice the size of the Neutrino 2014 sample and still growing. The CAFMaking process has been tedious due to CAFMaker crashes*, and as a result the CAF sample will never be as large as the PID sample. In terms of livetime, the 7db sample is roughly the same size as the Neutrino 2014 sample; however, that detector configuration has 75 percent more detector mass. The same caveat applies to the size of the 7db CAF sample.
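The "75 percent more detector mass" figure follows from the diblock counts, as the short sketch below shows. It assumes that detector mass scales linearly with the number of diblocks and that the comparison is against a four-diblock configuration; the function name is hypothetical.

```python
def relative_mass(diblocks, reference_diblocks=4):
    """Detector mass relative to a reference configuration, assuming equal mass per diblock."""
    if diblocks <= 0 or reference_diblocks <= 0:
        raise ValueError("diblock counts must be positive")
    return diblocks / reference_diblocks


# 7 diblocks versus 4 diblocks: 7 / 4 = 1.75, i.e. 75 percent more mass.
seven_db_vs_four_db = relative_mass(7)
```

Under the same assumption, equal livetime in the 7db configuration corresponds to 1.75 times the mass-weighted exposure of a 4db sample.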
Processing history¶
The processing history for these files is as follows:
Reco: S14-05-08 - Notably, this tag resolves the KalmanTrack association bug
PID: S14-05-12 - Loose nue preselection
CAF: S14-07-11 - Uses LiveGeometry, but gets the CAF metadata right.