Version 17 (Sijith Edayath, 07/01/2018 11:58 PM) → Version 18/22 (Sijith Edayath, 07/02/2018 01:51 AM)

h1. Reproducing the nus 2018 Analysis


*Work in progress* as of April 16th 2018

First, familiarise yourself with the "Executive Summary" and the associated technical notes (linked within EC doc-db) for the Nus18 analysis.

h2. Set up an appropriate release

Set up the summer 2018 analysis tag (when it is available). All the relevant macros live in @CAFAna/nus/Nus18/@.
setup_nova -b maxopt #development currently, tag soon

h2. BDT training per data-taking period

The ROOT-based TMVA (multivariate analysis) toolkit is used to develop the PID (particle identifier) for cosmic rejection in the Nus18 analysis. Because the FD sits at the surface, the cosmic-ray rate is about 145 kHz, and a fraction of these cosmic events mimic NC events. Properties of the reconstructed shower are used to train the classifier, which here is a BDT (Boosted Decision Tree). Twelve variables, each corresponding to one reconstructed shower property, are used as training inputs. The BDT is trained separately for FHC and RHC, and the FHC training is further split into three samples: @1) Period1 (coarse timing, low gain), 2) Period2 (low gain and fine timing) and 3) Period3 & 5 (high gain and fine timing)@. In total there are four samples, and for the Nus18 analysis a BDT was trained on each of them separately.
TMVA is trained on a subset of the dataset, drawn from both Monte Carlo (MC) files and cosmic data files (1/3 of the total dataset is used for training).
Using @CAFAna/nus/Nus18/BDTTrain18/ProducingTree.C@,
a ROOT tree (named "training.root") is produced, with the variables to be used for training as its leaves.
The tree should contain two branches: one for the NC signal events, extracted from the MC dataset using a truth condition to select true NC events, and one for the cosmic events, built from the cosmic data files.
To obtain the most NC-like cosmic events, i.e. those that pass all NC selection criteria, the tree is filled only with events that pass the following conditions. Details of the selection summarised below can be found in @CAFAna/Cuts/NusCuts18.h@.
1. Cosmic veto, to remove obvious cosmics, and the second analysis diblock mask.
2. Event quality cut, which removes fuzzy events and issues due to FEB flashes etc.
3. Containment cut, which makes sure that the event is fully contained in the detector.
4. Fiducial cut, which helps to select only well-reconstructed events.
5. Selection of events with more than 25 hits and a classic CVN NCID score > 0.2.
6. Cosmic rejection cuts (see the cosmic rejection section of the technical note).
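
The order of the six steps above can be sketched as a single predicate. This is only an illustration of the cut flow: the @EventSummary@ struct and its field names are hypothetical, and the real selection is built from the CAFAna cuts in @CAFAna/Cuts/NusCuts18.h@, not from this code. Only the two thresholds (25 hits, CVN score 0.2) are taken from the list.

```cpp
#include <cassert>

// Hypothetical, simplified event record used only for this sketch.
// The actual analysis applies CAFAna cuts, not this struct.
struct EventSummary {
  bool passesCosmicVeto;       // step 1: cosmic veto and diblock mask
  bool passesEventQuality;     // step 2: event quality
  bool isContained;            // step 3: containment
  bool inFiducialVolume;       // step 4: fiducial
  int nHits;                   // step 5: number of hits
  double cvnNCScore;           // step 5: classic CVN NCID score
  bool passesCosmicRejection;  // step 6: cosmic rejection cuts
};

// Thresholds quoted in step 5 of the list.
constexpr int kMinHits = 25;
constexpr double kMinCVNScore = 0.2;

// Returns true only if the event survives all six steps, checked in order.
bool PassesTrainingSelection(const EventSummary& e) {
  assert(e.nHits >= 0);  // sanity check: a hit count cannot be negative
  if (!e.passesCosmicVeto) return false;
  if (!e.passesEventQuality) return false;
  if (!e.isContained) return false;
  if (!e.inFiducialVolume) return false;
  // Both comparisons are strict, matching "> 25" and "> 0.2".
  if (e.nHits <= kMinHits || e.cvnNCScore <= kMinCVNScore) return false;
  return e.passesCosmicRejection;
}
```

Ordering the cheap boolean checks first means an event that fails an early step is rejected without evaluating the later ones.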

Unlike the Nus17 analysis, the training event energy is not restricted to any particular region of the spectrum (in Nus17, only events between 0.5 GeV and 4 GeV were used).
As a second step, the TMVA network is trained using the macro @CAFAna/nus/Nus18/BDTTrain18/TrainingBDT.C@
with the ROOT tree "training.root",
which holds the separated NC signal and cosmic background events. The TMVA method used for this analysis is BDTA; the weight files are saved as a UPS product in @/nova/data/pidlibs/products/ncid/v01.03/NULL//lib@.
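
As a rough illustration of reserving one third of the dataset for training, the sketch below splits entry indices into a training set and a remainder. This page does not say how the one-third subset is actually chosen, so taking every third entry is purely an assumption, and @SplitOneThird@ and @SplitIndices@ are hypothetical names that do not appear in the macros.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Indices of entries assigned to training and to everything else.
struct SplitIndices {
  std::vector<std::size_t> training;
  std::vector<std::size_t> remainder;
};

// Assumption: every third entry (0, 3, 6, ...) goes to training.
// The real macros may select the subset differently.
SplitIndices SplitOneThird(std::size_t nEntries) {
  SplitIndices out;
  for (std::size_t i = 0; i < nEntries; ++i) {
    if (i % 3 == 0) {
      out.training.push_back(i);
    } else {
      out.remainder.push_back(i);
    }
  }
  // Every entry lands in exactly one of the two sets.
  assert(out.training.size() + out.remainder.size() == nEntries);
  return out;
}
```

Keeping the two index sets disjoint ensures that events used for training are not reused when the trained BDT is evaluated.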

Different aspects of the training and of the separating variables, such as the correlations between variables and the over-training check, can be inspected with the macro @CAFAna/nus/Nus18/TMVA/TMVAGui.C@ (correlation and over-training checks for each of the 4 samples).
The weight files saved in @/nova/data/pidlibs/products/ncid/v01.03/NULL//lib@ correspond to the individual data-taking periods. In @CAFAna/Vars/NusTempVars.h@, four variables are defined. In @CAFAna/Cuts/NusCuts18.h@, a variable kNCCosRej18Alt is defined and is coded to select the correct BDTA weight for the event based on its run.
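
The choice among the four training samples can be pictured as a small lookup. This sketch takes the beam mode and period as inputs, because the mapping from run number to period is not given on this page. The enum and function names are hypothetical and are not the ones used in @CAFAna/Cuts/NusCuts18.h@.

```cpp
#include <cassert>

enum class BeamMode { kFHC, kRHC };

// Hypothetical labels for the four BDT training samples described above.
enum class TrainingSample {
  kFHCPeriod1,      // coarse timing, low gain
  kFHCPeriod2,      // low gain, fine timing
  kFHCPeriod3And5,  // high gain, fine timing
  kRHC,             // single RHC sample
  kUnassigned       // period not covered by the list on this page
};

// Picks the training sample (and hence the weight file) for an event.
TrainingSample SampleFor(BeamMode mode, int period) {
  assert(period >= 1);  // periods are numbered from 1
  if (mode == BeamMode::kRHC) return TrainingSample::kRHC;
  switch (period) {
    case 1: return TrainingSample::kFHCPeriod1;
    case 2: return TrainingSample::kFHCPeriod2;
    case 3:
    case 5: return TrainingSample::kFHCPeriod3And5;
    default: return TrainingSample::kUnassigned;
  }
}
```

Returning an explicit @kUnassigned@ value, rather than silently falling back to one of the samples, makes it obvious when an event comes from a period that this page does not describe.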

h2. Correcting Calorimetric energy

h2. Producing ND spectra

h2. Producing FD prediction

h2. Producing FD cosmic background spectra

For this we calculate the cosmic background from the cosmic data, period by period, and compare it to the out-of-time NuMI cosmic background.

cafe -bq CAFAna/nus/Nus17/ComputeNus17CosBkgd.C

which creates individual spectra, by period. This output is then fed to:
cafe -bq CAFAna/nus/Nus17/PrintNus17CosBkgd.C

which builds the final cosmic background "prediction" from data and prints the number of events.

h2. Producing FD data spectra

cafe -bq CAFAna/nus/Nus18/nus18_box_opening.C

This creates energy and time spectra for the selected events when run on the unblinded FD dataset.
It also writes text files listing the information needed to create event displays or to interrogate individual selected events for more details on variable values.

h2. Producing covariance matrix

h2. Producing 1D and 2D contours

h2. Comparison with Nus17

h2. Producing Feldman-Cousins corrected contours

Dung Phan will add this section