
Reproducing the nue 2018 Analysis results

Standard release is R18-06-11-2018ana.a

This is an art2 release and contains the final cross-section weights. The branch is R18-06-11-2018ana-br.

Selection cuts, binning, and axes can be found in CAFAna/Cuts/NueCuts2018.h

The nue reconstructed energy is kNueEnergy2018, trained separately for RHC and FHC; its definition can be found in CAFAna/Vars/NueEnergy2018.h

Ana2018 is a joint FHC and RHC analysis. To reduce analysis complexity, we defined a cut, kIsRHC, so that we can combine neutrino and antineutrino cuts/vars everywhere by looking at the spill information in the CAF files.

The central value cross-section weight for this analysis is kXSecCVWgt2018
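
Putting these pieces together, here is a minimal sketch of filling a spectrum with the 2018 energy estimator, a combined FHC/RHC selection, and the CV cross-section weight. The cut names kNue2018FHC/kNue2018RHC, the dataset name, and the function name are illustrative; check NueCuts2018.h for the real selection cuts.

#include "CAFAna/Core/Spectrum.h"
#include "CAFAna/Core/SpectrumLoader.h"
#include "CAFAna/Core/SystShifts.h"
#include "CAFAna/Cuts/NueCuts2018.h"
#include "CAFAna/Vars/NueEnergy2018.h"

using namespace ana;

void demo_nue2018_spectrum()
{
  SpectrumLoader loader("your_nue_2018_concat_dataset"); // illustrative dataset name

  // FHC selection on neutrino-beam spills, RHC selection on
  // antineutrino-beam spills, combined via kIsRHC.
  const Cut kSel = (!kIsRHC && kNue2018FHC) || (kIsRHC && kNue2018RHC);

  // Reconstructed energy of selected events, weighted by the CV xsec weight.
  Spectrum sEnergy("Reconstructed energy (GeV)", Binning::Simple(20, 0, 5),
                   loader, kNueEnergy2018, kSel, kNoShift, kXSecCVWgt2018);

  loader.Go();
}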

Documentation

An event linking to the executive summary and all related technotes is here.

For reference, a similar event exists for numu here

Setup

Set up the release and check out CAFAna:

setup_nova -b maxopt -r R18-06-11-2018ana.a
newrel -t  R18-06-11-2018ana.a rel_2018ana
cd rel_2018ana/
srt_setup -a
addpkg_svn -b CAFAna R18-06-11-2018ana-br 

More about releases and release branches.

Files: concat files of MC, Data, File-systematics MC

Find all concatenated decaf nue 2018 files here: Nue 2018 concat files

Selection and Binning Tuning

ND Data/MC comparisons

The macro that fills spectra from the official data sets is

CAFAna/nue/Ana2018/nue_data_mc_validation.C

To create your own, run on the grid with:
submit_cafana.py -n 100 -ss -r S18-04-07 -t /path/to/S18-04-07/testrel -o /pnfs/nova/scratch/users/your-name/path/to/output/directory /path/to/CAFAna/nue/Ana2018/nue_data_mc_validation.C {fhc,rhc} {fhc,rhc}_datamc.root true

Then hadd the results together.
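For example (output path as in the submission command above; the exact grid output file names may vary):
hadd fhc_datamc.root /pnfs/nova/scratch/users/your-name/path/to/output/directory/*.root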
You'll need to format these histograms to prepare them for plotting. This is done with the same macro:
cafe -bq -nr /path/to/CAFAna/nue/Ana2018/nue_data_mc_validation.C {fhc,rhc} {fhc,rhc}_datamc.root false

You should now have three ROOT files: one for data, one for MC without decomp, and one for MC with decomp. These ROOT files are the input to the plotting script, CAFAna/nue/Ana2018/plot_validation_datamc.py
mkdir {fhc,rhc}_plots
python plot_validation_datamc_2018.py -b {fhc,rhc} {fhc,rhc}_datamc_data_validation.root {fhc,rhc}_datamc_mc_validation.root {fhc,rhc}_datamc_mc_decomp_validation.root

This will give you all of the Data/MC comparison plots other than the ND Energy/PID spectrum. Because the Energy/PID plot has fancy axes, it had to be moved to a ROOT macro (CAFAna/nue/Ana2018/plot_nd_spectra.C), which is run with:
cafe -bq -nr /path/to/CAFAna/nue/Ana2018/plot_nd_spectra.C {fhc,rhc}

The input files here are hard-coded into the macro. Currently it points to the official Data/MC files, but you can change it to point to your own.

Decomposition and Extrapolation

In ana2018, FHC used the combo (BEN + Michel) decomposition and RHC used the proportional decomposition. The signal decomposition is the same as the numu decomposition, as before.
The prediction generators are NueComboExtrapGenerator for FHC and NuePropExtrapRHCGenerator for RHC; both can be found in CAFAna/Prediction/PredictionGenerator.h
NB: A NoExtrapGenerator is also always needed, because we need the non-extrapolated sample to correct our Peripheral sample prediction.

For the details on how to use PredictionGenerator, see Sec. Systematics

FD Rock Prediction

A non-negligible number of rock neutrino events enter the detector and are sliced. This is accounted for by making a PredictionNoExtrap from special FD rock overlay files from production. These datasets are

  • FHC
    prod_caf_R17-11-14-prod4reco.g_fd_genie_nonswap_fhc_nova_v08_full_v2
    prod_caf_R17-11-14-prod4reco.g_fd_genie_fluxswap_fhc_nova_v08_full_v2
  • RHC
    prod_caf_R17-11-14-prod4reco.g_fd_genie_nonswap_rhc_nova_v08_full_v3
    prod_caf_R17-11-14-prod4reco.g_fd_genie_fluxswap_rhc_nova_v08_full_v3

Yet another step is involved. To dramatically reduce the burden on production, only those spills whose neutrino interaction left some light in the detector are reconstructed and CAF'd.

We have to determine the duty factor of CAF'd spills. This duty factor can and does change between FHC/RHC and swap/nonswap. The number of simulated spills is taken from the production website: 10k spills are generated for each analysis file. A CAFAna job must then be run over each of these datasets to count the spills that were CAF'd. To do this, we make a spill histogram of any variable for each loader, counting the spills that made it into the CAFs. The duty factor for each dataset is then the ratio of these two total spill counts, as sketched below.
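
A minimal sketch of counting the CAF'd spills for one of these datasets (assuming the 2018-era SpillVar/Spectrum interfaces; the unit spill var and the file count are illustrative):

#include <iostream>

#include "CAFAna/Core/Binning.h"
#include "CAFAna/Core/Spectrum.h"
#include "CAFAna/Core/SpectrumLoader.h"
#include "StandardRecord/SRSpill.h"

using namespace ana;

void count_cafd_spills()
{
  SpectrumLoader loader("prod_caf_R17-11-14-prod4reco.g_fd_genie_nonswap_fhc_nova_v08_full_v2");

  // Any spill variable works; we only need one entry per CAF'd spill.
  const SpillVar kUnitSpillVar([](const caf::SRSpill*){ return 0.5; });
  Spectrum sSpills("spills", Binning::Simple(1, 0, 1), loader, kUnitSpillVar, kNoSpillCut);

  loader.Go();

  const double nCafd = sSpills.ToTH1(sSpills.POT())->Integral();
  const double nFiles = 100;                  // illustrative; take the real count from the production website
  const double nSimulated = 10000. * nFiles;  // 10k spills generated per analysis file

  std::cout << "duty factor: " << nCafd / nSimulated << std::endl;
}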

Systematics

Systematics can only be registered once; the details of how much each systematic shifts can be found here

All the systematics are defined in CAFAna/Systs
You can also find the list of systematic uncertainties used in ana2018 in the systematics technotes or here

TLDR: we make predictions for each systematic uncertainty. File systematics require alternative MC files, while direct systematics use weights or ratios defined in CAFAna or stored in the CAF files. Oscillation fitting requires interpolated systematic ratios as input; the PredictionInterp class and its derived classes take care of this.
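
As an illustration, here is a minimal sketch of building an interpolated prediction, assuming the 2018-era CAFAna API. The systematics list, loader configuration, and cut name are placeholders, and for brevity a NoExtrapGenerator with an assumed constructor signature stands in for the real Nue*ExtrapGenerators.

#include "CAFAna/Analysis/Calcs.h"
#include "CAFAna/Core/HistAxis.h"
#include "CAFAna/Core/Loaders.h"
#include "CAFAna/Prediction/PredictionGenerator.h"
#include "CAFAna/Prediction/PredictionInterp.h"
#include "CAFAna/Vars/NueEnergy2018.h"

using namespace ana;

void demo_prediction_interp()
{
  // Systematics to interpolate over; in practice take these from
  // CAFAna/Analysis/JointAna2018Systs.h.
  std::vector<const ISyst*> systs = {/* &kSomeSyst, ... */};

  osc::IOscCalculatorAdjustable* calc = DefaultOscCalc();

  Loaders loaders; // configure with the nue 2018 dataset definitions

  const HistAxis axis("Reconstructed energy (GeV)",
                      Binning::Simple(20, 0, 5), kNueEnergy2018);
  NoExtrapGenerator gen(axis, kNue2018FHC /* illustrative cut name */);

  // One mock-up prediction is filled per systematic shift, then
  // interpolated between when a shifted prediction is requested.
  PredictionInterp pred(systs, calc, gen, loaders);
  loaders.Go();

  // FD spectrum with the first systematic shifted by +1 sigma.
  SystShifts shift(systs[0], +1);
  const Spectrum s = pred.PredictSyst(calc, shift);
}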

Creating the systematically shifted predictions

Code lives in CAFAna/shared/Ana2018. "Shared" since the process is essentially identical for nue and numu. Please see the guide on making the systematically shifted predictions for further details.

Please see the following for the list of locations of the latest prediction files.

Using the systematically shifted predictions

Systematically shifted predictions are organized and accessed through the PredictionSysts objects; see CAFAna/Prediction/PredictionSystJoint2018.h

A series of functions that return lists of the systematics to be used in the analysis is in CAFAna/Analysis/JointAna2018Systs.h

Specific examples of usage can be found in CAFAna/nue/Ana2018 (e.g. joint_fit_2018_tools.h and some of the fit macros)
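
For example, a saved prediction can be read back with the standard CAFAna LoadFrom pattern (the file path and directory name here are illustrative; GetNuePredPath in joint_fit_2018_tools.h points to the official files):

#include <memory>

#include "TFile.h"

#include "CAFAna/Core/LoadFromFile.h"
#include "CAFAna/Prediction/IPrediction.h"

using namespace ana;

void demo_load_prediction()
{
  TFile f("pred_nue_fhc.root"); // illustrative path
  std::unique_ptr<IPrediction> pred =
      LoadFrom<IPrediction>(f.GetDirectory("pred_nue_fhc"));
}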

Sensitivity

This header file includes all the tools for oscillation fit studies: CAFAna/nue/Ana2018/joint_fit_2018_tools.h

Please always use real ND data for fit or sensitivity studies. GetNuePredPath in CAFAna/nue/Ana2018/joint_fit_2018_tools.h shows you where all the final official predictions are. isFakeData should be false if you want to use real ND data.

Sensitivity plots in the sensitivity technote are made by CAFAna/nue/Ana2018/FitandFC/sensitivity2018.C

The macro contains instructions on what the input arguments mean. I'll only highlight one thing here for the first-time user: always run with MakeFile=true first; this generates a ROOT file containing your slices. Then run with MakeFile=false to get your slices plotted.

Future sensitivity plots, i.e. the blessed sensitivity plots, are made by macros in CAFAna/nue/Ana2018/reach

Time projection sensitivity is made by this macro: reach/reach_2018_dCPfractions.C

Future sensitivity slice plots are made by reach/jointsensitivity.C. The supporting header file is saved in /nova/ana/users/syu/Util because it has some conflicts with the official header file. The FitVar is extra-constrained in these plots.

Fit

We performed a joint fit in ana2018: nue + numu (4 quantiles) for RHC+FHC.
Contours are made by FitandFC/joint_fit_2018_contours.C
Slices are made by FitandFC/joint_fit_2018_slices.C

Joint Fit scripts

The result plots we usually show are contours (2D information) and slices (1D information). To get these plots you'll need to run:


CAFAna/nue/Ana2018/FitandFC/joint_fit_2018_contours.C

CAFAna/nue/Ana2018/FitandFC/joint_fit_2018_slices.C


Each script is used for both fitting and plotting the fit results. The arguments for the scripts are as follows:

Contours:

std::string decomp = "combo" (the decomposition method for your analysis; the possible variants are "prop", "combo", "nxp")
bool createFile = false (true: run the fit and write the result to a file; false: plotting only, for when you already have fit results and want a pretty plot of them)
bool corrSysts = true (true: fit with systematics; false: statistics-only fit)
TString options = "joint_realData_both_onlyIH" (what kind of fit you want; possible options: "joint", "numuOnly", "nueOnly"; "realData" or "fake2017" data; "FHCOnly", "RHCOnly" or "both"; "onlyNH" or "onlyIH")
bool dmsqSurf = false (true: th23-dmsq32 contour)
bool th13Surf = false (true: th13-delta contour)
If dmsqSurf = false && th13Surf = false, you get the delta-th23 contour.
bool fccorr = false (with or without FC corrections)
bool zoomIn = true (zoom some contours; changes the axis ranges for several plots)
bool fillContour = false (fill the contours or draw just the lines)
bool smooth = true (smooth the corrections or not)
bool fc_overlay = false (debug option for comparing contours with and without FC)
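
For example, to run a statistics-only joint fit that writes the default delta-th23 contour to file (argument values are illustrative and follow the order listed above):
cafe -bq -nr /path/to/CAFAna/nue/Ana2018/FitandFC/joint_fit_2018_contours.C combo true false joint_realData_both_onlyIH false false false true false true false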

Slices:

The options std::string decomp, bool createFile, bool corrSysts and TString options have the same meaning as in the "Contours" part above.

bool th23slice (th23 slice)
bool octantSlice (split into upper and lower octant or not)
bool dmsqSlice (dmsq32 slice)
bool th13Slice (th13 slice)
If th23slice = false && dmsqSlice = false && th13Slice = false, you get the deltaCP slice.
bool fccorr (with or without FC corrections)
bool fourierFit (smooth the FC corrections)
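
For example, to run the corresponding statistics-only fit for a th23 slice (again illustrative, following the order above):
cafe -bq -nr /path/to/CAFAna/nue/Ana2018/FitandFC/joint_fit_2018_slices.C combo true false joint_realData_both_onlyIH true false false false false false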

Some tips

One more time: please always use real ND data for fit or sensitivity studies.

For Fit and sensitivity first-time users, here are some suggestions:

  • Always start with the statistics-only case.
  • Always start with "combo" for FHC, "prop" for RHC, and "prop" for the numu predictions.
  • Always merge the Peripheral sample. If you see an error complaining about a 23 vs. 27 difference, check your Peripheral sample first before you start to worry.
  • Never include the keyword _realData_ in your input options: realData in the fitting macros means use FD real data, whereas you should always use ND real data for fit and sensitivity studies. See Sec. Sensitivity.
  • Always set createFile to true to generate the files first, then set it to false to plot, as mentioned above in Sec. Sensitivity.
  • Always disable any FC-related attributes, such as fccorr; for FC, see Sec. Feldman-Cousins Corrections.
  • Please don't overwrite the /nova/ana/nu_e_ana/Ana2018/Results/ folder. It contains the fits and results for the 2018 analysis; if you want your own files, please change the outdir variable in the scripts to point to your local directory.
  • There is a debug option inside the scripts; switch it to false if you don't want a lot of debug plots.

Fitting will make the input slices for the FC corrections; the input slices are made by FitandFC/joint_fit_2018_slices.C

NB: In the fitting macros, if you set all the slice/contour type options to false, they fall back to the default deltaCP slices/contours.
NB: In the FHC+RHC analysis, the Cherenkov and neutron systematics are used as extra systematic seeds alongside the normal FitVars we seed. That decision was made as a short-term fix to smooth the deltaCP slices. In the FHC-only result, we added the Cherenkov systematic as the extra seed.
NB: This tip belongs to the area connecting the fit and FC. The Michel systematic was implemented at the last minute and added very late in the fit. Since things were rushed to finalize at the end of the last analysis, there may be errors/segfaults/aborts when you play around with this systematic, especially when you try to reproduce FC with the official predictions. If you are trapped by this, please ask in the #nue channel.

Feldman-Cousins Corrections

Feldman-Cousins corrections are one of the most computationally intensive parts of the analysis. In previous years, they have taken 4+ weeks to complete on the Fermigrid. The 2018 analysis brought the addition of antineutrino data, and therefore demanded even more computational resources. Because of this, our official Feldman-Cousins corrections were produced using supercomputers at NERSC, which requires a NERSC account and resource allocation (more details on this can be found here). However, it is possible to run small-scale FC jobs on the GPVMs if need be. The relevant macros (in S18-04-07+) are
  • CAFAna/nue/Ana2018/FitandFC/make_fc_slices_nersc_2018.C
  • CAFAna/nue/Ana2018/FitandFC/make_fc_surfaces_nersc_2018.C
  • CAFAna/nue/Ana2018/FitandFC/make_fc_mh_nersc_2018.C
  • CAFAna/nue/Ana2018/FitandFC/make_fc_oct_nersc_2018.C

To run these, you'll first need to set an environment variable that points to the FCInputs and other dependencies:

export FCHELPERANA2018_LIB_PATH=/pnfs/nova/persistent/users/ddoyle/FCHelper2018-NERSC/FCHELPERANA2018_LIB_PATH_v3.1

The slice and surface macros will each run a number of pseudoexperiments per bin, over a range of bins, as specified by the given arguments. For example,
cafe -bq -nr /path/to/CAFAna/nue/Ana2018/FitandFC/make_fc_slices_nersc_2018.C 5 0 1 true 998 999 1 delta_LO both true 

will run 5 points in bins 0 and 1 of the delta_LO slice, normal hierarchy. The numbers 998 and 999 are unique identifiers to avoid overwriting existing output files. These macros can run using any beam mode and with or without systematics. When the Gaussian delta chi-squared is very high, the Feldman-Cousins procedure requires many millions of pseudoexperiments to be accurate, too many to produce in any reasonable amount of time. The binlist argument gives you the option to read from a text file of bins whose delta chi-squared is low enough to be reliably corrected. The startidx and endidx arguments then specify the start and end indices of an array filled with the binlists to process. You can take a look at CAFAna/nue/Ana2018/generate_bin_lists.C to see how these lists were generated; this macro also creates a summary of the bins to be corrected. Generally, we corrected every bin in the slices and left out many bins in the contours. In each bin we corrected, we required 3,000-5,000 pseudoexperiments, for a total of 8.1 million pseudoexperiments across all of our official plots.

The mass hierarchy rejection and lower octant rejection numbers are corrected with make_fc_{mh,oct}_nersc_2018.C, which take the number of pseudoexperiments to fit and similar unique identifiers as input.

cafe -bq -nr /path/to/CAFAna/nue/Ana2018/FitandFC/make_fc_{mh,oct}_nersc_2018.C 5 998 999

These each required 45,000 pseudoexperiments.

The machinery to run these macros at scale on the Fermigrid was never implemented, but if that's what you need to do, take a look at CAFAna/nue/Ana2017/joint_fit_2017_fc_submit.C for inspiration.

MRE/MRBrem