# Nus Validation

## Brief

• To use generic validation, write a yaml configuration file pointing to your files; choose appropriate names and an accessible location.
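As a rough illustration of what such a configuration might look like, here is a hypothetical sketch; all key names, paths, and labels below are made up, so check an existing Validation/generic configuration for the real schema:

```yaml
# Hypothetical sketch only -- every key name here is illustrative,
# not the actual Validation/generic schema.
name: nd_datamc_example                 # placeholder validation name
old_file: /path/to/outfile_old.root     # output of the macro on the old datasets
new_file: /path/to/outfile_new.root     # output of the macro on the new datasets
old_label: "old release"
new_label: "new release"
web_dir: /nusoft/app/web/users/USERNAME/nd_datamc_example
```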

The following refers to four datasets, the combinations of (data, mc) and (old, new), but it can be extended to other cases.

Note that many of the workflow choices are made to avoid excessive copy-pasting.

### Preparing your macro and running a test

• Your macro should be flexible enough to run on both datasets, and its output should be hadd_cafana-able or hadd-able.

```shell
cafe -bq NDDataMC.C+ $OUTFILE_OLD true $NDDATA_OLD $NDMC_OLD
cafe -bq NDDataMC.C+ $OUTFILE_NEW true $NDDATA_NEW $NDMC_NEW
```


Depending on the type of validation, each line might need to be run with different tags/test releases.
• Test your macro interactively over a single file/run.

```shell
cafe -bq NDDataMC.C+ $OUTFILE_OLD_TEST true $NDDATA_OLD_TEST $NDMC_OLD_TEST
cafe -bq NDDataMC.C+ $OUTFILE_NEW_TEST true $NDDATA_NEW_TEST $NDMC_NEW_TEST
```


Try specific file names or a samweb query like

```shell
NDDATA_TEST="dataset_def_name_newest_snapshot "$NDDATA" and run_number 11264 and Online.Subrun 00"
NDMC_TEST="dataset_def_name_newest_snapshot "$NDMC" and run_number 11264 and Simulated.firstSubRun 00"
```


(Make sure the files and snapshots exist in all cases.)
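The quoting in the queries above splices the unquoted dataset variable between two quoted fragments, producing a single samweb dimension string. A minimal sketch of the result, using a made-up dataset name:

```shell
# Made-up dataset name, for illustration only
NDDATA="prod_caf_nd_data_example"

# Same quoting pattern as above: the unquoted $NDDATA is concatenated
# between the two quoted fragments during the assignment
NDDATA_TEST="dataset_def_name_newest_snapshot "$NDDATA" and run_number 11264 and Online.Subrun 00"

echo "$NDDATA_TEST"
```

The resulting string is what samweb receives as the query, so echoing it first is a cheap way to spot quoting mistakes before launching cafe.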

• Validation/generic will use the spectra/histogram names in your <outfile> to write the website. These include the pretty selectors.
Any "cosmetic" changes can be applied to the same spectra; they shouldn't require a re-run. NDDataMC.C (and FDDataMC.C) splits the process via a boolean. You might want to limit the number of variables and cuts that are "formatted" in this step in order to quickly generate a preview of the validation.
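The boolean split mentioned above can be sketched as pseudocode; the internal names here are illustrative, with only the argument order taken from the cafe invocations in this page (outfile, boolean, data, mc):

```
// NDDataMC.C, schematically (internal names illustrative):
// NDDataMC(outfile, makeSpectra, data, mc)
if makeSpectra:
    build spectrum loaders from the data/mc dataset definitions
    fill one spectrum per (variable, cut) pair
    save all spectra to outfile              // expensive, grid-friendly part
else:
    load the previously saved spectra from outfile
    apply styling and "pretty selector" names   // cosmetic, cheap to re-run
    write the formatted histograms for the website
```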

### Webpage output

You will point the output (controlled by the yaml file) to the /nusoft/app/web area.
Within this area are users/ directories. For more official validation, you can send it to /nusoft/app/web/validation/nus/ and a designated sub-directory.

• Remember to clear any failed attempts from the /nusoft/app/web area.
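A sketch of the cleanup step, demonstrated against a temporary directory so it is safe to run anywhere; in practice WEB would be your area under /nusoft/app/web, and the sub-directory name is a placeholder:

```shell
# Temp dir stands in for e.g. /nusoft/app/web/users/USERNAME
WEB=$(mktemp -d)

# A failed attempt leaves a partial validation directory behind...
mkdir -p "$WEB/nd_datamc_example"
touch "$WEB/nd_datamc_example/index.html"

# ...so remove the whole sub-directory before retrying
rm -rf "$WEB/nd_datamc_example"
ls "$WEB" | wc -l
```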

### Running over the full dataset

• If concats are available, simply replace the test datasets above with the file names.
• For large datasets, run the spectra-saving section on the grid. This example sends 40 jobs:

```shell
submit_cafana.py -n 40 -r $RELEASE -t $TESTREL -o $OUTPUT_DIR NDDataMC.C $OUTFILE true "$NDDATA" "$NDMC"
```


Note that your release and optional test release must be consistent with your datasets.
Check your progress: `jobsub_q --user USERNAME`
• Once all your grid jobs are done, hadd_cafana the results, apply formatting as you did with the test, and create the validation website.

```shell
hadd_cafana $OUTFILE_NEW $(pnfs2xrootd /path/to/pnfs/file.*of40.root)
```

Once you've got the hadd'ed output file, you can run the macro again, this time skipping the make-spectra section:

```shell
cafe -bq NDDataMC.C+ $OUTFILE_NEW false
```
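Before hadding, it is worth checking that all 40 grid output files actually came back. A minimal sketch, shown against a temporary directory populated with dummy files so it runs anywhere; in practice you would count the real files in your output area:

```shell
# Temp dir stands in for the grid output area, with dummy pieces
OUTPUT_DIR=$(mktemp -d)
for i in $(seq 1 40); do
    touch "$OUTPUT_DIR/outfile_new.${i}of40.root"
done

# Count the returned pieces; anything other than 40 means missing jobs
n=$(ls "$OUTPUT_DIR"/*of40.root | wc -l)
echo "$n"
```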


### Troubleshooting

• Low-hanging fruit: are you using the correct tags? The correct datasets? Are all the needed packages in your test release? Did you re-compile?
Check the commit history for the package: is the error related to a recent change?
• Did you search in Slack?
• Branch doesn't exist: check your test release for consistency (StandardRecord, CAFAna, MCReweight). You might need to fix the variable definition.
• It says the branch doesn't exist, but I can clearly see it: check the requirements of the Var; try changing xx.xx.xx.xx to xx.xx.* and report it.
• Complaints about nan/inf: some variable might not be filled with a default value. Identify and report.
• Occasional segfaults on the large dataset but not on the test: some variable might be ill-defined or not properly filled. Identify and report.
• Things break with no cut but are fine otherwise: there might be a problem with LID variables.
• Empty new histograms: is the variable being filled in the new CAFs? Is a default value assigned out of range?
• Weight errors: are the weights applicable to both datasets?