Test Beam Analysis Software¶
Welcome to the Test Beam Analysis wiki. This will provide all the information required to get started on test beam analysis.
It will give a general overview of the data processing journey and touch on all relevant details. A lot is described in more detail elsewhere on the wiki, so you are encouraged to look around for more information if required.
Data, data, data¶
Test beam data is collected by two DAQs: 'detector DAQ' and 'beamline DAQ'. The detector DAQ is the usual NOvA DAQ used by all NOvA detectors. The beamline DAQ is specific to the beamline components (wire chambers, time-of-flight system, Cherenkov detector, beamline triggering) and collects data for each of these subsystems for particles in the tertiary beamline, before they enter the NOvA detector. The DAQs are decoupled and operated entirely independently of each other, with different output files, start/stop times, crashes, etc.
There are many detector trigger streams: beamline, DDT, cosmic, activity, spill, etc. The cosmic and activity triggers can be used outside of a beam spill to obtain a sample of cosmic events for calibration and similar studies. The full beam spill (4.2 s, once a minute) is also recorded by the detector DAQ in the 'spill' stream. The stream of most interest for analysis is the beamline stream. This is triggered by the beamline components, based on various trigger conditions as a particle traverses the tertiary beamline. Once a trigger is made, the beamline DAQ initiates readout of all the beamline DAQ components and a signal is sent to the detector spill server to read out the activity in the detector.
The outcome of tertiary particles passing through the beamline is then data saved in separate output files, with multiple files for the detector data, each representing a single trigger stream. In order to process the beamline-triggered events together, the raw beamline data file must be combined with the raw beamline-stream detector data file. This is the first step of offline processing (details below).
Following this, one has a data file containing NOvASoft data products, for both the beamline and the detector. We then apply calibration and reconstruction to this data. Analysis can then proceed!
We will take advantage of the CAFAna framework in NOvA to do analysis, but this is not yet set up for test beam use. These instructions will be updated when it is!
Production data files¶
See Data Definitions for an up-to-date list of all the data which has been processed and is available.
The following datasets have been made by production for our use in analysis:
Running the software¶
This section will explain how to process the data and the various stages of the procedure.
In general, if the data have been processed by production and are available for our use (see above), you do not need to process the data yourself unless you have specific requirements. Please proceed carefully if you need to do your own data processing.
Users will likely not need to do this themselves; these instructions are for reference. See Data Definitions for available, ready-processed data.
There are three ways to process the raw data into offline, NOvASoft-formatted data files. One may process either the detector data or the beamline data separately, or alternatively both together. When processed separately, the resulting data file contains information from just that system. Merging the two data streams results in time-matched data from both the detector and beamline for each event, corresponding to a single beamline trigger.
To process a detector file, use
nova -c daq2rawdigitjob.fcl -s <input file name>.root -o <output file name>.root
To process a beamline file, use
nova -c beamline2rawdigit.fcl -s <input file name>.root -o <output file name>.root
Combined processing¶
To merge the data from both and produce a single output file containing both beamline and detector data, use
nova -c testbeam2rawdigit.fcl -s <input detector file name>.root -o <output file name>.root
- the beamline files must be looked up separately and placed in the working directory for this job to find them
- the TDU trigger time offset must be provided in the fhicl configuration
- If you have a set of detector Beamline stream files in a directory and want to pull in the corresponding Beamline root files, you can run the following bash script (inside the directory with the files):
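#!/bin/bash
# Loop over the detector Beamline-stream raw files in the current directory
for file in testbeam*Beamline.raw
do
  # fname is exported into the environment before the fetch script is called
  export fname="$file"
  fetch_tb_beamline_files.py
done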
One may make use of a script provided by the production group (huge thanks!) to do the beamline file matching when submitting jobs to the grid. Jobs are submitted with the submit_nova_art.py script, using
submit_nova_art.py -f <config file>
where the configuration file which is passed as an argument contains the following information
--jobname <job name>
--defname <detector file definition>
--config testbeam2rawdigitjob.fcl
--tag <tag>
--mix fetch_tb_beamline_files.py
--dest <destination on disk>
--outTier out1:artdaq
--files_per_job <N>
The definition which is passed to the defname parameter is for the detector files. The files are found and split into each job (using the files_per_job argument). Once on the grid node, the fetch_tb_beamline_files.py script is run to find time-matched beamline data files and then the art job runs over all the files to make the output file, which is copied back to the specified destination.
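As a rough sketch, the configuration file described above could be generated with a small shell function. The function name write_tb_submit_config and its argument order are hypothetical (they are not part of NOvASoft), and writing one option per line is an assumption about what submit_nova_art.py -f accepts. The option names and fixed values are those listed above.

```shell
#!/bin/bash
# Hypothetical helper (not part of NOvASoft): print a submit_nova_art.py
# configuration using the options listed above.
# Assumption: the file passed to -f may hold one option per line.
# Usage: write_tb_submit_config <jobname> <defname> <tag> <dest> <files_per_job>
write_tb_submit_config() {
  if [ "$#" -ne 5 ]; then
    echo "usage: write_tb_submit_config <jobname> <defname> <tag> <dest> <files_per_job>" >&2
    return 1
  fi
  # Unquoted heredoc delimiter so the positional parameters are expanded.
  cat <<EOF
--jobname $1
--defname $2
--config testbeam2rawdigitjob.fcl
--tag $3
--mix fetch_tb_beamline_files.py
--dest $4
--outTier out1:artdaq
--files_per_job $5
EOF
}
```

The output could be redirected to a file (for example submit.cfg, a made-up name) and that file then passed to submit_nova_art.py -f.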
Additional arguments which may be added to this configuration are found here: https://cdcvs.fnal.gov/redmine/projects/novaart/wiki/Submitting_NOvA_ART_Jobs#submit_nova_artpy and include
--njobs <N>
--testrel <test release>
--reuse_tarball
--opportunistic
--print_jobsub
--copyOut
It is the user's responsibility to ensure the data files are prestaged before submitting the jobs. For more information, refer to https://cdcvs.fnal.gov/redmine/projects/nova_sam/wiki/SAM_web_cookbook#Pre-staging-Data-from-Tape
To check if files are cached, one may use
cache_state -d <definition>
and to pre-stage, if required,
samweb prestage-dataset --defname=<definition> --parallel 4
Please be aware of the impact your actions may have on the computing resources being used by NOvA and don't pre-stage huge amounts of data without checking with the production group.
If multiple files are processed in parallel, for example on the grid, one may use the file concatenation art configuration to produce a single output file for analysis, if required:
nova -c concat_files.fcl -s <files to concatenate> -o <output file name>.root
Note that the output file name must be specified.
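Since the output name is mandatory, a thin wrapper can catch a missing argument before the job starts. The sketch below is illustrative only: concat_tb_files and the DRY_RUN variable are hypothetical names, and it assumes that several input files can be listed after -s separated by spaces.

```shell
#!/bin/bash
# Hypothetical wrapper (not part of NOvASoft) around the concatenation command
# above. It refuses to run unless an output name and at least one input file
# are given. Set DRY_RUN=1 to print the command instead of running it.
# Assumption: several input files can be listed after -s, separated by spaces.
# Usage: concat_tb_files <output file name>.root <file1>.root [<file2>.root ...]
concat_tb_files() {
  if [ "$#" -lt 2 ]; then
    echo "usage: concat_tb_files <output>.root <input>.root [<input>.root ...]" >&2
    return 1
  fi
  local output="$1"
  shift
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print the command that would be run, without running it.
    echo "nova -c concat_files.fcl -s $* -o $output"
  else
    nova -c concat_files.fcl -s "$@" -o "$output"
  fi
}
```

Running it first with DRY_RUN=1 shows the exact nova command line, which is a cheap way to check the file list before committing to a long job.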
Not yet available.
There are two reconstruction chains which must be applied: detector and beamline.
Detector reconstruction is still being developed and this page will be updated to reflect changes when they occur.
To run the full beamline reconstruction (including digitizer hit finding, ToF reco, Cherenkov reco, wire chamber track making), use
nova -c beamlinerecojob.fcl -s <input NOvASoft-formatted data file>.root
One may run reconstruction over a full data set using submit_nova_art.py with the following example configuration:
--jobname <job name>
--defname <artdaq-tier data definition>
--config beamlinerecojob.fcl
--tag <tag>
--dest <destination on disk>
--outTier out1:reco
--files_per_job <N>
Again, ensure any data files which are required are cached.
If you want to run over a dataset interactively (probably not recommended, which is why the tools aren't available!), you could write a bit of bash like this:
for file in $(samweb list-definition-files <definition name>); do
  echo "$(samweb locate-file $file | cut -d: -f2 | cut -d"(" -f1)/$file" >> files.list
done
and then run using art with the -S option:
art -c <config>.fcl -S files.list
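If a file has more than one location, the loop above writes a broken entry, because all locations end up on one line. A slightly more defensive variant might look like the sketch below. The function name list_definition_paths is hypothetical, and the assumed output format of samweb locate-file (a system prefix, a colon, the directory, then optional extra information in parentheses) is inferred only from the cut commands above.

```shell
#!/bin/bash
# Hypothetical helper (not part of NOvASoft): print one full path per file in
# a SAM definition, keeping only the first location reported for each file.
# Assumption: `samweb locate-file` prints lines of the form
#   <system>:<directory>(<extra info>)
# as implied by the cut commands in the loop above.
# Usage: list_definition_paths <definition name> > files.list
list_definition_paths() {
  local definition="$1" file dir
  samweb list-definition-files "$definition" | while read -r file; do
    # First location only; strip the "<system>:" prefix and any "(...)" suffix.
    # stdin is redirected so the inner command cannot consume the file list.
    dir="$(samweb locate-file "$file" < /dev/null | head -n 1 | cut -d: -f2 | cut -d'(' -f1)"
    if [ -n "$dir" ]; then
      echo "$dir/$file"
    else
      echo "no location found for $file" >&2
    fi
  done
}
```

The resulting list can be written to files.list and passed to art with -S exactly as before.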
Once we have reconstruction in place it'll be trivial to produce a configuration which runs all the required test beam reconstruction.