How to do an analysis » History » Version 32

Matthew Toups, 01/23/2018 11:03 PM

How to do an analysis

This wiki is intended to detail how to perform an analysis of MicroBooNE data and MC. It should cover topics that every analyser will need to deal with.
It is very much under construction. Andy Furmanski is currently building it, so send him ideas for things to include (or just add them!)

Performing a data analysis on MicroBooNE is complex.

Getting the right data and MC, and normalising them

Run Periods

MicroBooNE has several different periods where the detector was in different conditions. This table summarises these.

Normalisation of data and MC samples

There are many considerations when comparing data and MC samples with sensible normalisations.

Usually, we like to normalise things by the corresponding POT (protons on target). Calculating this number, however, is not so simple. In addition, one usually needs to subtract off-beam data, or add in cosmic MC, to produce a realistic comparison, and for this you need to calculate an additional scaling which depends on the average beam intensity. There is a page detailing how these scalings are calculated for the 5e19 open sample. For other samples (smaller or larger) one needs to follow the same method, but use the python script to calculate the total number of triggers and POT in each sample.
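The scalings described above can be sketched as follows. This is an illustrative sketch only, not an official tool: the function names are hypothetical, and the sample sizes used in the example are made-up placeholder numbers, not the real exposures of any MicroBooNE sample.

```python
# Sketch of the usual normalisation scheme: scale MC to the data POT, and
# scale off-beam (EXT) events to the on-beam trigger count before
# subtracting them. Function names and numbers are illustrative only.

def mc_scale(data_pot, mc_pot):
    """Weight applied to each MC event to match the data exposure."""
    return data_pot / mc_pot

def offbeam_scale(onbeam_triggers, offbeam_triggers):
    """Weight applied to each off-beam event before subtraction."""
    return onbeam_triggers / offbeam_triggers

# Hypothetical example values (NOT the real sample exposures):
data_pot = 5.0e19
mc_pot = 2.0e20
print(mc_scale(data_pot, mc_pot))            # 0.25
print(offbeam_scale(1_000_000, 2_000_000))   # 0.5
```

The real trigger and POT totals for each sample must come from SAM and the POT-counting tools described below.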

POT counting in data after optical filter using SAM

Use SAM isparentof clauses to find the grandparents of a set of filtered files (grandparents because of separate filter+merge steps):

$ samweb list-files --summary "defname: prod_reco_optfilter_bnb_v11_mcc8 and run_number 5500" 
File count:     11
Total size:     3077314432
Event count:    171

$ samweb list-files --summary "isparentof: ( isparentof: ( defname: prod_reco_optfilter_bnb_v11_mcc8 and run_number 5500 ) with availability anylocation )" 
File count:     11
Total size:     16423167976
Event count:    388
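If you need the counts above in a script, the text that `samweb list-files --summary` prints is easy to parse. The helper below is not part of samweb or any official tool; it is a small sketch assuming the three-line summary format shown above.

```python
# Hypothetical helper (not an official tool) to parse the text printed by
# `samweb list-files --summary`, assuming the format shown above.

def parse_samweb_summary(text):
    """Return a dict with file_count, total_size, and event_count."""
    out = {}
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, _, value = line.partition(":")
        key = key.strip().lower().replace(" ", "_")
        if key in ("file_count", "total_size", "event_count"):
            out[key] = int(value.strip())
    return out

summary = """File count:     11
Total size:     3077314432
Event count:    171"""
print(parse_samweb_summary(summary))
# {'file_count': 11, 'total_size': 3077314432, 'event_count': 171}
```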

PMT timing offsets

Due to an intricacy of the PMT readout system, there is a delay between the (hardware) trigger time and the start of the unbiased readout window. This delay is different for different trigger types - 2 ticks for BNB, 4 ticks for NUMI, and 26 ticks for EXT.

The software trigger algorithm calculates a number of ticks from the start of the unbiased window, and therefore the software trigger window is in a slightly different place in on-beam and off-beam data. In MC samples the timing is designed to be close to the measured beam time, but due to uncertainties in the measurement it may not be exact.

Because of this, when applying cuts to "in-time" flashes, one must shift the definition of the beam window, as hit and flash times are given relative to the (hardware) trigger time. See slide 3 of the linked talk for the MCC7 and MCC8 beam windows.
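The shift described above can be sketched as a per-trigger-type offset on the beam-window cut. Only the tick offsets (2/4/26) come from the text above; the window bounds, the 15.625 ns optical tick (assuming 64 MHz PMT sampling), and the sign of the shift are assumptions here and should be checked against the referenced slides.

```python
# Sketch: shifting a flash-time beam window by the trigger-type readout
# delay. The tick offsets are from the text above; the tick duration,
# window bounds, and sign convention are assumptions to be verified.

OPTICAL_TICK_US = 0.015625  # assumed 64 MHz optical sampling (microseconds)

TRIGGER_DELAY_TICKS = {"bnb": 2, "numi": 4, "ext": 26}

def shifted_beam_window(lo_us, hi_us, trigger_type):
    """Shift a (lo, hi) flash-time window, given relative to the hardware
    trigger time, by the delay to the start of the unbiased readout."""
    shift = TRIGGER_DELAY_TICKS[trigger_type] * OPTICAL_TICK_US
    return lo_us + shift, hi_us + shift

print(shifted_beam_window(3.2, 4.8, "bnb"))
print(shifted_beam_window(3.2, 4.8, "ext"))
```

The point is simply that the same nominal window must land in different places for BNB, NuMI, and EXT samples.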

Understanding the data format inside a LArSoft file

LArSoft uses artroot files. If you don't know how to read information from these, there are worked examples available.
These cover simple analysis (getting data products) and using LArSoft services and algorithms.

What's in a MicroBooNE final reco file

Getting simple data products (tracks, showers, etc)

Matching reco to truth, and using associations

Current best producers

As of October 2017, the reconstruction group advises using the following producers for analysis:

Flashes - simpleFlashBeam and simpleFlashCosmic
Tracks -
Showers -

Using AnalysisTree

Using Gallery

Event Displays

There are several event displays available for MicroBooNE simulated/physics data:

LArSoft Event Display
Setup a uboonecode release and run

 lar -c evd_ub_data_truncated.fcl path/to/my/artroot/file.root 

Note that this fcl file is only correct through the MCC8 release; newer releases may use different fcl files.

In a clean environment, run

 source /uboone/app/users/cadams/static_evd/

and then follow on-screen instructions.
To open the event display, run it with

 -u /path/to/my/artroot/file.root

where the -u flag tells the event display to use the MicroBooNE geometry.

Visit the linked site and look up the data file of interest.

Final stages - Calibrating, correcting, and dealing with systematics

Reconstruction corrections


MC reweighting

Correction weights

For MCC8, a "correction" weight is calculated at production to account for the fact that the beam simulation incorrectly models the re-decay of muons that produce electron neutrinos (this description may not be 100% accurate; the point is that the simulated flux is known to be wrong in a way that can be corrected).
This weight is stored as "bnbcorrection" and should be applied to all events, but is particularly important when making absolutely normalised nu_e event distributions.
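Applying a per-event weight like "bnbcorrection" just means filling histograms with the weight instead of a unit count. The sketch below is illustrative only: the helper name, the event tuples, and the weight values are all made up for the example, and in practice one would use a real histogramming library.

```python
# Sketch of filling a weighted histogram with a per-event weight such as
# "bnbcorrection". The events and weights below are illustrative only.

def fill_weighted(values_and_weights, bin_edges):
    """Histogram values using per-event weights (e.g. bnbcorrection)."""
    counts = [0.0] * (len(bin_edges) - 1)
    for value, weight in values_and_weights:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= value < bin_edges[i + 1]:
                counts[i] += weight
                break
    return counts

# Hypothetical (energy, bnbcorrection) pairs:
events = [(0.5, 0.5), (1.2, 1.25), (1.4, 1.0)]
print(fill_weighted(events, [0.0, 1.0, 2.0]))  # [0.5, 2.25]
```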

Systematic variation weights

A wiki describing how to run the EventWeight package to produce systematic variation weights for GENIE and beam variations can be found here.

Detector variations

To estimate detector systematics, the current plan (as of Nov 2017) is to produce special MC datasets with modified detector parameters. It is key to use the same underlying events, so that there is no statistical variation between the samples. The workflow for producing these is described at the following link:

Important/useful computing information

How to find the data and MC files you need using SAM

How to submit analysis jobs to the grid

How to use Sam4users to store your files on tape (and best practices)

How to make a merged analysis tree from a samweb definition containing a specific number of events

Running GENIE with different models in LArSoft -