Tutorial for Analyzers on Using the Grid

This page walks new analyzers through running the MicroBooNE software using the Fermilab batch system (aka "the grid"). It is intended for those who know nothing about using the grid or MicroBooNE analysis code. It doesn't assume any knowledge of Linux per se, but you will probably get the most out of it if you are vaguely familiar with the concepts of a command line and programming. Currently, this tutorial focuses on the generation and simulation of Monte Carlo. In the future, it will be expanded to cover reconstruction and analysis of data.

Prerequisites

For starters, you will need a Fermilab computing account and access to the MicroBooNE gpvms (uboonegpvm0X.fnal.gov). You will also need a grid proxy to run jobs in the batch system. Instructions for acquiring these can be found "here":http://www-microboone.fnal.gov/start.html. Make sure you are able to log on to one of the uboonegpvms and have X11 forwarding enabled. If either of these doesn't work, ask your advisor for help.

Getting Started

Once you have managed to log onto a uboone machine, create a working directory under /uboone/app/users/<your username> (creating /uboone/app/users/<your username> first if it doesn't exist). From there, you need to set up ups and uboonecode:

source /cvmfs/uboone.opensciencegrid.org/products/setup_uboone.sh
setup uboonecode v06_43_00_01 -q prof:e14

The first line sets up the ups framework that MicroBooNE and larsoft use to manage the different builds of their software. It also sets up some experiment-specific environment variables you will need for batch running and for using SAM. The second line sets up a specific tagged version of uboonecode.
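If you want a quick, optional sanity check that the setup worked (ups sets a $UBOONECODE_DIR variable when the product is set up), something like this should show the active version:

ups active | grep uboonecode
echo $UBOONECODE_DIR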

Once you have uboonecode set up, you should be able to start the tutorial by running "make_uboone_tutorial.sh":

make_uboone_tutorial.sh

This will set up some files in your working directory. The one we are going to start with is "prod_muminus_0-2.0GeV_isotropic_uboone.xml".

How to generate Monte Carlo

XML files and You

Next, open prod_muminus_0-2.0GeV_isotropic_uboone.xml in your favorite text editor (for some advice on choosing a text editor, look here).

XML is a markup language, similar to HTML, designed to encapsulate data in a human- and computer-readable format. Like HTML, XML elements are enclosed within <element></element> tags. The first element you should look for is the <numevents> element:

<numevents>50</numevents>

This element tells you the total size of the simulation is 50 events. You will notice this element is embedded in another element:

<project name="&name;">
A project consists of multiple stages. Each stage can be thought of as a separate grid job which runs a particular step of the simulation or reconstruction. Projects hold settings which are common between multiple stages (a trimmed example of a project block is shown after this list), such as:
  1. the number of events to simulate
  2. the version of the simulation being used
  3. how metadata is declared
  4. etc.
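For orientation, here is a heavily trimmed sketch of a project block, using only elements discussed elsewhere in this tutorial (the values shown are illustrative; your actual xml file contains more settings):

  <project name="&name;">
    <numevents>50</numevents>
    <larsoft>
      <tag>&release;</tag>
      <qual>e14:prof</qual>
    </larsoft>
    <stage name="gen">
      <fcl>prod_muminus_0-2.0GeV_isotropic_uboone.fcl</fcl>
      <numjobs>1</numjobs>
    </stage>
    <!-- further stages (g4, detsim, ...) follow the same pattern -->
  </project>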

Do not worry about these for now, instead scroll down to the "gen" stage:

<stage name="gen">

There are several elements here, but let's focus on <numjobs> for now. This is the number of jobs which will be launched on the grid when you invoke the job submission command. If we wanted to generate a larger number of events (say, 500), we would not want to do it in one job, but instead break it into multiple submissions.
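As an illustrative fragment (not taken from the tutorial file): if <numevents> were 500, setting <numjobs> to 10 in the gen stage would give 50 events per job:

  <numevents>500</numevents>   <!-- total events for the whole project -->
  <numjobs>10</numjobs>        <!-- in the gen stage: 500 / 10 = 50 events per job -->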

Launching your First Job

Let's get some simulation running. From the command line, type the following command:

project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage gen --submit

You should get output like this:

Invoke jobsub_submit
jobsub_submit finished.

If not, contact an expert, as it's likely something is wrong.

jobsub_submit sends your command to generate 50 muon events to the grid, where it needs to wait for a free computer to run on. You can check the status of your job by typing:

project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage gen --status

You should see something like this:

Project prod_muminus_0-2.0GeV_isotropic_uboone:

Stage gen: 0 art files, 0 events, 0 analysis files, 0 errors, 0 missing files.
Stage gen batch jobs: 0 idle, 1 running, 0 held, 0 other.

Stage g4 output directory does not exist.
Stage g4 batch jobs: 0 idle, 0 running, 0 held, 0 other.

Stage detsim output directory does not exist.
Stage detsim batch jobs: 0 idle, 0 running, 0 held, 0 other.

Project prod_muminus_0-2.0GeV_isotropic_uboone_reco1:

Stage reco1 output directory does not exist.
Stage reco1 batch jobs: 0 idle, 0 running, 0 held, 0 other.

Project prod_muminus_0-2.0GeV_isotropic_uboone_reco2:

Stage reco2 output directory does not exist.
Stage reco2 batch jobs: 0 idle, 0 running, 0 held, 0 other.

Stage mergeana output directory does not exist.
Stage mergeana batch jobs: 0 idle, 0 running, 0 held, 0 other.

Note that project.py tracks all the stages in your xml file, not just the one currently running. Depending on how many jobs are currently running on the grid, your job may run right away or wait for an open slot (idle).
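Behind the scenes, project.py submits and queries jobs through the jobsub tools, so you can also ask the batch system directly for a list of your queued and running jobs:

jobsub_q --user=$USER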

Fermilab's Computing Division also maintains a variety of tools to measure and monitor the conditions of the grid.

Introduction to Fhicl Files

While you are waiting for your jobs to run, we'll take a look at what's going on behind the scenes. One of the elements embedded in <stage> is <fcl>. The <fcl> element dictates the fhicl file which is run by lar in your grid job. This tutorial is more about the nuts and bolts of running lar and doesn't go into much depth on fcl files, but you can find more information on writing and interpreting fcl files here: Guide to Using FCL files in MicroBooNE

When you launch a grid job, project.py will search for the fhicl file in the directories listed in the environment variable $FHICL_FILE_PATH. You can see these paths by typing:

echo $FHICL_FILE_PATH
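Since the path is a long colon-separated list, an optional trick like this makes it easier to read (one directory per line):

echo $FHICL_FILE_PATH | tr ':' '\n'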

Yikes! That is a lot of directories! Fortunately, we have a script which will automatically search $FHICL_FILE_PATH for a given fhicl file. Let's try looking for the fhicl file run during the "gen" stage:

find_fhicl.sh prod_muminus_0-2.0GeV_isotropic_uboone.fcl

You'll see some kind of output like this (don't worry if the directory isn't exactly the same):

==========================
Found fhicl file(s):
/grid/fermiapp/products/uboone/uboonecode/v06_14_00/job/prod_muminus_0-2.0GeV_isotropic_uboone.fcl 

Open this .fcl in your favorite text editor. You should see something like this:

#include "prodsingle_common_uboone.fcl" 

process_name: SinglesGen

outputs.out1.fileName: "prod_muminus_0-2.0GeV_isotropic_uboone_%tc_gen.root" 

physics.producers.generator.PDG: [ 13 ]            # mu-
physics.producers.generator.PosDist: 0             # Flat position dist.
physics.producers.generator.X0: [ 128.0 ]

The first thing you'll notice is the "#include" statement. This means this fhicl file inherits settings from another fhicl file. You can search for that fhicl file using the same find_fhicl.sh command.
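For example, to locate the parent file referenced by the #include above:

find_fhicl.sh prodsingle_common_uboone.fcl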

You will also notice numerous lines starting with physics.producers. These lines control particular aspects of the generation of single muons. We will get to these parameters later.
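As a small, hypothetical example of how such overrides work, you could write your own fcl file that includes the tutorial one and changes a single parameter, e.g. generating mu+ (PDG code -13) instead of mu-:

#include "prod_muminus_0-2.0GeV_isotropic_uboone.fcl"

# override the generated particle type: mu+ instead of mu-
physics.producers.generator.PDG: [ -13 ]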

After your Job Finishes: Checking the Output

Once your job finishes (remember you can check the status with project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage gen --status), the next step is to check the output and make sure everything ran normally. This is accomplished through project.py and --check:

project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage gen --check

You should see something like this (again, your directories will be different):

Checking directory /pnfs/uboone/scratch/users/joelam/tutorial/v06_14_00/gen/prod_muminus_0-2.0GeV_isotropic_uboone
Checking root files in directory /pnfs/uboone/scratch/users/joelam/tutorial/v06_14_00/gen/prod_muminus_0-2.0GeV_isotropic_uboone/14950233_0.
50 total good events.
1 total good root files.
1 total good histogram files.
0 processes with errors.
0 missing files.

Voila! You now have 50 good events run through the generator stage! In addition to checking all the root files, project.py also creates a "files.list" file in the directory /pnfs/uboone/scratch/users/$USER/tutorial/v06_14_00/gen/prod_muminus_0-2.0GeV_isotropic_uboone. This file is used by project.py to determine the input for the next stage.
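files.list is just a plain text list of the output root files, so you can look at it yourself (the path below follows the example above; yours will differ):

cat /pnfs/uboone/scratch/users/$USER/tutorial/v06_14_00/gen/prod_muminus_0-2.0GeV_isotropic_uboone/files.list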

Launching your Second Grid Job

The next step in the simulation is to take the generated muons and run them through the detector's GEANT4 simulation. This is accomplished by running project.py again, only this time specifying the stage to be g4:

project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage g4 --submit

Note that you do not have to specify an input file at any point. project.py does this for you. It is of course possible to specify a certain input file or dataset, and we will get to that eventually.

In the meantime, you can check the status of these jobs by running:

project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage g4 --status

You should see output that looks like this:

Project prod_muminus_0-2.0GeV_isotropic_uboone:

Stage gen: 1 art files, 50 events, 1 analysis files, 0 errors, 0 missing files.
Stage gen batch jobs: 0 idle, 0 running, 0 held, 0 other.

Stage g4: 0 art files, 0 events, 0 analysis files, 0 errors, 0 missing files.
Stage g4 batch jobs: 1 idle, 0 running, 0 held, 0 other.

Stage detsim output directory does not exist.
Stage detsim batch jobs: 0 idle, 0 running, 0 held, 0 other.

Project prod_muminus_0-2.0GeV_isotropic_uboone_reco1:

Stage reco1 output directory does not exist.
Stage reco1 batch jobs: 0 idle, 0 running, 0 held, 0 other.

Project prod_muminus_0-2.0GeV_isotropic_uboone_reco2:

Stage reco2 output directory does not exist.
Stage reco2 batch jobs: 0 idle, 0 running, 0 held, 0 other.

Stage mergeana output directory does not exist.
Stage mergeana batch jobs: 0 idle, 0 running, 0 held, 0 other.

You'll notice you still have 50 events in the gen stage. project.py keeps track of the entire history of your project, so long as the directories and files are still there.

Advanced Usage of project.py

By now, you've probably noticed all of our project.py commands follow the pattern <xml file>, <some stage>, <some action>. XML files describe the stages of an analysis, the order in which they are run, and how each stage receives its input files. Each time you call project.py with a specific xml file and stage, it looks into the xml file to decide what to run and how to run it.

Actions by contrast specify what you want project.py to do. For most users, the actions used most often are submit (which submits jobs to the grid), check (which checks the output of jobs after they've returned from the grid) and status (which tells you if your job is running or idle). More advanced actions include declare (which declares files to SAM) and define (which generates a SAM dataset based on your configuration).

All of the actions project.py is capable of running may be found by running (warning! lots of text):

project.py --help

Finishing your MC Project

Once the g4 stage grid job completes, you can move on to the next step, "detsim." Do the same thing you did after the "gen" stage and run:

project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage g4 --check
project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage detsim --submit

Once detsim finishes, you should run reco1, reco2 and mergeana in the same way.
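Spelled out, the remaining commands look something like this (wait for each stage's jobs to finish, and run --check, before submitting the next stage):

project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage detsim --check
project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage reco1 --submit
project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage reco1 --check
project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage reco2 --submit
project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage reco2 --check
project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage mergeana --submit
project.py --xml prod_muminus_0-2.0GeV_isotropic_uboone.xml --stage mergeana --check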

90% of the work you will do with project.py will follow this paradigm of --stage <stage> --submit, --stage <stage> --check. There are a few tricks for speeding up this process detailed below; in the next section we will introduce how to run over actual MicroBooNE data.

Reconstructing Data using Project.py

MicroBooNE's Data Blinding Policy

Before you begin looking at / reconstructing data, you should understand MicroBooNE's data blinding policy.

In a nutshell, most of the neutrino data from the Booster Neutrino Beam is "blinded," meaning the collaboration has mutually agreed not to look at this data until we are ready to publish a low energy excess analysis. This is enforced by restricting read access on files which contain data from the Booster Neutrino Beam to only the production and data management groups. The practical effect is that not all data is analyzable: regular users are locked out from accessing this data, and if you try to run code over it, your job will fail. This tutorial involves the reconstruction of cosmic data, which is not subject to blinding.

The blinding policy in detail may be found here: Blinding of MicroBooNE Data.

Fcl Files for Reconstruction

The full chain of reconstruction code is always being modified and improved. As a result, it would be impossible to keep this tutorial at the cutting edge of reconstruction, or even aligned with the official reconstruction versions usable for analysis. The fcl files included in the following section will work with the input datasets, but there is no guarantee they should be used for a physics analysis. Chances are we have a more modern version of the reconstruction in use by analyzers. Make sure you consult with your working group, or the production team, about what the most up-to-date reconstruction version is.

That being said, the following instructions (and dataset) should be usable with any version of the reconstruction. You can swap fcl files (and uboonecode versions) at will and cook up your own xml file.

Using SAM to Reconstruct Data

MicroBooNE runs reconstruction in two stages: an initial 2-dimensional reconstruction and a more sophisticated 3-dimensional reconstruction. Open the xml file titled "prod_reco_data.xml" and scroll down to the stage "reco1."

You'll notice a few differences from the xml file we used to generate MC muons. The first is that we now have a stage parameter <inputdef>. This parameter tells project.py to take its input from a dataset defined in SAM (Sequential Access via Metadata). Most of MicroBooNE's data is stored on magnetic tape and is not directly accessible from disk; SAM facilitates accessing and copying these files from tape. There is some more information about SAM and how to use it here: Data Handling and SAM

You can see how many files and events make up a data set by running:

samweb list-definition-files --summary prod_extbnb_swizzle_inclusive_tutorial

Alternatively, you can run:

samweb list-files --summary defname:prod_extbnb_swizzle_inclusive_tutorial

You should see output which looks like this for both commands:

File count:    33
Total size:    7006376102
Event count:    165

The dataset prod_extbnb_swizzle_inclusive_tutorial contains 165 events spread over 33 different files. If you want to see the parameters which make up a definition, you can use samweb describe-definition:

samweb describe-definition prod_extbnb_swizzle_inclusive_tutorial

This is a good way to learn what SAM metadata is available and how to use it.

Look in prod_reco_data.xml for the line <numjobs>. You will see <numjobs> is set to 33, which is the same as the number of files in the dataset. You should also look for the field <maxfilesperjob>, which is set to 1.

When you submit a job with project.py using a dataset as input, project.py splits the dataset up into multiple jobs. This is because datasets can be arbitrarily large (up to millions of events!), and we do not want jobs running on the grid over multiple days. project.py makes this decision based on how <numjobs> and <maxfilesperjob> are specified. project.py will always launch <numjobs> jobs regardless of how large the dataset is. If you launch a reconstruction of 10k events as a single job, be prepared to wait a long time for that job to finish. In fact, it won't finish, since by default your job will automatically terminate after 8 hours of run time.

<maxfilesperjob> tells project.py that you only want SAM to copy over at most that many files for a given job. Combined with <numjobs>, <maxfilesperjob> defines how many files from a dataset are processed:

Nfiles = <maxfilesperjob>*<numjobs>

If you do not specify a <maxfilesperjob>, project.py will distribute the files across <numjobs> jobs as evenly as it can. Doing this with <numjobs> = 1 is potentially disastrous, as SAM will try to copy potentially thousands of files to a single grid node. Do not do this!

With 33 files in our dataset, <numjobs> = 33 and <maxfilesperjob> = 1, we expect this xml file to launch 33 jobs, each with 1 file per job, and reconstruct the entire dataset. Go ahead and check my math with:

project.py --xml prod_reco_data.xml --stage reco1 --submit

Make sure to run check when your jobs finish:

project.py --xml prod_reco_data.xml --stage reco1 --check

Understanding Numevents

If you look in prod_reco_data.xml, you'll notice the <numevents> field is set to 1M events. But we know our dataset only contains 165 events. Where do the extra events come from?

The answer involves understanding what <numevents> actually means. <numevents> tells lar, the underlying event processing framework, the maximum number of events to process in a given grid job. Setting this to 1M in prod_reco_data.xml tells project.py to process up to 1M events per job.

Archiving your Files with SAM4Users

Files located on /pnfs/uboone/scratch have limited lifetimes. The scratch space is a read/write pool, and is designed for quickly accessing files across many different locations. It is not designed as a permanent home for a large number of processed data or simulation files. To keep your files permanently, you have two options:

  1. Copy your files to a permanent storage directory such as bluearc (/uboone/data) or persistent dCache space (/pnfs/uboone/persistent). Neither of these is recommended, as MicroBooNE's storage space on both is limited.
  2. Archive your files on tape using sam4users to declare and catalog your files. This is the preferred option.

Creating your own Dataset with SAM4Users

Since SAM4Users is not set up by default with uboonecode, we need to set up another ups product, fife_utils:

setup fife_utils

NOTE THAT YOU SHOULD NOT SPECIFY A VERSION WHEN SETTING UP FIFE_UTILS. The product is actively being developed and bugs are being fixed all the time. Make sure you are using the most up-to-date version!

The basic SAM4Users command registers your files in SAM and creates a dataset which encapsulates them. After running project.py --check, you run it as follows:

sam_add_dataset -n prod_reco_data_tutorial_$USER -f /pnfs/uboone/scratch/users/$USER/tutorial/reco1/v06_26_01_04/prod_reco_data/files.list

The first thing SAM4Users does is rename your files. This is because files in SAM must have unique file names. The practical effect is that if you want to run a downstream stage (say reco2 or mergeana) on these files, you need to re-run project.py --check so your files.list will pick up the updated file names.
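Once the dataset exists, you can inspect it with the same samweb commands used earlier, for example:

samweb list-definition-files --summary prod_reco_data_tutorial_$USER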

Moving Files to Tape

Once your files have been declared, it's trivial to move them to the tape-backed area by running:

sam_move2archive_dataset -n prod_reco_data_tutorial_$USER
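After the move completes, you can confirm where a file ended up (pick any file name from your dataset):

samweb list-definition-files prod_reco_data_tutorial_$USER | head -n 1
samweb locate-file <file_name>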

Working with Larger Samples

The preceding sections should give you enough knowledge to run jobs using input from SAM, as well as to generate your own MC samples. The following sections deal with more advanced topics, like generating large (10k-event) MC samples and running over entire datasets.

Generating Beam MC

Open the file prodgenie_bnb_nu_cosmic_uboone_tutorial.xml. This will look a bit like prod_muminus_0-2.0GeV_isotropic_uboone.xml, with some important differences.

First of all, you'll notice there is no "gen" stage. It has instead been replaced with a "sim" stage which contains 3 different <fcl> fields. project.py supports running multiple fcl files in series within a single stage, where the output of each fcl is used as the input to the next. In this case, the sim stage runs three different fcls for each job:
  • prodgenie_bnb_nu_cosmic_uboone.fcl
  • standard_g4_spacecharge_uboone.fcl
  • standard_detsim_ddr_uboone.fcl

The last two should look familiar from prod_muminus_0-2.0GeV_isotropic_uboone.xml, as they are the fcl files run during the g4 and detsim stages respectively. But now they will all be run in a single job to save time!

Second, you'll see <numevents> is now set to 10,000. When generating events, <numevents> is the total number of events you want to generate. The generation is divided into <numjobs> submissions, each with an equal share of the events, adding up to <numevents>. As <numjobs> is set to 200 in this file, project.py will launch 200 jobs with 10,000 / 200 = 50 events per file. In general, you should aim to have about 50 events per file when generating MC.

Go ahead and launch the jobs using:

project.py --xml prodgenie_bnb_nu_cosmic_uboone_tutorial.xml --stage sim --submit

You'll now have 200 jobs sitting in the queue:

project.py --xml prodgenie_bnb_nu_cosmic_uboone_tutorial.xml --stage sim --status

Project prodgenie_bnb_nu_cosmic_uboone:

Stage sim: 0 art files, 0 events, 0 analysis files, 0 errors, 0 missing files.
Stage sim batch jobs: 200 idle, 0 running, 0 held, 0 other.

When the jobs finish, run

project.py --xml prodgenie_bnb_nu_cosmic_uboone_tutorial.xml --stage sim --check

You'll notice the output is a little different than when we ran the single-particle MC before. You should see output that says "Doing quick check of directory /pnfs/etc." This is because the <check> field is set to 1 in the xml file prodgenie_bnb_nu_cosmic_uboone_tutorial.xml. Setting <check> to 1 before launching jobs makes project.py run some of the file validation on the grid. You still need to run project.py --check when your jobs finish, but the check should go much faster since project.py does not have to open every single root file. This can be a big time saver if you are running a large number of files. However, there is a downside: <check> will also declare the files to SAM, so you will not be able to archive these files using SAM4Users. For this reason, we recommend regular users do not enable <check> 1 </check>, but we do not forbid it. If you have a sample you need archived but can't use SAM4Users for some reason, contact data management and they'll try to help you. Note that you do not need to archive this sample, as MicroBooNE maintains large-scale BNB MC processing available for everyone.

Reconstructing Beam MC from a Dataset

MicroBooNE attempts to maintain large-scale (millions of events) MC samples for analyzers and developers. These samples are organized into numbered "campaigns," the most recent of which is MCC8 (Monte Carlo Challenge 8). You can get a breakdown of what samples are available at this website.

For MC, the input to reconstruction is "detsim"-level files. The number of these events may be found in the 4th column. You'll see that for the BNB sample, prodgenie_bnb_nu_cosmic_uboone, there are more than 2 million events available! Click on the "describe" link, and you'll get the dataset which corresponds to the list of events. You can input this dataset into a grid job using the same technique you used with the data, by defining an <inputdef> element in your xml file:

<inputdef>prodgenie_bnb_nu_cosmic_uboone_mcc8_detsim</inputdef>

You should see this line in the "reco" stage block of prodgenie_bnb_nu_cosmic_uboone_tutorial.xml, along with <numjobs>200</numjobs>. This means that if you run the reconstruction, you will get a random set of 10k events (50*200) out of the 2.5M BNB events. Try running the reconstruction with:

project.py --xml prodgenie_bnb_nu_cosmic_uboone_tutorial.xml --stage reco --submit

The project.py interface is flexible enough to input any SAM dataset into a lar job (provided it exists!). This enables a lot of collaboration between users. Your friend (or advisor) can generate a set of MC events, declare them with SAM4Users, move them to the tape archive, and then hand off the dataset definition to you. You can then run reconstruction yourself by including the dataset name as an <inputdef> in a specific stage. SAM will magically locate the files for you and deliver them to the worker node. The only caution is that SAM doesn't care about file size or network bandwidth, so you need to be very careful that <numjobs> and <maxfilesperjob> are set appropriately. Otherwise, your jobs will sit for hours on the grid waiting for input to appear. This is bad!
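As a sketch of what such a hand-off looks like in your xml (the definition name and job counts here are hypothetical), the receiving stage would contain something like:

  <stage name="reco1">
    <inputdef>your_friends_archived_dataset</inputdef>
    <numjobs>50</numjobs>
    <maxfilesperjob>1</maxfilesperjob>
    ...
  </stage>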

Running your own code with project.py

Project.py can run virtually anything which invokes "lar" as the primary executable. This means you can input arbitrary .fcl files, input arbitrary lar files, and execute arbitrary art modules on the grid using project.py. But there are a few rules to keep in mind:

  • All input files must be located on /pnfs space. This means no copying data from /uboone/data or /nashome/. Your job will most likely not work if you do.
  • The grid knows nothing about your environment. It is possible to pass environment variables to the grid using the <jobsub> field (see the sketch after this list), but it is not done by default.
  • Files that your code needs to run must be included in the uboonedata package. uboonedata is a ups product that is regularly tagged, built, and visible to the grid.
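For the second point, a hedged sketch of passing an environment variable through the <jobsub> field (jobsub_submit accepts -e to forward an already-exported variable; the variable name here is made up):

  <jobsub>-e MY_ANALYSIS_OPTION --expected-lifetime=8h</jobsub>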

Project.py with custom input files

You can always specify files to be copied and read in by grid jobs by defining <inputlist> at the stage level. project.py will launch <numjobs> jobs, dividing the input list as evenly as it can while still respecting <maxfilesperjob>. Just be sure the input files are located on /pnfs!

Example:

<inputlist>/pnfs/uboone/scratch/users/jiangl/mcc7.0_val/v05_08_00/reco2/prodgenie_single_proton_uboone_23/files.list</inputlist>

Project.py with custom fhicl files

Just as you can run project.py with arbitrary input, you can run it with arbitrary .fcl files as well. All you need to do is put the name of your .fcl file in the <fcl> field and the directory it lives in in the <fcldir> field; otherwise, project.py searches the directories pointed to by the environment variable $FHICL_FILE_PATH. It is also recommended that you change the stage name in your xml file to something unique (so if your .fcl file is "analyze_data.fcl", you'd want <stage name="analyze_data">), but it's not mandatory.

Example:

  <stage name="T0Reco">
    <inputlist>run1_set1</inputlist>
    <fcl>/uboone/app/users/joelam/t0tagging/run_T0RecoAnodeCathodePiercing.fcl</fcl>
    <outdir>/pnfs/uboone/scratch/users/joelam/&release;/T0Reco/&name;</outdir>
    <workdir>/uboone/app/users/joelam/&release;/T0Reco/&name;</workdir>
    <memory>5000</memory>
    <numjobs>3250</numjobs>
    <jobsub>--disk=100GB --expected-lifetime=24h</jobsub>
  </stage>
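Alternatively (a hedged sketch, splitting the same path used in the example above), you can give just the file name in <fcl> and its directory in <fcldir>:

    <fcl>run_T0RecoAnodeCathodePiercing.fcl</fcl>
    <fcldir>/uboone/app/users/joelam/t0tagging</fcldir>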

Project.py with a custom build of larsoft or uboonecode

Checking out and building ups packages is beyond the scope of this tutorial; however, there is a good walkthrough here.

Fortunately, after building your own version of uboonecode, getting it to run on the grid is fairly simple. After you've built your release, run:

make_tar_uboone.sh local.tar

This will create a tarball of your release in the current working directory. You then need to copy the file to resilient storage (a subdirectory of /pnfs/uboone/resilient/users/&user;/) and then add a line in the <larsoft> block of your xml file to tell project.py to copy the tarball from there to the grid:

  <larsoft>
    <tag>&relreco1;</tag>
    <qual>e14:prof</qual>
    <local>FullPathToTarBall/local.tar</local>
  </larsoft>
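For the copy to resilient storage itself, something like the following should work (the directory layout is up to you; the path below is illustrative), after which <local> should point at the copied tarball:

mkdir -p /pnfs/uboone/resilient/users/$USER/tarballs
ifdh cp -D local.tar /pnfs/uboone/resilient/users/$USER/tarballs/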

It is important to have your tarball in "resilient" storage rather than in /uboone/apps or some other place. Using the wrong storage volume might cause problems with dCache.
See Understanding Storage Volumes for more information about the use of different storage volumes, and Sample_XML_file for a sample XML file.

Best Practices for Submitting Analysis Jobs

Now that you have access to all of the data and MC, you are ready to submit thousands of jobs to run over millions of events, right?

MicroBooNE currently has very large data products and a lot of data. As a result, running over full datasets amounts to transferring 100s of TBs of data to remote nodes, and possibly from tape. This can be a significant strain on production resources, and is very inefficient because remote nodes are sitting idle waiting for input. As a result, please adhere to the following guidelines when developing, testing and running analysis code.

Best Practices for Using the Grid to Develop Analysis Code
  • Always test analyzer modules interactively before submitting to the grid.
  • In order to get an input file to test on, you can follow this procedure:
    1. Find the definition name for the data set you wish to test
  • The analysis tools (https://microboone-exp.fnal.gov/at_work/AnalysisTools/) page contains a list of all available datasets.
    2. Get the name of an input file from your chosen data set
        samweb list-definition-files <definition>
        # Tip: use head to just see the first file
        # samweb list-definition-files <definition> | head -n 1
        

    3. Find the path to the directory containing the file
        samweb locate-file <file_name>
        

    4. Check if the file has been staged for use (if not expect the next step to take some time)
        cat <path_to_directory>/".(get)(<file_name>)(locality)" 
        # E.g:
        # cat /pnfs/uboone/data/uboone/raw/swizzle_trigger_streams/mergebnb/prod_v04_26_04_05/00/00/60/98//".(get)(PhysicsRun-2016_4_28_0_17_54-0006098-00045_20160428T143449_bnb_20160501T071825_merged.root)(locality)" 
        
  • ONLINE or ONLINE_AND_NEARLINE - means the file has been staged and can be used immediately
  • NEARLINE - means the file is only available on tape
  • If your file is only on tape then it will need to be staged before it can be copied to your user area for your tests. Staging will happen automatically when you try to execute the copy, but expect this to be a slow process.
    6. Get the access url of the file and copy it to your user area
        samweb get-file-access-url <file_name>
        ifdh cp -D <file_url> .
        
  • Before launching jobs on the grid, you should first check that your input files have been staged from tape using the command above. If you don't do this, your jobs will sit idle while the files are staged, making them very inefficient.
  • You can pre-stage a whole dataset using samweb
        samweb prestage-dataset --defname=<definition>
        
  • Issue this command around 24 hours before you need to submit your jobs
  • Consider using nohup or screen as this command can take a while
  • If the dataset has already been staged then this command will tell you and you can submit your jobs right away
  • The files will remain on disk for around 30 days since the last time they were accessed before being flushed
  • When you are ready to launch jobs on the grid, test your code on an escalating number of jobs. First try one job, then 10, and then 100 (if you need that many).
  • Use developmental datasets instead of full datasets while developing. Developmental datasets contain roughly 200k events, so you should have reasonable statistics.
  • Grid jobs should take between 1 and 8 hours to complete. This can be tricky depending on what kind of module you are running. It is possible to combine files together in a single lar job by increasing the <maxfilesperjob> parameter, and if you have a very fast module with a total output less than 10 GB per file, this is a good option.
  • If you plan to launch more than 1k jobs, please get in touch with the data management conveners or physics coordinators so we can understand your requirements.

Best Practices for Using the Grid to Run Analysis Code

When you have your analyzer nearly complete and are ready to run over large scale data or MC samples, please follow these guidelines:

  • First and foremost, notify the physics coordinators and DM conveners of your request. Be specific in your request. Indicate what sample(s) you want to run over, what code you are running, what your timeline is, etc.
  • Once the DM team understands your request, begin prestaging your anticipated dataset. Prestaging begins the process of copying files from tape to cache space and speeds up the transfers to interactive nodes.
  • If you are trying to isolate a rare (less than 10%) process for MC, or are only interested in a select few data events (again, less than 10%), write a filter module which strips only events you are interested in to a separate file. You can then run over the stripped down file much faster and much more efficiently.
  • If running over an entire dataset, it is definitely preferred that your analysis code be part of a tagged uboonecode release. This way, the executable of your code can be mounted directly on a grid worker node over cvmfs. This reduces the need to copy thousands of tarballs to remote workers.
  • Ideally, you want to limit the rate of job submission to 1k per minute. In practice, this can mean carving up the total dataset into several smaller datasets and submitting them one by one. The easiest way to do this is via "limit" and "offset" predicates in SAM:
samweb create-definition <dataset_first1k> defname:<dataset> with limit 1000
samweb create-definition <dataset_second1k> defname:<dataset> with limit 1000 and offset 1000

Using Gallery to Analyze Data and Monte Carlo

Up until now, we've only generated Monte Carlo and reconstructed it, but we haven't been able to see exactly what we've generated and plot some interesting variables (like, say, energy). If you have run the data or MC samples through the mergeana stage, then you will end up with flat ntuples (anatrees) you can read in and manipulate with ROOT. This is probably the easiest way to begin analyzing data or MC.

However, anatrees may not have all of the variables you want or need. A better approach is to read the lar objects directly from the artroot files, which are the output of the reco2 stage. You cannot do this with native ROOT, but we do have a tool called gallery.

Setting up Gallery

In order to use gallery, we need to set up a few additional UPS products on top of uboonecode. You do this the same way we set up fife_utils previously:

setup gallery v1_04_03 -q e14:nu:prof
setup larsoftobj v1_23_02 -q e14:prof

Larsoftobj allows us to include the header files of various larsoft objects, which lets us include them in compiled code.

Looking inside an artroot File

Larsoft stores all event data as objects within an artroot file. You can get a catalog of all the different objects and what created them (producers) using the fcl file eventdump.fcl. Run this on a reco2 file from your single muon sample:

lar -c eventdump.fcl -s /pnfs/uboone/scratch/users/$USER/tutorial/v06_43_00/reco2/prod_muminus_0-2.0GeV_isotropic_uboone/837723_0/<file_name>.root -n 1

Since there is a lot of output, redirect it to a text file using the bash ">" operator:

lar -c eventdump.fcl -s /pnfs/uboone/scratch/users/$USER/tutorial/v06_43_00/reco2/prod_muminus_0-2.0GeV_isotropic_uboone/837723_0/<file_name>.root -n 1 > EventRecord.txt

You should be able to open EventRecord.txt in your favorite text editor. Take a look around and note all the different modules and their data types.
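Since the dump is large, grep is handy for finding a particular data product; for example, to see which producers made tracks (recob::Track is the larsoft track class):

grep "recob::Track" EventRecord.txt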

Compiling and Running Gallery Code

Open the .cc file demo_ReadEvent.cc. Inside, you will see a lot of comments, as well as some examples of how to read in the event info and track info. You'll need to change one line of the code (line 52) to read in one of your reco2 files from the single muon sample.

 vector<string> filenames { "path_to_your_file" }; //Change this to the full path of your muon reco2 file

If you look through the code, you'll see its primary purpose is to make a ROOT histogram, TH1F h_events, which plots the event number of each event in the input file. You'll also see an example of how to pull tracks from the PandoraCosmic producer, how to iterate over tracks, and how to write out the track length to cout.

Once you are familiar with this code, you can compile it by running the "make" command. This picks up the build rules in the Makefile:

cat Makefile
#Makefile for gallery c++ programs.
#Note: being all-inclusive here; you can cut out libraries/includes you do not need
#you can also change the flags if you want to (Werror, pedantic, etc.)

CPPFLAGS=-I $(BOOST_INC) \
         -I $(CANVAS_INC) \
         -I $(CETLIB_INC) \
         -I $(FHICLCPP_INC) \
         -I $(GALLERY_INC) \
         -I $(LARCOREOBJ_INC) \
         -I $(LARDATAOBJ_INC) \
         -I $(NUSIMDATA_INC) \
         -I $(ROOT_INC) \
         -I $(CETLIB_EXCEPT_INC)

CXXFLAGS=-std=c++14 -Wall -Werror -pedantic
CXX=g++
LDFLAGS=$$(root-config --libs) \
        -L $(CETLIB_LIB) -l cetlib \
        -L $(GALLERY_LIB) -l gallery \
        -L $(NUSIMDATA_LIB) -l nusimdata_SimulationBase \
        -L $(LARCOREOBJ_LIB) -l larcoreobj_SummaryData \
        -L $(LARDATAOBJ_LIB) -l lardataobj_RecoBase -l lardataobj_MCBase -l lardataobj_RawData -l lardataobj_OpticalDetectorData -l lardataobj_AnalysisBase

demo_ReadEvent: demo_ReadEvent.cc
    @$(CXX) $(CPPFLAGS) $(CXXFLAGS) $(LDFLAGS) -o $@ $<

all: demo_ReadEvent 

clean:
    rm *.o demo_ReadEvent 

The executable is demo_ReadEvent. Run it by typing:

./demo_ReadEvent

Making Histograms

Once demo_ReadEvent is done, you will see an output file called demo_ReadEvent_output.root in your working directory. You can open this in ROOT and look at the event numbers for the single muon sample you just produced.

root -l demo_ReadEvent_output.root
 h_events->Draw("hist")

A simple and quick XML test file

Here you can find a simple xml file sample used for quickly testing jobs: Sample XML file