MicroBooNE Reco Processing Example MCC8
These instructions are for running reconstruction on samples going into the MicroBooNE Cincinnati Workshop 2017. They do not cover all of the options that are possible within project.py and most importantly do not cover information about how to generate custom samples (through either custom fcl or modules). There are some general comments about grid submission snuck into the instructions, so enjoy and learn!
Configuring your project.py xml file
There is a template file to base your project xml file on. Copy it into your app volume area:
cp /uboone/app/users/kirby/single_particle_mcc8_template/test_reco_dataset_template.xml /uboone/app/users/$USER/<name_your_working_area>/<name_your_file>.xml
Note that the /uboone/app/users/ area should be used to store configuration files and uboonecode builds, but should never be used to store data files. At the top of the xml file you will find this section, which must be completed by the analyzer:
<!DOCTYPE project [
<!ENTITY user_id "put you user name here"> <!-- e.g. kirby, mrmooney, etc. this is the same as your kerberos principal UID and FNAL email -->
<!ENTITY number_of_jobs "number of jobs to run">
<!-- This has to be equal to or less than the number of files in your dataset. check with samweb list-definition-files \-\-summary <your_defname> -->
<!-- It should also be equal to the number of events you want divided by 50. so if you want 10K, set to 200. But read the line above again first. -->
<!ENTITY defname "SAM dataset name"> <!-- name of the sample that you are running reco over -->
<!ENTITY name "test_&defname;"> <!-- this is the test_<defname> from above -->
<!-- Note that the name will be used for the name of output files so please use something reasonable -->
<!-- Examples are here:
<!ENTITY user_id "kirby">
<!ENTITY number_of_jobs "2"> <- I'm setting this to 2 for testing. it would normaly be something like 200
<!ENTITY defname "prod_muminus_0-2.0GeV_isotropic_uboone_mcc8_detsim">
<!ENTITY name "test_&defname;">
-->
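The two rules in the template comments (events wanted divided by 50, and never more jobs than files in the dataset) can be combined in a few lines of shell arithmetic. This is only a sketch: the variable names and the file count of 250 are made up for illustration, and you would take the real file count from the samweb summary for your own dataset.

```shell
# Hypothetical inputs; replace with your own numbers.
events_wanted=10000      # total events you want reconstructed
events_per_job=50        # events per job, per the template comment
files_in_dataset=250     # assumed value; read it from the samweb summary of your defname

# Round up so that a partial batch of events still gets a job.
number_of_jobs=$(( (events_wanted + events_per_job - 1) / events_per_job ))

# The job count must not exceed the number of files in the dataset.
if [ "$number_of_jobs" -gt "$files_in_dataset" ]; then
    number_of_jobs=$files_in_dataset
fi

echo "number_of_jobs=$number_of_jobs"
```

With these inputs the result is 200, matching the 10K example in the template comment. If the dataset had fewer than 200 files, the cap would win and you would get fewer events than requested.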
You really shouldn't change the fcl files in this xml file. The only things you should change are the user id, the number of jobs, and the SAM dataset definition. You can also change whether or not your dataset had spacecharge or DDR on.
NOTE!!!! DO NOT IGNORE THIS and keep track of these settings for your sample!!!
Once you have edited your file, you need to make sure that the staging and output directories are ready. You create those directories with these commands:
mkdir -p /pnfs/uboone/scratch/users/$USER #This creates a directory in the dCache scratch area
mkdir -p /uboone/data/users/$USER #this creates a directory in the BlueArc data volume
Note that the dCache scratch area does not have a quota, but files have a limited lifetime before they are flushed from the volume (usually ~30 days), and there is NO guarantee that files will be stored permanently. The BlueArc data volume, by contrast, has a 1.5 TiB quota for each user, but the storage is permanent. For full details look here: https://redmine.fnal.gov/redmine/projects/fife/wiki/Understanding_storage_volumes
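Because scratch files can disappear, it can help to see which of your files are getting old. The sketch below lists files by modification time using standard find. The helper name list_old_files and the 20 day threshold are my own choices, and modification time is only a rough proxy: the actual flush policy of the scratch area may depend on other things, so treat the output as a warning and not a guarantee.

```shell
# List regular files under a directory that were last modified
# more than a given number of days ago.
list_old_files() {
    dir=$1
    days=$2
    find "$dir" -type f -mtime +"$days" -print
}

# Example call (20 days is an arbitrary warning threshold):
# list_old_files "/pnfs/uboone/scratch/users/$USER" 20
```

Anything this prints is a candidate for moving to permanent storage soon.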
Now you're ready to submit jobs. First set up the environment with a current version of uboonecode so that you get a current version of the larbatch UPS product.
source /cvmfs/uboone.opensciencegrid.org/products/setup_uboone.sh
setup uboonecode v06_26_01 -q e10:prof
cd /uboone/app/users/$USER/<name_your_working_area>/
project.py --xml `pwd`/test_muminus_reco.xml --stage reco --submit #this submits the "reco" stage of the project which is reco1+reco2
jobsub_q --user=$USER
The last command checks that the jobs have been submitted. You should see a list of jobs with status "I" (idle), and the number of jobs listed should equal the xml variable number_of_jobs. Once all the jobs are complete, run this command:
project.py --xml `pwd`/test_muminus_reco.xml --stage reco --check #this checks the "reco" stage files to make sure they were produced successfully
If there are no errors, then you can continue to the mergeana stage.
project.py --xml `pwd`/test_muminus_reco.xml --stage mergeana --submit #this submits the "mergeana" stage of the project which produces AnaTree files
project.py --xml `pwd`/test_muminus_reco.xml --stage mergeana --checkana #NOTE CHECKANA!!!!! this checks the "mergeana" stage files to make sure they were produced successfully
You should now have files in three locations within /pnfs/uboone/scratch (again these files have a finite lifetime since they are in a scratch area!!!!). Note that these are the xml variables (e.g. "&tag;"), and so you will have to translate them:
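As a rough sketch of that translation, the snippet below builds a scratch directory from shell variables that stand in for the xml entities. The directory layout is inferred from the listing commands elsewhere on this page, the values are the ones from the worked example, and the variable name release and the helper scratch_dir are hypothetical, so check the result against your own xml file.

```shell
# Values taken from the worked example on this page; substitute your own.
user_id="kirby"
tag="mcc8"
release="v06_26_01"
name="test_muminus_reco"

# Build the scratch output directory for one stage (assumed layout).
scratch_dir() {
    stage=$1
    echo "/pnfs/uboone/scratch/users/${user_id}/${tag}/${release}/${stage}/${name}"
}

scratch_dir reco
scratch_dir mergeana
```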
Next we need to move them to permanent storage. We will do this using the SAM4Users utilities (http://microboone-docdb.fnal.gov:8080/cgi-bin/ShowDocument?docid=6896). These commands will have to be translated slightly, but hopefully you understand what is being done. First we will make text files that contain the full paths to the files we've generated:
###ls /pnfs/uboone/scratch/users/&user_id;/&tag;/&relsim/sim/&name;/*/&name;*.root >& /uboone/app/users/$USER/&name;_sim_filelist.txt
#this has to be translated and using the example above would become
ls /pnfs/uboone/scratch/users/kirby/mcc8/v06_26_01/reco/test_muminus_reco/*/test_muminus_reco*.root >& /uboone/app/users/$USER/test_muminus_reco_filelist.txt
ls /pnfs/uboone/scratch/users/kirby/mcc8/v06_26_01/mergeana/test_muminus_reco/*/ana*.root >& /uboone/app/users/$USER/test_muminus_mergeana_filelist.txt
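Since ">&" sends error messages from ls into the same text file as the paths, it is worth a quick sanity check of each file list before declaring anything to SAM. The helper below is a minimal sketch with a made-up name (check_filelist). It only confirms that the list is non-empty and that every line ends in .root; it does not check that the files are readable.

```shell
# Succeed only if the file list is non-empty and every line is a .root path.
check_filelist() {
    list=$1
    [ -s "$list" ] || { echo "empty or missing: $list"; return 1; }
    # Count the lines that do NOT end in .root (e.g. captured ls errors).
    bad=$(grep -vc '\.root$' "$list")
    if [ "$bad" -ne 0 ]; then
        echo "$bad line(s) in $list are not .root paths"
        return 1
    fi
    echo "$(wc -l < "$list") files listed in $list"
}

# Example call:
# check_filelist /uboone/app/users/$USER/test_muminus_reco_filelist.txt
```

The number of files it reports should match what you expect from number_of_jobs.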
Now we're going to set up the FIFE utils UPS product, declare those files to SAM4Users, and then copy them to the tape-backed archive. But you must first come up with a dataset name. I recommend "${USER}_mcc8_&name;_v1" and, if you regenerate, incrementing the "v1" on the end. So this looks like:
source /cvmfs/uboone.opensciencegrid.org/products/setup_uboone.sh
setup uboonecode v06_26_01 -q e10:prof
setup fife_utils
#write ${USER} with braces, otherwise the shell looks for a variable named USER_mcc8_... and the user name is dropped
##sam_add_dataset -n ${USER}_mcc8_&name;_v1 -f /uboone/app/users/$USER/&name;_reco_filelist.txt
sam_add_dataset -n ${USER}_mcc8_test_muminus_reco_v1 -f /uboone/app/users/$USER/test_muminus_reco_filelist.txt
sam_move2archive_dataset -n ${USER}_mcc8_test_muminus_reco_v1
sam_add_dataset -n ${USER}_mcc8_test_muminus_mergeana_v1 -f /uboone/app/users/$USER/test_muminus_mergeana_filelist.txt
sam_move2archive_dataset -n ${USER}_mcc8_test_muminus_mergeana_v1
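One shell detail matters when a dataset name starts with your user name: an underscore directly after a variable name is read as part of that name. The short demonstration below uses a stand-in variable (user, set to "kirby") so that it does not depend on your login. The same rule applies to $USER.

```shell
user="kirby"                 # hypothetical stand-in for $USER
name="test_muminus_reco"

# Without braces the shell expands a variable called user_mcc8_,
# which is unset, so the user name silently vanishes.
wrong="$user_mcc8_${name}_v1"

# With braces the variable name ends where intended.
right="${user}_mcc8_${name}_v1"

echo "without braces: $wrong"
echo "with braces:    $right"
```

The first name comes out as test_muminus_reco_v1 and the second as kirby_mcc8_test_muminus_reco_v1, so only the braced form gives a dataset name that is unique to you.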
At that point, the files are removed from dCache scratch space and moved to tape-backed, permanent storage. You will now need to access them through the SAM dataset definitions.