Preparing Datasets for Analysis

Our data-processing pipeline produces easy-to-fit histograms from reconstructed positrons. It consists of several steps, run as a mixture of art jobs and Python scripts:

0) Get and build the code for use with art (see here)
1) Process positrons: the RatioEast art module converts times and energies to histograms
2) Combine subruns and runs with hadd
3) Convert ROOT histogram sets to pickled data dictionaries with thist_converter.py
4) Fit with pyfitter.py and/or other code
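
As an illustration of step 2, the following is a minimal sketch of how the hadd command line could be assembled from Python. The helper name build_hadd_command and the file names are hypothetical and not part of the pipeline; the only assumptions about hadd itself are its standard form (target file first, then the source files) and its -f option to overwrite an existing target.

```python
import subprocess


def build_hadd_command(output_file, input_files, force=False):
    """Return the argument list for merging ROOT files with hadd.

    hadd expects the target file first, followed by the source files.
    """
    if not input_files:
        raise ValueError("hadd needs at least one input file")
    command = ["hadd"]
    if force:
        command.append("-f")  # overwrite the target file if it already exists
    command.append(output_file)
    command.extend(input_files)
    return command


def run_hadd(output_file, input_files, force=False):
    """Run hadd; requires a shell where ROOT (and therefore hadd) is set up."""
    subprocess.run(build_hadd_command(output_file, input_files, force), check=True)


# Hypothetical file names, for illustration only.
example = build_hadd_command(
    "ratio_run16419.root",
    ["ratio_16419_00100.root", "ratio_16419_00101.root"],
    force=True,
)
```

Building the argument list separately from running it makes it easy to inspect the command before merging a large number of subrun files.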

Processing with art

Input: an official 'production' dataset. These have names like gm2pro_daq_full_run1_HighKick_5040A_silverList and are listed on the official Run1 Production Data page [[g-2:Production_Run1_Data|here]]. By definition, one file corresponds to a single subrun. The files live in their own directories on PNFS, stored on Fermilab's magnetic tape system, but you typically won't access them directly except for testing. For full data processing we typically use the 'production script' to submit art jobs to the grid: you specify the dataset name and the needed files are looked up automatically.

RatioEast module: Sudeshna's art module is parallelized on the grid by the production script. It is configured by the FHiCL file gm2ilratio/fcl/ratio_grid.fcl. Each grid job reads its own set of 'Recon East' positrons (GlobalFit art records) and applies two randomizations to each positron time: 1) a uniformly distributed shift spanning at most one time bin, and 2) random assignment of the time to one of four histograms. The first randomization suppresses the effects of 'fast rotation', and the second fills the histograms used to construct the ratio-method fit. The module also constructs the pileup histograms (in both time and energy).
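
The two randomizations can be illustrated with a short Python sketch. This is not the RatioEast implementation: the function names, the choice to centre the uniform shift on the original time, and the plain-list histograms are all assumptions made for illustration.

```python
import random


def randomize_time(t, bin_width, rng):
    """Shift a time by a uniform offset spanning at most one bin width.

    Assumption: the offset is centred on the original time.
    """
    return t + rng.uniform(-0.5 * bin_width, 0.5 * bin_width)


def fill_four_histograms(times, bin_width, n_bins, t_min=0.0, seed=0):
    """Fill four time histograms, assigning each time to one of them at random."""
    rng = random.Random(seed)
    hists = [[0] * n_bins for _ in range(4)]
    for t in times:
        t_rand = randomize_time(t, bin_width, rng)   # randomization 1
        which = rng.randrange(4)                     # randomization 2
        index = int((t_rand - t_min) // bin_width)
        if 0 <= index < n_bins:                      # drop out-of-range times
            hists[which][index] += 1
    return hists
```

Fixing the seed makes one particular randomization reproducible; a different seed gives a different, statistically equivalent set of four histograms.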

Output: ROOT histograms (one set of histograms per subrun). Each file contains sets of U, V, & T histograms (for one particular randomization of the data) plus the time and energy histograms needed to perform and validate the pileup subtraction. All of these histograms are created separately for each calorimeter, and once more for all calorimeters combined.

Running the art Module Locally

TODO: words words words

Example

jstaplet@gm2gpvm03:~ $ . /cvmfs/gm2.opensciencegrid.org/prod/g-2/setup
g-2 software

--> To list gm2 releases, type
ups list -aK+ gm2

--> To use the latest release, do
setup gm2 v9_35_00 -q prof

For more information, see 
  https://cdcvs.fnal.gov/redmine/projects/g-2/wiki/SoftwareReleases

jstaplet@gm2gpvm03:~ $ cd /gm2/app/users/jstaplet/irmathing
jstaplet@gm2gpvm03:/gm2/app/users/jstaplet/irmathing $ . localProducts_gm2_v9_33_00_prof/setup

MRB_PROJECT=gm2
MRB_PROJECT_VERSION=v9_33_00
MRB_QUALS=prof
MRB_TOP=/gm2/app/users/jstaplet/irmathing
MRB_SOURCE=/gm2/app/users/jstaplet/irmathing/srcs
MRB_BUILDDIR=/gm2/app/users/jstaplet/irmathing/build_slf6.x86_64
MRB_INSTALL=/gm2/app/users/jstaplet/irmathing/localProducts_gm2_v9_33_00_prof

PRODUCTS=/gm2/app/users/jstaplet/irmathing/localProducts_gm2_v9_33_00_prof:/cvmfs/gm2.opensciencegrid.org/prod/g-2:/cvmfs/gm2.opensciencegrid.org/prod/external:/grid/fermiapp/products/common/db

jstaplet@gm2gpvm03:/gm2/app/users/jstaplet/irmathing $ mrb s
local product directory is /gm2/app/users/jstaplet/irmathing/localProducts_gm2_v9_33_00_prof
----------- this block should be empty ------------------
---------------------------------------------------------
The working build directory is /gm2/app/users/jstaplet/irmathing/build_slf6.x86_64
The source code directory is /gm2/app/users/jstaplet/irmathing/srcs
----------- check this block for errors -----------------------
----------------------------------------------------------------

gm2 -c ratio_grid.fcl -s /pnfs/GM2/daq/run1/offline/gm2_5039A/runs_16000/16419/gm2offline_full_19224405_16419.00100.root

Running on the Grid

TODO: words words words

mrb g gm2analyses

Edit ${MRB_SOURCE}/CMakeLists.txt and comment out the gm2analyses parts like this:

# gm2ilratio package block
set(gm2ilratio_not_in_ups true)
include_directories ( ${CMAKE_CURRENT_SOURCE_DIR}/gm2ilratio )
## gm2analyses package block
#set(gm2analyses_not_in_ups true)
#include_directories ( ${CMAKE_CURRENT_SOURCE_DIR}/gm2analyses )

and this:
ADD_SUBDIRECTORY(gm2ilratio)
#ADD_SUBDIRECTORY(gm2analyses)

This gets us the grid submission script without building gm2analyses. If you run mrb uc or mrb g again, CMakeLists.txt is overwritten and these edits must be reapplied.

Then set up the environment to use the production script:

cd ${MRB_SOURCE}/gm2analyses/ProductionScripts/produce
source ../setup

Finally submit jobs to the grid:

./gridSetupAndSubmitGM2Data.sh files_config.dat --daq --ana --tag Ratio --fhiclFile ratio_grid.fcl --sam-dataset gm2pro_daq_full_run1_60h_5039A_goldList --njobs 10 --memory 4 --cpu 2 --output-dir /pnfs/GM2/scratch/users/jstaplet/asdftest/ --localArea

Crafting FHiCL parameter scans

TODO: words

Remember:
  • source localProducts_X/setup
  • mrb s
  • source ${MRB_SOURCE}/gm2ilratio/py.env (otherwise you'll get errors about missing commands, e.g. fcl_scan.py or gridSetupAndSubmitGM2Data-fclscan.sh, or missing Python modules, e.g. pyfcl)

This assumes an already-built gm2ilratio, but a new shell.

Example scan config file:

physics.analyzers.MuonCoincidence.Emin = range(1400,2000,100)
physics.analyzers.MuonCoincidence.binWidth = 148.5,149,149.5,150

Then use that scan configuration like this:

jstaplet@gm2gpvm03:/gm2/app/users/jstaplet/irmathing $ fcl_scan.py srcs/gm2ilratio/fcl/ratio_grid.fcl scan.cfg 
Reading scan config from "scan.cfg"...
Loaded scan configuration:
  physics.analyzers.MuonCoincidence.Emin range = [1400, 1500, 1600, 1700, 1800, 1900]
  physics.analyzers.MuonCoincidence.binWidth range = (148.5, 149, 149.5, 150)

Found file srcs/gm2ilratio/fcl/ratio_grid.fcl in directory /gm2/app/users/jstaplet/irmathing/srcs/gm2ilratio/fcl
Prepending path 
  /gm2/app/users/jstaplet/irmathing/srcs/gm2ilratio/fcl
to FHICL_FILE_PATH (but you should DOUBLE-CHECK that fhiclcpp is
grabbing the file at the path that you expect!)

Reading parameters from srcs/gm2ilratio/fcl/ratio_grid.fcl...
...done.

Exporting filename/configuration list to "files_config.dat"...
...closing "files_config.dat".

Writing FHiCL config files with parameter permutations...
srcs/gm2ilratio/fcl/ratio_grid-001.fcl:
  physics.analyzers.MuonCoincidence.Emin:1400
  physics.analyzers.MuonCoincidence.binWidth:148.5
srcs/gm2ilratio/fcl/ratio_grid-002.fcl:
  physics.analyzers.MuonCoincidence.Emin:1400
  physics.analyzers.MuonCoincidence.binWidth:149
srcs/gm2ilratio/fcl/ratio_grid-003.fcl:
  physics.analyzers.MuonCoincidence.Emin:1400
  physics.analyzers.MuonCoincidence.binWidth:149.5
srcs/gm2ilratio/fcl/ratio_grid-004.fcl:
  physics.analyzers.MuonCoincidence.Emin:1400
  physics.analyzers.MuonCoincidence.binWidth:150
srcs/gm2ilratio/fcl/ratio_grid-005.fcl:
  physics.analyzers.MuonCoincidence.Emin:1500
  physics.analyzers.MuonCoincidence.binWidth:148.5
... 
srcs/gm2ilratio/fcl/ratio_grid-023.fcl:
  physics.analyzers.MuonCoincidence.Emin:1900
  physics.analyzers.MuonCoincidence.binWidth:149.5
srcs/gm2ilratio/fcl/ratio_grid-024.fcl:
  physics.analyzers.MuonCoincidence.Emin:1900
  physics.analyzers.MuonCoincidence.binWidth:150

Note that a table of all files and their corresponding parameters is saved to files_config.dat. (This file is needed to run all of these FHiCL files as grid jobs.)
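
The numbering of the generated files follows from taking every combination of the scanned values, with the last parameter in the scan config varying fastest. The sketch below reproduces that ordering for the example scan above. It is not the code of fcl_scan.py: the function names expand_scan and numbered_files are hypothetical, and the file-name pattern is simply copied from the listing above.

```python
import itertools


def expand_scan(scan):
    """Yield one {parameter: value} dict per combination of scanned values.

    The last parameter in `scan` varies fastest, matching the listing above.
    """
    names = list(scan)
    for values in itertools.product(*(scan[name] for name in names)):
        yield dict(zip(names, values))


def numbered_files(base, scan):
    """Pair each combination with a file name like <base>-001.fcl."""
    return [
        (f"{base}-{number:03d}.fcl", params)
        for number, params in enumerate(expand_scan(scan), start=1)
    ]


scan = {
    "physics.analyzers.MuonCoincidence.Emin": range(1400, 2000, 100),
    "physics.analyzers.MuonCoincidence.binWidth": (148.5, 149, 149.5, 150),
}
files = numbered_files("ratio_grid", scan)
```

With six Emin values and four binWidth values this gives 6 × 4 = 24 files, so the number of grid configurations grows multiplicatively with each added scan parameter.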