Matt's Summary of grid best practice

0. Run test jobs to check that the workflow is correct and to determine the memory usage, disk usage, and wall time. Then increase the number of input files per job so that each job runs for 2-4 hours, or until you reach the 35 GB disk limit on the grid worker node.
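
The relevant knobs are the <numjobs> and <maxfilesperjob> elements and the jobsub resource-request options passed through <jobsub>. A minimal sketch; the numbers here are placeholders to be replaced with what your test jobs actually measured:

    <stage name="test">
      <numjobs>5</numjobs>                 <!-- a handful of test jobs first -->
      <maxfilesperjob>1</maxfilesperjob>   <!-- raise once the per-file cost is known -->
      <jobsub>--memory=2000MB --disk=20GB --expected-lifetime=4h</jobsub>
    </stage>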


1. Set <workdir> in your project.py xml file to point to your directory on resilient dCache.
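
For example (the path is a placeholder for your own resilient dCache area):

    <workdir>/pnfs/uboone/resilient/users/yourusername/work/myproject</workdir>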


2. If you are using a local build, copy your local.tar to resilient dCache. 
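
One way to do this from a gpvm, with placeholder paths (make_tar_uboone.sh is the MicroBooNE helper for building the tarball; a plain tar of your localProducts area also works):

    make_tar_uboone.sh local.tar
    cp local.tar /pnfs/uboone/resilient/users/yourusername/tars/

Then point the <local> element of the <larsoft> block in your xml file at the copied tarball:

    <larsoft>
      <local>/pnfs/uboone/resilient/users/yourusername/tars/local.tar</local>
    </larsoft>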


3. Use the most up-to-date project.py (which enables XRootD streaming, does not copy setup_uboone.sh to each grid worker node, and tars up non-ROOT output files before copying them back from the grid worker node).


4. Add --append_condor_requirements='(TARGET.HAS_CVMFS_uboone_opensciencegrid_org==true)' to the <jobsub> and <jobsub_start> elements of your xml file.
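
In context, both elements carry the same requirement string (alongside any other jobsub options you already pass):

    <jobsub>--append_condor_requirements='(TARGET.HAS_CVMFS_uboone_opensciencegrid_org==true)'</jobsub>
    <jobsub_start>--append_condor_requirements='(TARGET.HAS_CVMFS_uboone_opensciencegrid_org==true)'</jobsub_start>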


5. Write the output of your jobs to dCache scratch. 
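
A sketch with placeholder paths; <outdir> and <logdir> are the standard larbatch elements for job output and log files:

    <outdir>/pnfs/uboone/scratch/users/yourusername/output/myproject</outdir>
    <logdir>/pnfs/uboone/scratch/users/yourusername/log/myproject</logdir>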


6. Do not submit more than 10,000 jobs to the grid over a given 10-minute period.


7. Set <bookdir> to your directory on /uboone/data. 
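
For example (placeholder path):

    <bookdir>/uboone/data/users/yourusername/book/myproject</bookdir>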


8. If you are doing a large production, run project.py --check periodically throughout the production rather than waiting until the end.
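
For example, with placeholder xml file and stage names:

    project.py --xml myproject.xml --stage reco --check

    # for stages whose output is analysis ROOT files, use --checkana instead
    project.py --xml myproject.xml --stage mergeana --checkana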


9. If you are using file lists, you must add <schema>root</schema> to each stage in your xml file (assuming the input files are ROOT files). If you are using a SAM dataset definition and reading in non-ROOT files, you must instead add <schema>gsiftp</schema> to each stage.
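
For a file-list stage this looks like the following (placeholder paths):

    <stage name="reco">
      <inputlist>/uboone/data/users/yourusername/lists/myfiles.list</inputlist>
      <schema>root</schema>
    </stage>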


10. If you are submitting more than 500 jobs at one time, use project.py's ability to create recursive SAM dataset definitions, described in Herb's Grid Best Practice slides.
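
See Herb's slides for the authoritative recipe. As a rough illustration of the idea only, a recursive definition is a SAM dataset definition that drains as its files are processed, along these lines (all names are placeholders, and the exact dimensions project.py generates may differ):

    samweb create-definition yourusername_mydef_recur \
      "defname: yourusername_mydef minus (project_name like 'yourusername_mydef_%' and consumed_status consumed)"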