BestPractices

ifdh cp offers many options. This document compares the tasks people commonly
perform with it and makes recommendations.

Getting input files

Copying out output

If any of your output is going to BlueArc, which needs to be protected by locks,
you should:

Group into one copy command

Use a single ifdh cp command, separating the source and destination sets with a semicolon (backslash-escaped in the shell); for example:

   ifdh cp -D d1/*.log $LOGDEST \; d2/*.root $ROOTDEST \; ...

or even better, make a file of source and destination pairs, and use

   ifdh cp -f filelist

or possibly use the ifdh addOutputFile / ifdh copyBackOutput pair if you're copying lots of files
back to a common destination.
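
As a rough illustration of the file-list approach, the sketch below builds the list and hands it to a single ifdh cp -f call. It assumes each line of the list is one "source destination" pair separated by whitespace, with the destination given as a full target path; check that against your ifdh version. The append_pairs helper, the FILELIST variable, and the /tmp default destinations are hypothetical names used only for this example.

```shell
#!/bin/sh
# append_pairs LISTFILE DESTDIR FILE...
# Append one "source destination" line per existing FILE to LISTFILE.
# ASSUMPTION: ifdh cp -f reads whitespace-separated source/destination pairs,
# one pair per line. Paths containing spaces are not handled by this sketch.
append_pairs() {
    list=$1
    dest=$2
    shift 2
    for f in "$@"; do
        [ -e "$f" ] || continue   # skip an unexpanded pattern when nothing matches
        printf '%s %s\n' "$f" "$dest/$(basename "$f")" >> "$list"
    done
}

# Placeholder destinations (hypothetical); set LOGDEST and ROOTDEST for real use.
LOGDEST=${LOGDEST:-/tmp/example_logdest}
ROOTDEST=${ROOTDEST:-/tmp/example_rootdest}
FILELIST=${FILELIST:-filelist.txt}

: > "$FILELIST"                                # start with an empty list
append_pairs "$FILELIST" "$LOGDEST" d1/*.log
append_pairs "$FILELIST" "$ROOTDEST" d2/*.root

# Run the single copy only when ifdh is available and there is something to copy.
if command -v ifdh >/dev/null 2>&1 && [ -s "$FILELIST" ]; then
    ifdh cp -f "$FILELIST"
else
    echo "would run: ifdh cp -f $FILELIST"
fi
```

Keeping everything in one list means one ifdh cp invocation handles all of the output, rather than one invocation per file type.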

Avoid using ifdh cp -r

While it works in many contexts, the cp -r facility is inefficient (particularly on-site to BlueArc) and sometimes
problematic (uboone has seen overload problems that are still not understood). It is better to make explicit file lists or use a couple of directory/* entries.
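
One possible way to replace a recursive copy with an explicit list is to walk the directory with find and emit a source/destination pair per file. This is only a sketch: tree_to_pairs, OUTDIR, DEST, and FILELIST are hypothetical names, the pair-per-line format for ifdh cp -f is assumed, and the destination subdirectories are assumed to exist already.

```shell
#!/bin/sh
# tree_to_pairs SRCDIR DESTBASE
# Print one "source destination" line for every regular file under SRCDIR,
# keeping each file's path relative to SRCDIR beneath DESTBASE.
# ASSUMPTION: paths contain no whitespace or newlines.
tree_to_pairs() {
    src=${1%/}     # drop a trailing slash, if any
    dest=${2%/}
    find "$src" -type f | sort | while IFS= read -r f; do
        printf '%s %s\n' "$f" "$dest/${f#"$src"/}"
    done
}

# Hypothetical defaults; point these at a real output directory and destination.
OUTDIR=${OUTDIR:-output}
DEST=${DEST:-/tmp/example_dest}
FILELIST=${FILELIST:-filelist.txt}

if [ -d "$OUTDIR" ]; then
    tree_to_pairs "$OUTDIR" "$DEST" > "$FILELIST"
    echo "wrote $(wc -l < "$FILELIST") pairs to $FILELIST"
    # then: ifdh cp -f "$FILELIST"
fi
```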

Use IFDH_STAGE_VIA where appropriate

You can have ifdh stage files via an SRM location. This is a Good Thing when working at sites (like SMU)
that have high network bandwidth to their local SRM, and good bandwidth from their SRM to Fermilab, but not
from the worker nodes directly to Fermilab. We are also investigating whether this will work for using
Amazon S3 storage when operating on the Amazon cloud.

To do this, you just:

   export IFDH_STAGE_VIA=srm://host/path/...
   jobsub_submit -e IFDH_STAGE_VIA ...

This stages data through that SRM and has one job manage the third-party transfers back to Fermilab.
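
A submission wrapper might guard against a mistyped staging location before exporting the variable. The sketch below is one way to do that; is_srm_url and STAGE_URL are hypothetical names, and the check accepts only the srm:// form shown above, which is an assumption you may need to relax if your site stages through another scheme.

```shell
#!/bin/sh
# is_srm_url VALUE
# Succeed only if VALUE starts with srm:// and has something after it.
# ASSUMPTION: only srm:// locations are treated as valid staging targets here.
is_srm_url() {
    case $1 in
        srm://?*) return 0 ;;
        *)        return 1 ;;
    esac
}

# STAGE_URL is a hypothetical variable holding your site's SRM location.
if is_srm_url "${STAGE_URL:-}"; then
    export IFDH_STAGE_VIA="$STAGE_URL"
    echo "staging via $IFDH_STAGE_VIA"
    # Submit as usual, passing the variable through to the job:
    #   jobsub_submit -e IFDH_STAGE_VIA ...
else
    echo "STAGE_URL is not an srm:// URL; IFDH_STAGE_VIA left unset" >&2
fi
```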