CPN documentation

BACKGROUND

The NFS-mounted Fermilab Bluearc system, as used by the Intensity Frontier experiments,
is deployed as RAID-6 file systems with no more than 50 to 100 physical disks per system.
To avoid disk-head contention from the thousands of potential Fermigrid client processes,
we must limit the number of simultaneous data copies.

We require client processes to work with local copies on worker nodes,
copying the files to and from Bluearc with a 'cpn' wrapper script.

cpn is a very light wrapper around cp.
It takes out a lock with /grid/fermiapp/common/tools/lock, does the cp, then releases the lock with 'lock free'.
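For illustration, the sequence cpn performs amounts to roughly the following (a minimal sketch, assuming the bare 'lock' command acquires a lock and 'lock free' releases it, as described above; the real script adds error handling and logging):

LOCK=/grid/fermiapp/common/tools/lock
$LOCK                      # wait in the queue until a lock is granted
cp "$1" "$2"               # the actual copy
STATUS=$?
$LOCK free                 # release the lock for the next waiter
exit $STATUS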

cpn has been replaced by ifdhc, which should be used instead.
They use the same locks.

The locks are files under /grid/data/${GROUP}/LOCK/LOCKS.
The process is entirely client-driven, with no potential points of failure other than the Bluearc server itself.
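
Since the locks are ordinary files, the lock area can be inspected directly with standard tools. A read-only illustration (the GROUP value here is hypothetical; substitute your own experiment group):

GROUP=minos
ls /grid/data/${GROUP}/LOCK/LOCKS | wc -l    # number of lock files currently present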

USAGE

The 'jobsub' job submission script will take care of copying your files appropriately
https://cdcvs.fnal.gov/redmine/projects/ifront/wiki/UsingJobSub

For those not using jobsub:

Grid jobs should use the dynamic local area ${_CONDOR_SCRATCH_DIR}/work.
Jobs need to mkdir ${_CONDOR_SCRATCH_DIR}/work themselves.
DO NOT put files directly into ${_CONDOR_SCRATCH_DIR},
as these would be copied back to /tmp on the submission node,
possibly filling that disk and causing global failures.

We recommend that you copy files between Bluearc and local disk
using the ifdh cp utility, which will use CPN locks to prevent
having too many concurrent copies running.

For example,

source /grid/fermiapp/products/common/etc/setups.sh    # set up the UPS products environment
setup ifdhc                                            # make the 'ifdh' command available
mkdir -p ${_CONDOR_SCRATCH_DIR}/work                   # create the local work area
FILE=B070717_160001.mbeam.root
ifdh cp /minos/data/beam_data/2007-07/${FILE} ${_CONDOR_SCRATCH_DIR}/work/${FILE}    # copy under a CPN lock
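
Output files should go back to Bluearc the same way once the job is done. A sketch (the destination path is illustrative only; adjust it to your own area):

ifdh cp ${_CONDOR_SCRATCH_DIR}/work/myoutput.root /minos/data/users/yourname/myoutput.root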

You can check the status of locks with the underlying CPN lock utilities:

lock status

or all locks globally with
lock statusall
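
If the 'lock' command is not already on your PATH, it can be run by its full path as given above; a sketch:

/grid/fermiapp/common/tools/lock status       # current locks and queue
/grid/fermiapp/common/tools/lock statusall    # all locks, globally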

LOGS

All ifdhc and cpn lock activity is logged.
The logs are kept short-term as files in /grid/data/${GROUP}/LOCK/LOG,
and long-term as monthly files under /grid/data/${GROUP}/LOCK/LOGS.
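
For example, to look for your own recent records (a sketch; the file naming under LOG/ is not specified here, so the wildcard and the use of $USER are assumptions):

GROUP=minos                                  # substitute your experiment group
grep $USER /grid/data/${GROUP}/LOCK/LOG/*    # recent lock records mentioning you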

The log records look like
20140630.23:58:51.79.1.fnpc4014.11448.novaana.aurisano.5.131
The period-separated fields are
  • UTC YYYYMMDD
  • UTC time
  • seconds in queue waiting for a lock
  • seconds the lock was held
  • hostname
  • PID on the host
  • account
  • username from grid proxy if it exists
  • number of other locks
  • number of queue entries
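
As an illustration, a record is easy to split on '.' with awk; a minimal sketch that pulls the queue-wait and hold times out of the example record above:

echo 20140630.23:58:51.79.1.fnpc4014.11448.novaana.aurisano.5.131 | \
  awk -F. '{print "waited "$3"s, held "$4"s, host "$5", user "$8}'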

RELEASES

ADMINISTRATION

HISTORY

DEVELOPMENT

PERFORMANCE

BLUEARC LOAD TESTS

CI for ITIL management