Using Condor and Running on Grid from IF Cluster Machines (Old)

This page is out of date, but still contains some useful information.
A new page is being developed, but for now you can still refer to this page.

One should follow the instructions on the ifront wiki.
Sections of those instructions have been copied below for convenience.
Because of this copying, these instructions may fall out of date.

Jobs can be run under your own account on local GPCF batch nodes,
or under the novaana account on the large Fermigrid resource,
as described below.

Local Submission to IF Cluster Nodes

You submit jobs from interactive nodes like novagpvm01.
After logging in, you need to specify your group and set up access to jobsub:

  export GROUP=nova
  source /grid/fermiapp/products/common/etc/setups.sh
  setup jobsub_tools

You can run a simple command directly:

  jobsub echo HELLO  # greet the world

  jobsub /grid/fermiapp/common/tools/probe  # probe the worker node environment; you can also run this interactively

To do real work, you will also need to write a script that performs the steps for running your jobs. Example scripts are sh_loop and if_nova_mc in the
/grid/fermiapp/nova/condor-scripts/ directory. Additionally, you may find it necessary to include in your (bash) script the command
source /grid/fermiapp/nova/novaart/novasvn/setup/setup_nova.sh
to ensure the appropriate environment variables are set when your script is executed.
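
As a hedged illustration, here is a minimal sketch of what such a script might look like (the fcl file name is hypothetical):

  #! /usr/bin/env bash
  # Minimal job-script sketch; adapt the details before use.

  # Set up the NOvA environment on the worker node (jobs do not read your .bashrc).
  source /grid/fermiapp/nova/novaart/novasvn/setup/setup_nova.sh

  # Work from the job's unique scratch area.
  cd ${_CONDOR_SCRATCH_DIR}
  echo "Running on $(hostname)"
  nova -c myjob.fcl   # hypothetical fcl file; real jobs also stage input and copy output back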

The jobs are submitted by running jobsub, which takes the name of your script and associated parameters as its arguments,

  jobsub job_script args...

where job_script is the name of your script and args are the arguments for it. For example

  jobsub sh_loop 1000

You can also submit multiple copies of the same job to the queue, each identified by a process number given by the $PROCESS environment variable defined for that copy of the job. For example

  jobsub -N 100 sh_loop 1000

will submit 100 instances of sh_loop.
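
Inside your script you can use $PROCESS to keep the copies from overwriting each other; a minimal sketch (the output name is hypothetical):

  # Each of the -N copies sees its own $PROCESS value (0..N-1).
  OUTFILE=myjob.${PROCESS}.root   # hypothetical name; keeps the outputs distinct
  echo "This is copy ${PROCESS}, writing ${OUTFILE}"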

Another available option is specifying the release of novasoft to use. The release is set using the -r flag,

  jobsub -r development sh_loop 1000

To check the status of your jobs, use the nova_q command provided by jobsub_tools.
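
For example, to list jobs and pick out your own (the grep filter shown is just one simple way to do this):

  nova_q                # list all NOvA Condor jobs
  nova_q | grep $USER   # show only your own jobs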

Running on the Grid

The -g flag tells jobsub to submit your jobs to the grid. Before you can submit jobs to the grid queue, you need to register.

Your grid jobs are authenticated by Grid Certificates based on your kerberos principal.
These certificates need to be registered for access to Fermigrid.

Submit a ticket to the Service Desk (https://fermi.service-now.com/):
  • Select Service Catalog in the left frame
  • Create an Affiliation\Experiment Computing Account Request
  • Select the Experiment: E-929 (NOVA)
Once the registration is complete, you will need to log into gpsn01.fnal.gov one time to:
  • Initialize the kcron system, which lets your cron jobs get kerberos tickets
  • Make a crontab entry to keep your Grid Certificate proxies current.
Run

  kcroninit

and follow the instructions, then:

Set up your crontab file to run kproxy every 2 hours, with:

  echo '07 1-23/2 * * * /scratch/grid/kproxy nova' | crontab

Check this with

  crontab -l

If you also need proxies for another experiment of which you are a member (minos in this example), you can do something like

echo '07 1-23/2 * * * /scratch/grid/kproxy nova
07 1-23/2 * * * /scratch/grid/kproxy minos' | crontab

Now, you are ready to actually run on the grid.
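
For example, a grid submission combining the flags described above might look like this (the script name and argument are hypothetical):

  jobsub -g -N 100 -r development my_job_script.sh 1000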

Input/Output file transfers

When running on the grid, be sure to copy your input files from the NOvA area to the machine on which the grid job is actually running. This is done using the ifdh cp command documented on the REX DH Wiki. This script originated with Minos, but is used by all IF experiments.
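
A hedged sketch of fetching an input file inside a grid job (the input path is hypothetical):

  # Copy input from BlueArc to the job's local scratch area, then work locally.
  cd ${_CONDOR_SCRATCH_DIR}
  ifdh cp --force=cpn /nova/data/users/${GRID_USER}/inputs/myfile.root .   # hypothetical path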

For security purposes, grid jobs have strict controls on where they can write and execute files. Grid jobs can execute from (but cannot write to) the /nova/app areas. Grid jobs can write to (but not execute from) the /nova/data and /nova/ana areas. General purpose analysis output should be written to /nova/ana, unless the total output volume is upwards of 1 TB. In that case, users should use scratch dCache (/pnfs/nova/scratch).

Your grid jobs run under the novaana account, not your own account.
Directories to which the files are copied back need to be group-writable (mode 775, set via chmod g+w or chmod 775).

You should also make sure that the output files themselves are written group-writable, so that you can access them later.
The simplest way to ensure this is to put umask 002 in the script you submit.
If you need to change permissions on files after they are written, you can do so by submitting a chmod command to the grid.

You can now also copy the files back under your own account by using the gridftp servers, via ifdh cp --force=expftp
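
A hedged sketch of the output-handling pieces described above (the file and directory names are hypothetical):

  # In the submitted script: make new files group-writable.
  umask 002

  # Copy a result back to BlueArc under your own account via the gridftp servers.
  ifdh cp --force=expftp myresult.root /nova/ana/users/${GRID_USER}/myoutputs/myresult.root

  # If permissions still need fixing later, a chmod can be submitted as a grid job
  # (assuming jobsub accepts a plain command here, as with echo above).
  jobsub -g chmod -R g+w /nova/ana/users/${GRID_USER}/myoutputs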

Killing Condor Jobs

To terminate a Condor job, first use the condor_q command to find the Condor ID of the jobs you wish to terminate, then use condor_rm to remove them. Both of these commands are placed in your path when you run setup jobsub_tools.

So, once your path is set up as described above, you can start killing jobs. To remove a particular job use

  condor_rm <Job_ID>

To kill all of a user's jobs, use

  condor_rm <User_ID>
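
For example, a typical sequence for locally submitted jobs (the job ID shown is hypothetical):

  condor_q               # list jobs and their Condor IDs
  condor_rm 123456.0     # remove a single job (hypothetical ID)
  condor_rm ${USER}      # remove all of your own jobs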

Log and Error Files

When running under Condor, regardless of whether the job is running on the grid, Condor will place a *.cmd, *.err, *.log, and *.out file in $CONDOR_TMP for each job run. These files end with the date and time of the job run. If a job crashes or fails, they are usually the best source of information about the cause of the problem.
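
For example, one simple way to find and inspect the most recent of these files:

  ls -ltr $CONDOR_TMP | tail          # the newest .cmd/.err/.log/.out files
  less $CONDOR_TMP/<jobname>.err      # inspect the error output of a particular job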

Monitoring via web

There are a couple of webpages with information about the status of Condor jobs.

Anatomy of a Job Script

The job_script is the command that will be run under the Condor system. Typically this is an executable shell script written by the user to do some particular unit of processing. Scripts run by the batch system (either local IF nodes or the grid) generally take a common form. Remember that on the local IF batch nodes the job runs as "you" with your account id and permissions; jobs that run on the grid (generally) run as "novaana", though they do run with the "nova" group id.

Also note that /nova/app (and /nusoft/app) will be mounted read-only, executable. That means scripts, executables and libraries can reside there, but jobs will not be able to write to it. On the other hand, /nova/data and /nova/ana will be mounted writable, but no-exec. Your job can write to them (though for grid jobs, only if the directory is group writable), but executables and libraries cannot reside there. Plan accordingly.

a) Setup

This part of the script should set up the desired work environment. Jobs do not run the user's normal .bashrc, .bash_profile or equivalent; on the grid they couldn't, as home areas are in AFS and AFS is not mounted on the grid nodes, so don't even try to access AFS. If one has made use of jobsub's flags -r nova_release or -t test_rel_dir, that may be sufficient. One can forgo the -r and -t flags to jobsub and set up the NOvA (or appropriate) work environment directly if desired.

b) Work Area

The variable ${_CONDOR_SCRATCH_DIR} defines a working disk area unique for the particular job. This section of the job can create any necessary subdirectories under that area. Processing should be done with this (or a subdirectory) as the current working directory. Output files should not be written directly to BlueArc while being generated; holding open file handles to network storage for long periods is not a good idea.

c) Fetch Input

Jobs that need input files should fetch them appropriately. If fetching from BlueArc, remember to use cpn.

d) Processing

This is the heart of the job. It is here that the real processing is done.

e) Return Results

Once the real processing is done, any resulting output must be moved from the job's scratch space to permanent storage (BlueArc or tape). Use cpn to move files to BlueArc.

f) Clean Up

Condor will delete files under ${_CONDOR_SCRATCH_DIR} from the worker node. Depending on the configuration of condor, some files might automatically be transferred back to the /nova/data/condor-tmp/ user area upon completion of the job_script, so it might be wise to delete any unnecessary files at this point.

Example

#! /usr/bin/env bash

# Useful predefined env variables:
#
# ${PROCESS} is the individual job # when multiple jobs are run as a cluster 
#    via "jobsub -N <n> script args"  values [0...n-1]
# ${_CONDOR_SCRATCH_DIR} this job's unique work area
# 
# Note: on the grid $USER and such will not be "you" 
#       it appears ${GRID_USER} is (and set even for local IF batch nodes)
#

#============================================================================
# Section (a):  Setup
# Define the work environment 
# (shown here is an alternative to using jobsub -r & -t flags)

function setup_novaoffline {
  source /grid/fermiapp/nova/novaart/novasvn/srt/srt.sh
  export EXTERNALS=/nusoft/app/externals
  source $SRT_DIST/setup/setup_novasoft.sh "$@" 
}

# define this for future I/O use ... use it for copying local <==> BlueArc
export CPN='ifdh cp --force=cpn' #(force only for local GRID jobs, not for off-site)

# pick a particular release
setup_novaoffline -r S12-11-11  

# optionally setup a test release
cd /nova/app/users/${GRID_USER}/test_rel_dir  
srt_setup -a

#============================================================================
# Section (b):  Work Area

cd ${_CONDOR_SCRATCH_DIR}
MYSUBDIR=mysubdir
mkdir $MYSUBDIR

#============================================================================
# Section (c):  Fetch Input
# Assume here the 1st arg to the script is the full path specified input file

MYINPUT=$1
$CPN $MYINPUT .
MYLOCALINPUT=`basename $MYINPUT`  # just the filename, not the path

#============================================================================
# Section (d):  Processing
# Assume here the 2nd arg is the .fcl file, and 3rd & 4th are output and hist
# file names (w/out directory or extension).  Add ${PROCESS} to the output 
# file names so that separate jobs in the same condor job cluster ( -N <n>
# which otherwise get the same script args) yield distinct filenames

MYFCL=$2
MYOUT=$3.${PROCESS}.root
MYHIST=$4.${PROCESS}.root
nova -c $MYFCL -o $MYOUT -T $MYHIST $MYLOCALINPUT 

#============================================================================
# Section (e):  Return Results
# These will succeed only if mysubdir is group writable

$CPN $MYOUT  /nova/ana/users/${GRID_USER}/mysubdir
$CPN $MYHIST /nova/ana/users/${GRID_USER}/mysubdir

#============================================================================
# Section (f):  Clean Up

rm -r $MYSUBDIR $MYLOCALINPUT $MYOUT $MYHIST

echo "end-of-job" 

Don't forget that this script must live on /nova/app and must have the execute bit set (chmod +x job_script).
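
Putting this together, a hedged sketch of installing and submitting the example script (all paths and names are hypothetical):

  # Put the script on the executable /nova/app area and make it executable.
  cp my_job_script.sh /nova/app/users/${USER}/jobs/
  chmod +x /nova/app/users/${USER}/jobs/my_job_script.sh

  # Submit 10 copies to the grid: args are input file, fcl, output and hist base names.
  jobsub -g -N 10 /nova/app/users/${USER}/jobs/my_job_script.sh \
      /nova/data/users/${USER}/inputs/in.root myjob.fcl out hist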

Expert Section

CPN Lock Files

Each experiment should set up its own lock files, which determine the behavior of CPN. These files are already in place and working for NOvA.

The NOvA lock files are located in /grid/data/nova/LOCK . The current lock files were created using the following instructions and typical values.

  mkdir -p  /grid/data/nova
  mkdir -p  /grid/data/nova/LOCK
  mkdir -p  /grid/data/nova/LOCK/LOCKS
  mkdir -p  /grid/data/nova/LOCK/LOG
  mkdir -p  /grid/data/nova/LOCK/LOGS
  mkdir -p  /grid/data/nova/LOCK/QUEUE
  mkdir -p  /grid/data/nova/LOCK/STALE
  chmod -R 775 /grid/data/nova/LOCK
  find /grid/data/nova/LOCK -type d -exec chmod 775 {} \;

Then create these files under LOCK, with typical values:

  File     Typical value   Description
  glimit   99 50           global open file limit
  limit    20 5            open file limit
  perf     3               required performance (MB/sec) before locking
  PERF     *               measured performance (MB/sec), set by an outside program like bluwatch
  rate     10              target polling rate, per second
  stale    10 100          ignore locks this old (minutes)
  staleq   600             ignore queue entries this old (minutes)
  wait     5               minimum retry delay (seconds)
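
As a hedged sketch (assuming each entry above is a small text file under LOCK whose contents are the value shown), the files could be created like this:

  cd /grid/data/nova/LOCK
  echo "99 50"  > glimit   # global open file limit
  echo "20 5"   > limit    # open file limit
  echo "3"      > perf     # required performance (MB/sec) before locking
  echo "*"      > PERF     # measured performance, set by an outside program like bluwatch
  echo "10"     > rate     # target polling rate, per second
  echo "10 100" > stale    # ignore locks this old (minutes)
  echo "600"    > staleq   # ignore queue entries this old (minutes)
  echo "5"      > wait     # minimum retry delay (seconds)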