Using CVMFS

Welcome to MINOS CVMFS!

The CERN Virtual Machine File System (CVMFS) is a distributed file system for providing an experiment's code and libraries to interactive
nodes and grids worldwide. It is used by CMS and ATLAS as well as most experiments at FNAL.

The code manager copies a code release to a CVMFS work space and "publishes" it. This process examines the code, compresses it,
and inserts it in a database. The original database is called the tier 0 copy. Remote sites may support tier 1 copies of the
database, synced to the tier 0.
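For orientation, the publish step on the repository server looks roughly like the sketch below. Only the code manager runs this on the
tier 0 machine; the repository name and release path shown here are illustrative, not the exact commands used for MINOS.

# Run by the code manager on the tier 0 (repository) server -- users never do this.
# Repository name and release path are illustrative.
cvmfs_server transaction minos.opensciencegrid.org       # open the repository for writing
cp -r /path/to/R3.08 /cvmfs/minos.opensciencegrid.org/minos/minossoft/releases/
cvmfs_server publish minos.opensciencegrid.org           # compress, catalog and publish the new files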

The user's grid job sees a CVMFS disk mounted and containing a copy of the experiment's code, which can be accessed in any way
the code would be accessed on a standard disk. The disk is actually a custom network file system with a small (~8 GB) local cache on
the node and a backend that sends file requests to a squid web cache. The squid may get its data from the tier 1 database, if
available, or from the tier 0. As a practical matter, most grid jobs do not access much of a release, usually just a small set
of shared object libraries, and these end up cached on the worker node or on the squid, thereby avoiding a long-distance
network transfer.
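If you want to convince yourself that a node can see the repository, a quick check (assuming the standard MINOS repository path and that
the cvmfs client tools are installed on the node) is:

# Quick sanity check that the MINOS CVMFS repository is mounted and reachable
ls /cvmfs/minos.opensciencegrid.org/
cvmfs_config probe minos.opensciencegrid.org    # prints OK if the repository can be reached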

CVMFS is efficient only for distributing code and small data files which are required by a large number of nodes on the grid.
Datasets, on the other hand, such as event data files, consist of many files which are each sent to only one node during a grid job.
CVMFS is not efficient for this type of data distribution or for this sort of data volume. Data files should be distributed
through dCache, which is designed to deliver each file to one node and to handle the data volume. A single large file which
is to be distributed to all nodes also needs to be avoided, since it would churn or overflow the small local caches.
Examples of this sort of file are the Genie flux files or a large analysis fit template library.
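In practice that means large inputs live in dCache under /pnfs and are copied into the job's local scratch area with ifdh rather than
being read from CVMFS. A minimal sketch (the file name and path are illustrative):

# Pull a large event/flux file from dCache into local scratch instead of reading it from CVMFS
setup ifdhc
ifdh cp /pnfs/minos/scratch/users/${USER}/flux/genie_flux_0001.root \
        ${_CONDOR_SCRATCH_DIR}/genie_flux_0001.root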

The limitations of CVMFS can be found here (don't worry too much about this unless you maintain CVMFS):
http://cernvm.cern.ch/portal/filesystem/repository-limits

How to submit a job for OFFSITE

This section will explain what your job.sh script and submit.sh script should look like. Get into the habit of submitting jobs like this; this way works for both ONSITE and OFFSITE.

I have made some changes to the script that is loaded when performing

setup_jobsub

The first change is to create a command

jobsub_offsite

This command includes the options

--resource-provides=usage_model=OFFSITE 
--site=Caltech,BNL,Michigan
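For reference, the wrapper amounts to something like the sketch below (this is the idea, not the exact definition in the setup script);
it simply passes any extra arguments through to jobsub:

# Sketch of what the jobsub_offsite wrapper amounts to (not the exact definition)
jobsub_offsite () {
    jobsub --resource-provides=usage_model=OFFSITE \
           --site=Caltech,BNL,Michigan \
           "$@"
}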

The recommended sites from FIFE are --site=Wisconsin,Nebraska,Omaha,SU-OG,NotreDame,Caltech,BNL,UCSD,Michigan. However, I have been unable to get the other sites to work; there are minor errors which still need addressing.

A useful website to see a list of available sites is
https://cdcvs.fnal.gov/redmine/projects/fife/wiki/Information_about_job_submission_to_OSG_sites

This has a list of requirements that one must meet in order to have a job successfully land on a given site. By default, for now, jobsub_offsite uses

--disk=5GB --memory=1GB --expected-lifetime=2h
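These can be raised per submission if your jobs need more, assuming the wrapper passes extra options through to jobsub; for example (values are illustrative):

# Requesting larger resources for a heavier job (values are illustrative)
jobsub_offsite --disk=10GB --memory=2GB --expected-lifetime=8h -100 ${path}submit.sh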

To submit a job, the process is identical to onsite. Inside grid.sh the only command is

jobsub_offsite -100 ${path}submit.sh

This will send 100 jobs offsite, currently to BNL, Caltech and Michigan. The same submit.sh script will work both off and on site, so you could split your jobs up 50/50:

jobsub_offsite -50 ${path}submit.sh
jobsub -50 ${path}submit.sh
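Put together, a grid.sh for such a split could look like the sketch below; the ${path} variable is assumed to point at the directory
containing submit.sh, and the location shown is illustrative:

#!/usr/bin/env bash
# Sketch of a grid.sh splitting 100 jobs evenly between offsite and onsite.
# ${path} is assumed to point at the directory containing submit.sh.
path=/minos/app/users/${USER}/gridjobs/
jobsub_offsite -50 ${path}submit.sh
jobsub         -50 ${path}submit.sh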

Offsite jobs have a tendency to take a long time to ramp up, but they will start eventually. If you have only a few jobs and want them done fast, I suggest running them onsite. Offsite is effective for large volumes.

Submit script

Below is a submission script one could use as a template. The key feature to take note of is the fact that you have to tar your entire test release (or just the bits your job needs). Everything then goes into PNFS,
where it will be copied over to the offsite node through IFDH.
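Before submitting, the tarballs that the script copies out of PNFS have to exist there. Something like the following, run once from the
interactive node, would stage them; the directory names here are illustrative, while the tarball names match those used in the script below.

# Run once on the interactive node before submitting.
# Tar the test release and input files, then copy them to PNFS scratch.
tar -czf testrelease.tar.gz NameOfMyTestRelease/
tar -czf inputfiles.tar.gz  myinputfiles/
ifdh cp testrelease.tar.gz /pnfs/minos/scratch/users/${username}/testrelease.tar.gz
ifdh cp inputfiles.tar.gz  /pnfs/minos/scratch/users/${username}/inputfiles.tar.gz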

#!/usr/bin/env bash                                                                                                                                                                                   

# Lets get some details                                                                                                                                                                               
echo "THIS IS OFFICALLY THE START OF MY SUBMISSION SCRIPT." 
echo Start  `date`
echo "the worker node is " `hostname` "OS: "  `uname -a`
echo "You are running as user `whoami`" 

# I need to do this so IFDH setup will work                                                                                                                                                                                             
case `uname -r` in
    3.*) export UPS_OVERRIDE="-H Linux+2.6-2.12";;
    4.*) export UPS_OVERRIDE="-H Linux+2.6-2.12";;
esac

# Always cd to the scratch area                                                                                                                                                                   
cd $_CONDOR_SCRATCH_DIR
echo " " 
echo "MY CONDOR SCRATCH = ${_CONDOR_SCRATCH_DIR}" 

# I like keeping track of this                                                                                                                                                                        
mypwd=`pwd`

# source the correct thing so I can use the "setup" command                                                                                                                                           
echo "source /cvmfs/fermilab.opensciencegrid.org/products/common/etc/setup" 
source /cvmfs/fermilab.opensciencegrid.org/products/common/etc/setup

# Setup IFDH                                                                                                                                                                                          
echo "setup ifdhc v1_6_2 -z /cvmfs/fermilab.opensciencegrid.org/products/common/db" 
setup ifdhc v1_6_2 -z /cvmfs/fermilab.opensciencegrid.org/products/common/db

# These are from dCache         
# I need to tar my test release and inputfiles and place them in PNFS
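# NOTE: ${username} is assumed to be set earlier in the environment (it is not defined in this script)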

ifdh cp /pnfs/minos/scratch/users/${username}/testrelease.tar.gz       ${mypwd}/testrelease.tar.gz
ifdh cp /pnfs/minos/scratch/users/${username}/inputfiles.tar.gz          ${mypwd}/inputfiles.tar.gz

# Now I untar everything                                                                                                                                                                              
echo "Now untar-ing" 
tar -zxf testrelease.tar.gz
tar -zxf inputfiles.tar.gz

echo "Setting up Test Release" 
echo "/cvmfs/minos.opensciencegrid.org/minos/minossoft/setup/setup_minossoft_FNALU.sh -r R3.08 -O" 
source /cvmfs/minos.opensciencegrid.org/minos/minossoft/setup/setup_minossoft_FNALU.sh -r R3.08 -O

# go into the dir to perform srt_setup -a                                                                                                                                                             
cd NameOfMyTestRelease/
srt_setup -a
cd -

# My arguments                                                                                                                                                                                        
global=$PROCESS
argument2=${2}
argument3=${1}

# Now run the job, remember to run it from the tarballed test release                                                                                                                                 
echo "Running the job macro" 

loon -q ./NameOfMyTestRelease/LoadLib.C ./NameOfMyTestRelease/macro.C\(\"./somepath\",${global},${argument2},${argument3}\)

echo "chmod 777 on output file before it goes to PNFS" 
chmod 777 outputfile.root

ifdh cp outputfile.root /pnfs/minos/scratch/users/${username}/outputfile.root

echo "End of bash submission script."