Open Science Grid

Running on the OSG

Ken Herner points out that "running a program on the OSG in general is the same as running a program on GPGrid. You just have to remember that certain resources (mainly Bluearc) will not be available, so you need to be getting your code from CVMFS or copying it in with the job. But your base script would in general be the same; you just have to make sure that when you're setting paths and ups products, for example, that you're not pointing to the versions that exist on Bluearc. As for job submission, since I think you're using jobsub, it is the same except you should change your resource-provides option to be --resource-provides=usage_model=OFFSITE."

For detailed information on running on the OSG, including best practices, please see Ken's presentation (SEAQUEST-doc-1466-v1).

OSG Test Suite:

Ken Herner's script osg-example.sh can be used to test the OSG environment as well as job submission to given OSG sites. In the example below the OSG sites Caltech, FNAL (CMS Tier-1, not GPGrid), Michigan, MIT, Nebraska, NotreDame, Omaha, SU-OG, UCSD, and Wisconsin are selected:

 #!/bin/bash

 jobsub_submit -N 5 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE \
   --disk=1GB file://osg-example.sh
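
Note that the command above does not itself restrict the sites; one way to express the site selection is the --site option of jobsub_submit. The following is only a hedged sketch: the option takes a comma-separated site list, and the exact site spellings accepted by jobsub may differ from the names listed above.

 jobsub_submit -N 5 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE \
   --site=Caltech,Michigan,MIT,Nebraska,NotreDame,Omaha,SU-OG,UCSD,Wisconsin \
   --disk=1GB file://osg-example.sh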

Software Repository for the OSG:

The SeaQuest software distribution for the OSG is stored in our CernVM-FS (CVMFS) repository. Bryan (Dannowitz), Brian, Kun, and Markus can access the repository via:
ssh -l cvmfsseaquest oasiscfs02.fnal.gov

The seaquest.opensciencegrid.org repository is synced with the seaquest-distribution repository:
  • First update the local repository in /grid/fermiapp/seaquest/software/ on seaquestgpvm01 or seaquestgpvm02,
  • then execute the ./sbin/sync-cvmfs-seaquest script on oasiscfs02.fnal.gov.

The script sync-cvmfs-seaquest is part of cron-daily and thus runs once per day.
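
If you do not want to wait for the daily cron job, the manual sync follows the two steps above. A minimal sketch, assuming the account and script location quoted above, after your new files have been installed under /grid/fermiapp/seaquest/software/ on seaquestgpvm01 or seaquestgpvm02:

 seaquestgpvm01> ssh -l cvmfsseaquest oasiscfs02.fnal.gov
 oasiscfs02> ./sbin/sync-cvmfs-seaquest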

Running Tracking on the OSG:

The SeaQuest software distribution in /grid/fermiapp/seaquest/software can be set to run on the OSG with a single flag. If you run any of the scripts with the '--osg' flag, the job submission script will replace all references to '/grid/fermiapp/seaquest' with '/cvmfs/seaquest.opensciencegrid.org/seaquest' and set the 'usage_model' to use 'OFFSITE' nodes only.
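
To make the path rewrite concrete, this is the kind of substitution the '--osg' flag triggers (an illustrative sketch, not the actual submission-script code):

 echo "/grid/fermiapp/seaquest/software/current/setup.sh" \
   | sed 's|/grid/fermiapp/seaquest|/cvmfs/seaquest.opensciencegrid.org/seaquest|'
 # -> /cvmfs/seaquest.opensciencegrid.org/seaquest/software/current/setup.sh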

In general, with the tracking job submission scripts, the two steps to run tracking on the OSG are:

  1. Source the appropriate setup script to set all of your paths:
     source /grid/fermiapp/seaquest/software/current/setup.sh 
  2. Add --osg to the tracking script's arguments. For example, to test running runKTracker.py:
    runKTracker.py --osg --track --run=9700 --first-event=0 --n-events=100 --inv=R005 --outdir=/pnfs/e906/scratch/users/USER --indir=/pnfs/e906/scratch --opts=/e906/app/users/liuk/seaquest/seaquest/ktracker/opts/57_2.opts

Running a personal program on the OSG:

To compile your program for the grid:

On the OSG worker nodes, the SeaQuest software is available in '/cvmfs/seaquest.opensciencegrid.org/seaquest', which gives you the latest version of the SeaQuest software.

Your program therefore needs to be compiled in the same environment. As an example, my test program is AnaBackground.cc, which needs AnaBackground.h; it analyzes a ROOT file, run_015782_R007_tgtDump_1_mix1.root.

  1. First of all, save your program to seaquestgpvm01 (or seaquestgpvm02) under your own directory, in my case /e906/app/software/osg/users/chenyc. (If you don't have one, check with the SeaQuest computing manager, currently Andrew.)
     scp -p AnaBackground.cc chenyc@seaquestgpvm01:/e906/app/software/osg/users/chenyc/
     scp -p AnaBackground.h chenyc@seaquestgpvm01:/e906/app/software/osg/users/chenyc/
    
  2. Log in to either seaquestgpvm01 or seaquestgpvm02, set up the software environment, and compile the program.
     ssh -l chenyc seaquestgpvm01
     seaquestgpvm01> cd /e906/app/software/osg/users/chenyc
     seaquestgpvm01> source /grid/fermiapp/seaquest/software/current/setup.sh
     seaquestgpvm01> g++ -o AnaBackground AnaBackground.cc `root-config --cflags --glibs`
    
  3. The contents of the /e906/app/software/osg/users/ area are synced to /cvmfs once per day. Alternatively, you can push the sync manually by following the instructions in the "Software Repository for the OSG" section above.
    Then run a script pointing to the right directory to copy the data and the program to run on the worker node.

To submit a job to the grid:

  1. Alternatively, you can copy the program to the /pnfs/e906/scratch/users/ area, which is accessible on the worker nodes and is also the default place to save your output files, such as log files and the resulting ROOT files.
    However, this is a scratch area, so files here are removed when they are too old (? months) and have not been accessed.
    That is why you should keep and compile your program somewhere else, and save any results worth keeping elsewhere as well.
    Also, before submitting your job, check that your files, such as the data files, still exist; there is some chance that they are no longer there.
    Make this copying and checking part of your job preparation.
  2. For this alternative way to run on the OSG, taking my files as an example:
     
     seaquestgpvm01> cp AnaBackground /pnfs/e906/scratch/users/chenyc/
     seaquestgpvm01> cp AnaBackground.cc /pnfs/e906/scratch/users/chenyc/
     seaquestgpvm01> cp AnaBackground.h /pnfs/e906/scratch/users/chenyc/
    
  3. In this exercise only the AnaBackground executable and the data file are needed on the worker node. However, the C++ source is copied over as well so that it can be printed in the log file as a record of which program was used.
    Note that when the file AnaBackground is copied over, its file permissions are altered.
    Remember to make it executable again. An example is shown in the script osg-example_Ana.sh.
  4. In this example, the data file is saved in the /pnfs/e906/scratch/chenyc/roots/ area before the job submission.
  5. The script to be submitted is osg-example_Ana.sh. It sets up the environment, copies the program and data over, runs the program, and finally copies the resulting files back to the /pnfs area (a hedged sketch of what such a script might contain is shown after this list).
  6. Note that on the grid you have many worker nodes at your disposal. There is an environment variable, ${PROCESS}, that you can use to label your output files if you have many of them. You can also use ${CLUSTER} if you are submitting multiple jobs at the same time. Each worker node executes osg-example_Ana.sh individually, but each is assigned a different ${PROCESS} number; the ${CLUSTER} number is the same for a given job submission.
  7. To submit the job for this example, from your local directory, where you saved the script osg-example_Ana.sh:
     seaquestgpvm01> source /grid/fermiapp/seaquest/software/current/setup.sh (Do this if not done yet; it is OK if this is done multiple times.)
     seaquestgpvm01> jobsub_submit -N 1 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE --disk=1GB file://osg-example_Ana.sh
    

    where the number following "-N" is the number of processes the job requires.
  8. An example of submitting a job with 2 processes:
     seaquestgpvm01> jobsub_submit -N 2 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE --disk=1GB file://osg-example_Ana.sh
     seaquestgpvm01> jobsub_q --user chenyc
     JOBSUBJOBID                           OWNER           SUBMITTED     RUN_TIME   ST PRI SIZE CMD
     4140569.0@jobsub02.fnal.gov           chenyc          02/16 14:56   0+00:00:00 I   0   0.0 osg-example_Ana.sh_20180216_145630_678936_0_1_wrap.sh 
     4140569.1@jobsub02.fnal.gov           chenyc          02/16 14:56   0+00:00:00 I   0   0.0 osg-example_Ana.sh_20180216_145630_678936_0_1_wrap.sh 
    

    In this example, "4140569" is the ${CLUSTER} number. "0" or "1" are the ${PROCESS} numbers for each process. One needs to design a way to specify the input file corresponding to each process so that they don't do the same file. This is up to each user.

An example of submitting a ROOT macro to the OSG:

It is possible to run a ROOT macro on the OSG worker nodes. However, it does not make sense to compile the same code hundreds or thousands of times on these worker nodes, so even for a ROOT macro it is better to compile it before submitting, using the same version of ROOT that is available on the worker nodes.

In this example the code does the same thing as in the previous example; it is AnaMain.C, which includes AnaBackground.C and AnaBackground.h. After compilation, AnaBackground_C.so is generated, which needs to be downloaded to the worker nodes together with AnaMain.C. In the example script I copied AnaBackground.C and AnaBackground.h over as well. These are not really needed unless one wants to try to compile on the worker node, which would then require additional setup to obtain g++.

I compile the code in the /e906/app/software/osg/users/chenyc/ area.

 seaquestgpvm01> cd /e906/app/software/osg/users/chenyc/
 seaquestgpvm01> source /grid/fermiapp/seaquest/software/current/setup.sh
 seaquestgpvm01> source ${SEAQUEST_INSTALL_ROOT}/externals/root/root-5.34.28/bin/thisroot.sh
 seaquestgpvm01> root -l -q -b -n AnaCompile.C

The last command compiles and executes the ROOT macro AnaCompile.C.

Before submitting to the OSG, one should comment out a line in AnaMain.C (this can be done in the local directory):

// gROOT->LoadMacro("AnaBackground.C+");

so that the macro is not compiled again on the worker node.

The next step is to copy the needed files to the /pnfs/e906/scratch/users/chenyc/ area.

 seaquestgpvm01> cp AnaMain.C /pnfs/e906/scratch/users/chenyc/
 seaquestgpvm01> cp AnaBackground_C.so /pnfs/e906/scratch/users/chenyc/
 seaquestgpvm01> cp AnaBackground.C /pnfs/e906/scratch/users/chenyc/
 seaquestgpvm01> cp AnaBackground.h /pnfs/e906/scratch/users/chenyc/

I submit jobs from the /seaquest/users/chenyc/grid_sub/ directory, where I have osg-example_root.sh.

 seaquestgpvm01> cd /seaquest/users/chenyc/grid_sub/
 seaquestgpvm01> jobsub_submit -N 82 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE --disk=1GB file://osg-example_root.sh

Note that this job submission asks for 82 processes. Each process takes one input ROOT file.

The resulting ROOT files are transferred back to /pnfs/e906/scratch/users/chenyc/roots_out/ .
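
As with the executable example above, here is a minimal sketch of what osg-example_root.sh might do. It is not the actual script: the ifdh copy commands, the assumption that AnaMain.C loads AnaBackground_C.so itself (e.g. via gSystem->Load), and the output file naming are illustrative guesses.

 #!/bin/bash
 # Minimal sketch of an osg-example_root.sh-style worker script (assumptions noted above).

 # CVMFS environment plus the same ROOT version used for compilation
 source /cvmfs/seaquest.opensciencegrid.org/seaquest/software/current/setup.sh
 source ${SEAQUEST_INSTALL_ROOT}/externals/root/root-5.34.28/bin/thisroot.sh

 # Fetch the macro and the precompiled library (ifdh assumed; plain cp may also work)
 ifdh cp /pnfs/e906/scratch/users/chenyc/AnaMain.C ./AnaMain.C
 ifdh cp /pnfs/e906/scratch/users/chenyc/AnaBackground_C.so ./AnaBackground_C.so

 # Run the macro; the gROOT->LoadMacro("AnaBackground.C+") line is commented out (see above),
 # so AnaMain.C is assumed to load AnaBackground_C.so directly
 root -l -b -q -n AnaMain.C

 # Return the result, tagged by process number so the 82 outputs do not collide
 ifdh cp output.root /pnfs/e906/scratch/users/chenyc/roots_out/output_${PROCESS}.root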