Open Science Grid » History » Version 35

Yen-Chu Chen, 02/16/2018 03:32 PM


Open Science Grid

Running on the OSG

Ken Herner points out that "running a program on the OSG in general is the same as running a program on GPGrid. You just have to remember that certain resources (mainly Bluearc) will not be available, so you need to be getting your code from CVMFS or copying it in with the job. But your base script would in general be the same; you just have to make sure that when you're setting paths and ups products, for example, that you're not pointing to the versions that exist on Bluearc. As for job submission, since I think you're using jobsub, it is the same except you should change your resource-provides option to be --resource-provides=usage_model=OFFSITE."

For detailed information for running on the OSG, including best practices, please see Ken's presentation (SEAQUEST-doc-1466-v1).
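
The quoted advice amounts to a single change in the submission command. A small bash sketch (the onsite usage_model value shown is a typical one, assumed here for illustration, not taken from this page):

```shell
#!/bin/bash
# Sketch: an OSG submission differs from an onsite GPGrid submission
# only in the usage_model. The onsite value DEDICATED,OPPORTUNISTIC
# is a typical one, assumed here for illustration.
onsite="jobsub_submit -G seaquest --resource-provides=usage_model=DEDICATED,OPPORTUNISTIC file://job.sh"
offsite="${onsite/DEDICATED,OPPORTUNISTIC/OFFSITE}"

echo "$offsite"
# -> jobsub_submit -G seaquest --resource-provides=usage_model=OFFSITE file://job.sh
```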

OSG Test Suite:

Ken Herner's script osg-example.sh can be used to test the OSG environment as well as job submission to given OSG sites. In the example below, the OSG sites Caltech, FNAL (CMS Tier-1, not GPGrid), Michigan, MIT, Nebraska, NotreDame, Omaha, SU-OG, UCSD, and Wisconsin are selected:

#!/bin/bash

jobsub_submit -N 5 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE \
    --disk=1GB file://osg-example.sh

Software Repository for the OSG:

The SeaQuest software distribution for the OSG is stored in our CernVM-FS (CVMFS) repository. Bryan (Dannowitz), Brian, Kun, and Markus can access the repository via:
ssh -l cvmfsseaquest oasiscfs02.fnal.gov

The seaquest.opensciencegrid.org repository is synced with the seaquest-distribution repository:
  • We first need to update the local repository in /grid/fermiapp/seaquest/software/ on seaquestgpvm0(1|2),
  • and then execute the ./sbin/sync-cvmfs-seaquest script on oasiscfs02.fnal.gov.

The script sync-cvmfs-seaquest is part of cron-daily and thus runs once per day.

Running Tracking on the OSG:

The SeaQuest software distribution in /grid/fermiapp/seaquest/software can be set to run on the OSG with a single flag. If you run any scripts with an '--osg' flag, the job submission script will replace all references to '/grid/fermiapp/seaquest' with '/cvmfs/seaquest.opensciencegrid.org/seaquest' and designate the 'usage_models' to use 'OFFSITE' nodes only.
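
The substitution itself can be sketched in a few lines of bash (the function name here is illustrative, not the actual submission-script code):

```shell
#!/bin/bash
# Illustrative sketch (not the actual submission-script code) of the
# path rewrite done by the --osg flag: Bluearc paths become the
# equivalent CVMFS paths.
osgify_path() {
    echo "$1" | sed 's|/grid/fermiapp/seaquest|/cvmfs/seaquest.opensciencegrid.org/seaquest|g'
}

osgify_path /grid/fermiapp/seaquest/software/current/setup.sh
# -> /cvmfs/seaquest.opensciencegrid.org/seaquest/software/current/setup.sh
```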

In general, with the tracking job submission scripts, running tracking on the OSG takes two steps:

  1. Source the appropriate setup script to set all of your paths:
     source /grid/fermiapp/seaquest/software/current/setup.sh 
  2. Add --osg to the tracking script's arguments. For example, to test out running runKTracker.py:
    runKTracker.py --osg --track --run=9700 --first-event=0 --n-events=100 --inv=R005 --outdir=/pnfs/e906/scratch/users/USER --indir=/pnfs/e906/scratch --opts=/e906/app/users/liuk/seaquest/seaquest/ktracker/opts/57_2.opts

Running a personal program on the OSG:

To compile your program for the grid:

On the OSG worker nodes, the SeaQuest software is available in '/cvmfs/seaquest.opensciencegrid.org/seaquest', which provides the latest version of the SeaQuest software.

So your program needs to be compiled in the same environment. As an example, my test program is AnaBackground.cc, which needs AnaBackground.h; by default it analyzes a ROOT file, run_015782_R007_tgtDump_1_mix1.root.
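
A script that has to run both interactively and on worker nodes can pick the software location at run time. A small sketch (the helper function is illustrative, not part of the SeaQuest scripts):

```shell
#!/bin/bash
# Sketch: choose the SeaQuest software base depending on where the
# script runs. Worker nodes see only /cvmfs; interactive nodes can
# fall back to the Bluearc location. (Illustrative helper, not part
# of the SeaQuest scripts.)
seaquest_base() {
    if [ -d /cvmfs/seaquest.opensciencegrid.org/seaquest ]; then
        echo /cvmfs/seaquest.opensciencegrid.org/seaquest
    else
        echo /grid/fermiapp/seaquest
    fi
}

seaquest_base   # prints whichever location is visible from here
```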

  1. First of all, save your program to seaquestgpvm01 (or seaquestgpvm02) under your directory, in my case /e906/app/software/osg/users/chenyc. (If you don't have one, check with the SeaQuest computing manager.)
     scp -p AnaBackground.cc chenyc@seaquestgpvm01:/e906/app/software/osg/users/chenyc/
     scp -p AnaBackground.h chenyc@seaquestgpvm01:/e906/app/software/osg/users/chenyc/
    
  2. Log in to either seaquestgpvm01 or seaquestgpvm02, set up the software environment, and compile the program.
     ssh -l chenyc seaquestgpvm01
     seaquestgpvm01> cd /e906/app/software/osg/users/chenyc
     seaquestgpvm01> source /grid/fermiapp/seaquest/software/current/setup.sh
     seaquestgpvm01> g++ -o AnaBackground AnaBackground.cc `root-config --cflags --glibs`
    
  3. The contents of /e906/app/software/osg/users/ are synced to /cvmfs once per day. Alternatively, you can push for an earlier sync by following the instructions in the section "Software Repository for the OSG".
    Then run a script pointing to the right directory to copy the program over and run it on the worker node.

To submit a job to the grid:

  1. Alternatively, you could copy the program to the /pnfs/e906/scratch/users/ area, which is accessible from the worker nodes and is also the default place to save your output files, such as log files, resulting ROOT files, etc.
    However, this is a scratch area, so files here are removed once they are old (? months) and no longer accessed.
    That is why you want to save and compile your program somewhere else, and save the results elsewhere if they are worth keeping.
    Also, before submitting your job, check that your files, such as the data, still exist; there is some chance that they are no longer there.
    Make this copy-or-check step part of your job preparation.
  2. As an example of this alternative way of running on the OSG, taking my program as the example:
     
     seaquestgpvm01> cp AnaBackground /pnfs/e906/scratch/users/chenyc/
     seaquestgpvm01> cp AnaBackground.cc /pnfs/e906/scratch/users/chenyc/
     seaquestgpvm01> cp AnaBackground.h /pnfs/e906/scratch/users/chenyc/
    
  3. In this exercise, only the AnaBackground executable is needed on the worker node, as well as the data file. However, the C++ source is copied over too, just so that it can be printed in the log file as a record of which program was used.
    Remember to make the file executable once it is copied over onto the worker node. An example is shown in the script osg-example_Ana.sh.
  4. In this example, the data file is saved in /pnfs/e906/scratch/chenyc/roots/ area before the job submission.
  5. The script to be submitted is osg-example_Ana.sh. It sets up the environment, copies the program and data over, runs the program, and finally copies the resulting files back to the /pnfs area.
  6. Note that on the grid you have many worker nodes at your disposal. There is an environment variable, ${PROCESS}, that you can use to mark your output files if you have many of them. You can also use ${CLUSTER} if you are submitting multiple jobs at the same time. Each worker node executes osg-example_Ana.sh individually, but each is assigned a different ${PROCESS} number; the ${CLUSTER} number is the same for a given job submission.
  7. To submit the job for this example, from your local directory where you saved the script osg-example_Ana.sh:
     seaquestgpvm01> source /grid/fermiapp/seaquest/software/current/setup.sh (Do this if not done yet; it is OK if this is done multiple times.)
     seaquestgpvm01> jobsub_submit -N 1 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE --disk=1GB file://osg-example_Ana.sh
    

    where the number following "-N" is the number of processes the job requires.
  8. An example of submitting a job with 2 processes:
     seaquestgpvm01> jobsub_submit -N 2 -G seaquest -M --OS=SL6 --resource-provides=usage_model=OFFSITE --disk=1GB file://osg-example_Ana.sh
     seaquestgpvm01> jobsub_q --user chenyc
     JOBSUBJOBID                           OWNER           SUBMITTED     RUN_TIME   ST PRI SIZE CMD
     4140569.0@jobsub02.fnal.gov           chenyc          02/16 14:56   0+00:00:00 I   0   0.0 osg-example_Ana.sh_20180216_145630_678936_0_1_wrap.sh 
     4140569.1@jobsub02.fnal.gov           chenyc          02/16 14:56   0+00:00:00 I   0   0.0 osg-example_Ana.sh_20180216_145630_678936_0_1_wrap.sh 
    

    In this example, "4140569" is the ${CLUSTER} number, and "0" and "1" are the ${PROCESS} numbers of the two processes. One needs to design a way to assign the input file corresponding to each process so that they don't process the same file; this is up to each user.
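
The points above can be combined into a sketch of such a worker-node script (illustrative, not the actual osg-example_Ana.sh; the command-line argument to AnaBackground is hypothetical, and on real worker nodes transfers are usually done with ifdh cp rather than plain cp):

```shell
#!/bin/bash
# Sketch of a worker-node script in the spirit of osg-example_Ana.sh
# (illustrative, not the actual script). Each process picks its own
# input by ${PROCESS} and tags its output with ${CLUSTER}_${PROCESS},
# so the processes of one submission never touch the same file.

# Input for this process: line (PROCESS + 1) of a list file.
pick_input() {        # pick_input <listfile> <process>
    sed -n "$(( $2 + 1 ))p" "$1"
}

# Unique output name for this (cluster, process) pair.
output_name() {
    echo "AnaBackground_${CLUSTER:-0}_${PROCESS:-0}.root"
}

run_one() {           # run_one <listfile> <indir> <outdir>
    local input
    input=$(pick_input "$1" "${PROCESS:-0}")
    cp "$2/AnaBackground" .              # copy the program in
    cp "$2/$input" .                     # copy the data in
    chmod +x AnaBackground               # re-mark the program executable
    ./AnaBackground "$input"             # hypothetical command line
    cp result.root "$3/$(output_name)"   # copy the tagged result back
}
```

jobsub's ${PROCESS} starts at 0, which is why line (PROCESS + 1) of the list is used.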