Information about job submission to OSG sites » History » Version 49

Kenneth Herner, 01/13/2016 01:07 PM


Information about job submission to OSG sites

This page captures some of the known quirks about certain sites when submitting jobs there.

What this page is

Most OSG sites will work with the jobsub defaults of 2000 MB of RAM and 35 GB of disk, but some sites impose stricter limits. Additionally, some sites support only certain experiments rather than the entire Fermilab VO. Here we list the OSG sites where users can submit jobs, along with all known cases where either the standard jobsub defaults may not work or the site supports only certain experiments. The information is provided on a best-effort basis and is subject to change without notice.
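As an illustration, the sketch below assembles a submission command that overrides the default memory and disk requests to fit a stricter site. The experiment name and job script are placeholders; the flag spellings follow common jobsub_submit usage, so check `jobsub_submit --help` for your installation. The sketch only prints the command so you can inspect it before actually submitting.

```shell
# Sketch: build (but do not run) a jobsub_submit command tailored to a
# site with stricter-than-default limits. All values here are examples.
EXPERIMENT=nova            # placeholder experiment
SITE=FZU                   # per the table below, FZU wants --disk=20000MB or less
CMD="jobsub_submit -G ${EXPERIMENT} \
  --resource-provides=usage_model=OFFSITE \
  --site=${SITE} \
  --memory=2000MB --disk=20000MB \
  file://my_job.sh"        # placeholder job script
echo "${CMD}"              # print for inspection; submit it yourself when ready
```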

What this page is NOT

This page is NOT a status board or health monitor for the OSG sites. A submission that fits the guidelines here is not guaranteed to start quickly, and this page does not track downtimes at the remote sites. Its sole purpose is to help you avoid submitting jobs with disk/memory/CPU/site combinations that will never work. Limited offsite monitoring is available at https://fifemon.fnal.gov:3000/dashboard/db/offsite-monitoring

Organization

The following table lists the available OSG sites, their Glidein_site name (what you should put in the --site option), the experiment(s) the site supports, and finally any known limitations on disk, memory, or CPU.

NOTE: In many cases you may be able to request more than the jobsub defaults and be fine; we have not done detailed testing at each site to determine the real limits. If you try a site with requirements that exceed the jobsub defaults and the job does not start, running

jobsub_q --better-analyze --jobid=<your job id>

will often give useful information about why (e.g. it may recommend lowering the disk or memory request to a certain value).

NOTE 2: Under supported experiments, "All" means all experiments except for CDF, D0, and LSST. It does include DES and DUNE.

NOTE 3: The estimated maximum lifetime is just an estimate based on periodic sampling of glideins and may change from time to time.

| Site Name | Name for --site option | Supported Experiments | Known limitations | Estimated maximum job lifetime |
|---|---|---|---|---|
| University of Bern | UNIBE-LHEP | uboone only | 2000 MB memory or less; use >1 CPU for more memory | 48 h |
| Brookhaven National Laboratory | BNL | All | jobsub defaults are OK | 24 h |
| Caltech T2 | Caltech | All | jobsub defaults are OK; can go up to --memory=3000 | 25 h |
| Cornell | Cornell | All | jobsub defaults are OK | unknown |
| Fermigrid | FERMIGRID | disabled | not working right now; waiting for transition to new GPGrid | unknown |
| Fermi private cloud | (use --resource-provides usage_model=FERMICLOUD_PRIV,FERMICLOUD_PP_PRIV) | All | memory up to 7500 | 4 days |
| FNAL CMS Tier 1 | FNAL | All | jobsub defaults are OK | 48 h |
| Czech Academy of Sciences | FZU | NOvA only | request --disk=20000MB or less | unknown |
| Harvard | Harvard | NOvA only | jobsub defaults are OK; SL5 only | unknown |
| University of Washington | Hyak_CE | All | available resources vary widely; --memory=1900MB or less is better | 3.5 h |
| ATLAS Great Lakes Tier 2 (AGLT2) | Michigan | All | jobsub defaults are OK | unknown |
| MIT | MIT | All + CDF | jobsub defaults are OK | unknown |
| Midwest Tier2 | MWT2 | All | jobsub defaults are OK; single-core jobs requesting more than 1920 MB of memory will take a very long time to run | 5 h |
| Red | Nebraska | All | jobsub defaults are OK | 48 h |
| Notre Dame | NotreDame | All | jobsub defaults are OK; can go up to --memory=2500; aim for short jobs due to preemption | 24 h |
| Tusker/Crane | Omaha | All | jobsub defaults are OK; can go up to --memory=3000 | 24 h |
| Ohio Supercomputing Center | OSC | NOvA only | jobsub defaults are OK | 48 h |
| Southern Methodist University | SMU | NOvA only | jobsub defaults are OK; SL5 only | unknown |
| Southern Methodist | SMU_HPC | NOvA only | single-core jobs should request --memory=2500 or less; 2015/11/24: gatekeeper offline (INC000000628522) | 24 h |
| Syracuse | SU-OG | All | request --disk=9000MB or less and --memory=2500MB or less; 2015/10/15: libXpm.so not installed on all nodes (may be required by ROOT) | 48 h |
| Texas Tech | TTU | All but mu2epro and seaquest | jobsub defaults are OK; 2015/11/20: down since OSG software upgrade | unknown |
| University of Chicago | UChicago | All | linked with MWT2; recommend --memory=1920 or less for single-core jobs | 5 h |
| University of California, San Diego | UCSD | All | jobsub defaults are OK; can go up to --memory=4000 | 13 h |
| Grid Lab of Wisconsin (GLOW) | Wisconsin | All | jobsub defaults are OK | 24 h |
| Western Tier2 (SLAC) | WT2 | uboone only | jobsub defaults are OK; can go up to --memory=2500MB | 10 days |
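Some sites (e.g. University of Bern above) tie the memory ceiling to the number of cores, so requesting more than one CPU raises the total memory you can use. The sketch below computes the request; the --cpu flag spelling is an assumption based on common jobsub_submit usage, so verify it with `jobsub_submit --help`. As above, the command is only printed, not run.

```shell
# Sketch: request extra CPUs to raise the per-job memory ceiling at a
# site that caps memory per core. Values are examples, not tested limits.
SITE=UNIBE-LHEP
CPUS=2
MEM_PER_CPU=2000                       # assumed per-core cap, in MB
MEMORY=$((CPUS * MEM_PER_CPU))         # 2 cores x 2000 MB = 4000 MB total
CMD="jobsub_submit -G uboone --resource-provides=usage_model=OFFSITE \
  --site=${SITE} --cpu=${CPUS} --memory=${MEMORY}MB file://my_job.sh"
echo "${CMD}"                          # inspect before submitting for real
```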