Frequently Asked Questions

User Questions

Help! Something doesn't work like I think it should!

  • Open a service desk ticket at https://fermi.service-now.com/navpage.do.
    • Select 'Service Catalog', then under 'Scientific Computing' select one of:
      • 'I'm having a problem with Scientific Computing', if something that once worked is broken
      • 'Create a New Scientific Computing Request', if you want something new
  • You can comment on tickets and communicate with experts about the problem by replying to the emails from "Fermilab Service Desk" that are associated with your ticket. (Their subject lines look like "Incident INC000000nnnnn -- <problem description>".) You can also track, comment on, and search for tickets by logging in to the Fermilab Service Desk site (http://servicedesk.fnal.gov) with your services account username and password.

How many slots does my experiment have on GPGrid?

Each experiment has a quota of slots on GPGrid that can be examined by going to this page:

https://fifemon.fnal.gov/monitor/dashboard/db/fifebatch-quotas

Because GPGrid does not preempt running jobs, there are times when an experiment or project has jobs waiting in the queue but is not running its full allocation of slots. An experiment in that situation is first in line when a slot becomes available. Opportunistic slots are allocated based on fairshare (http://research.cs.wisc.edu/htcondor/manual/v8.0/3_4User_Priorities.html#28293).

A "slot" is defined here as 1 CPU and 2000 MB of memory. Jobs requesting more than this baseline, in either cores or memory, may therefore count as more than one slot when determining how much quota is in use at a given moment.
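
As a rough illustration of the slot definition above, the following Python sketch estimates how many baseline slots a job might be charged. The function name slots_charged and the rule "charge by whichever resource exceeds the baseline the most, rounded up" are assumptions made for illustration; the actual GPGrid accounting may differ.

```python
import math

# Baseline slot as defined in the text: 1 CPU and 2000 MB of memory.
SLOT_CPUS = 1
SLOT_MEMORY_MB = 2000


def slots_charged(cpus, memory_mb):
    """Estimate how many baseline slots a job occupies.

    Assumption (not confirmed by the source): the charge follows
    whichever resource exceeds the baseline the most, rounded up.
    """
    by_cpu = math.ceil(cpus / SLOT_CPUS)
    by_memory = math.ceil(memory_mb / SLOT_MEMORY_MB)
    return max(1, by_cpu, by_memory)


if __name__ == "__main__":
    # (cpus, memory_mb) for a few hypothetical job requests
    for request in [(1, 2000), (1, 4000), (4, 2000)]:
        print(request, "->", slots_charged(*request), "slot(s)")
```

Under this assumption, a 1-CPU job requesting 4000 MB would count as two slots against the quota, the same as a 2-CPU job requesting 2000 MB.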

Where is ups/upd for my experiment?

  • To use a basic ups setup:
    • source /cvmfs/fermilab.opensciencegrid.org/products/common/etc/setup
      (this works on fermigrid nodes as well as locally). Your experiment's repository may also have a similar setup script.
    • setup (whatever software you want to use)
      For example: setup jobsub_tools; setup ifdhc; setup root. The products are then in your PATH, with their libraries in LD_LIBRARY_PATH.

I am a DUNE VO user. How do I register for the VO?

  • If you intend to run DUNE jobs with jobsub, you first need to make sure you are a member of the DUNE Virtual Organization (VO). To become a member, fill out the Experiment/Affiliation Computing Account Request Form on the Fermilab Service Desk and choose DUNE as the experiment. The Service Desk form is the recommended route.

I need a version of a package that is not in the ups setup. How can I get this installed?

  • Open a problem ticket by sending mail to [minerva|nova]-. Specify the package and version you need, and the experiment you need it for.

How can I log into a Fermilab kerberized machine from my non-kerberized laptop?

My cronjob fails with the message "permission denied". What's going on?

  • To access /afs you need a token. To get one from a cron job, run your script through kcron, like this:
     35 11 * * *  /usr/krb5/bin/kcron your.script
    
  • You must run "kcroninit" (/usr/krb5/bin/kcroninit) once on each node where you will use kcron.
  • You can test interactively that kcron is working.

How many jobs can I submit to the Fifebatch system?

We ask that users adhere to the following guidelines when submitting jobs:

1. Submit jobs at a rate not exceeding 1K jobs/minute (so wait at least 10 minutes after submitting a cluster of 10K jobs).
2. Submit no more jobs at a time than could reasonably run within a few days to a week. For an experiment with a 1000-slot quota, this would be on the order of 100K 2-hour jobs (1000 slots * 7 days * 24 hours/day / 2 hours per job = 84K). For longer jobs or higher-memory jobs, the numbers go down accordingly. Consult your experiment's offline computing coordinator to find out what your experiment's quota is and how much of it you can expect to get.
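
The arithmetic behind both guidelines can be sketched in a few lines of Python. The function names and the slots_per_job parameter are hypothetical, introduced here only to make the calculation explicit; the numbers come from the guidelines above, and the treatment of larger jobs is an assumption.

```python
import math

# Guideline 1: at most 1K job submissions per minute.
MAX_SUBMIT_RATE_PER_MIN = 1000


def minutes_to_wait(jobs_in_cluster):
    """Minutes to wait after submitting a cluster before submitting more."""
    return math.ceil(jobs_in_cluster / MAX_SUBMIT_RATE_PER_MIN)


def max_queued_jobs(quota_slots, job_hours, days=7, slots_per_job=1):
    """Upper bound on jobs that could run within `days` on the full quota.

    Assumption: every job holds `slots_per_job` baseline slots for
    `job_hours` hours, and the experiment gets its whole quota.
    """
    slot_hours_available = quota_slots * days * 24
    slot_hours_per_job = job_hours * slots_per_job
    return int(slot_hours_available / slot_hours_per_job)


if __name__ == "__main__":
    print(minutes_to_wait(10_000))        # 10
    print(max_queued_jobs(1000, 2))       # 84000
    print(max_queued_jobs(1000, 2, slots_per_job=2))  # 42000
```

The last call shows how the bound shrinks for higher-memory jobs: if each job counted as two slots, the same quota would cover about half as many jobs in a week.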