Offsite probe jobs for NOvA run by OPOS (work in progress)

Motivation

It is useful to know which sites accept jobs with specific resource requirements.
The requirements worth monitoring are memory, disk, and expected lifetime.
The disk requirement is usually not an issue.

As a first step, I am running probe jobs to monitor the upper limit of memory allowed by the offsite sites that accept jobs from NOvA.
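One way to read the probe results, as a minimal sketch: assume each probe job requests a fixed amount of memory and either runs or does not, so that a site's upper limit is the largest request that ran. The record layout and the function name `max_memory_by_site` are hypothetical and are not taken from the OPOS probe code.

```python
def max_memory_by_site(probe_results):
    """Return {site: largest memory request (MB) that succeeded}.

    probe_results is an iterable of (site, requested_mb, succeeded)
    tuples; this layout is an assumption made for illustration.
    Sites with no successful probe map to None (NA in the tables).
    """
    best = {}
    for site, requested_mb, succeeded in probe_results:
        # Register the site even if none of its probes succeeded.
        best.setdefault(site, None)
        if succeeded and (best[site] is None or requested_mb > best[site]):
            best[site] = requested_mb
    return best


# Made-up records for two placeholder sites, not measured data.
example = [
    ("SiteA", 1500, True),
    ("SiteA", 2500, True),
    ("SiteA", 3000, False),
    ("SiteB", 1500, False),
]
limits = max_memory_by_site(example)
```

Here `limits` maps `SiteA` to 2500 and `SiteB` to `None`.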

The sites involved in this activity are:

BNL, Caltech, Clemson, Cornell, FZU, Harvard, Hyak_CE, MIT, MWT2, Michigan, Nebraska, NotreDame, OSC, Omaha, SMU_HPC, SU-OG, TTU, UCSD, UChicago, Wisconsin.

Maximum memory (in MB) allowed by site, from 2016-03-25 to 2016-03-31. NA marks a day on which no probe job succeeded at the site.

Site       2016-03-25  2016-03-26  2016-03-27  2016-03-28  2016-03-29  2016-03-30  2016-03-31
BNL        3000        3000        3000        3000        3000        3000        3000
Caltech    NA          NA          NA          NA          3500        3500        NA
Clemson    1500        1500        1500        NA          1500        1500        1500
Cornell    NA          2500        2500        NA          2500        2500        NA
FZU        2500        2500        2500        2500        2500        2500        NA
Harvard    NA          NA          NA          NA          NA          NA          NA
Hyak_CE    1500        1500        1500        1500        1500        1500        1500
MIT        NA          NA          NA          NA          NA          NA          NA
MWT2       NA          NA          NA          NA          NA          NA          NA
Michigan   2500        2500        2500        2500        2500        2500        2500
Nebraska   NA          1500        1500        1500        1500        1500        NA
NotreDame  2500        2500        2500        2500        2500        2500        2500
OSC        2500        2500        2500        NA          2500        2500        2500
Omaha      NA          1500        1500        1500        1500        NA          NA
SMU        NA          NA          NA          NA          NA          NA          NA
SMU_HPC    NA          NA          NA          NA          NA          NA          NA
SU-OG      2500        2500        2500        2500        2500        2500        2500
TTU        NA          NA          NA          NA          NA          NA          NA
UCSD       NA          1500        1500        1500        1500        1500        1500
UChicago   NA          NA          NA          NA          NA          NA          NA
Wisconsin  NA          5000        5000        5000        NA          NA          NA

Site performance

The timing information comes from SAM. Information from the Condor log should be used instead for better accuracy.

Site       Max mem (MB)  Waiting time (min)      Processing time (min)
                         min     avg     max     min     avg     max
BNL        3000          1.0     92.0    288.0   61.0    137.0   505.0
Caltech    3500          2.0     274.0   545.0   116.0   175.0   285.0
Clemson    1500          6.0     18.0    38.0    71.0    93.0    154.0
Cornell    2500          396.0   1186.0  1820.0  4.0     390.0   1441.0
FZU        2500          1.0     433.0   1825.0  44.0    98.0    152.0
Hyak_CE    1500          67.0    218.0   384.0   55.0    83.0    118.0
Michigan   2500          1.0     101.0   483.0   54.0    76.0    125.0
Nebraska   1500          78.0    447.0   1170.0  31.0    67.0    110.0
NotreDame  2500          1.0     30.0    93.0    46.0    90.0    163.0
OSC        2500          1.0     37.0    217.0   47.0    61.0    105.0
Omaha      1500          176.0   1264.0  2705.0  47.0    142.0   356.0
SU-OG      2500          1.0     174.0   598.0   14.0    126.0   541.0
UCSD       1500          2.0     744.0   2685.0  47.0    136.0   454.0
Wisconsin  5000          1.0     150.0   365.0   106.0   233.0   476.0

(*) is the total number of successful jobs run on the site within the specified time range.
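The min/avg/max columns can be derived from per-job timestamps. The sketch below assumes that waiting time is the interval from submission to start and that processing time is the interval from start to end; these definitions, the field names, and the sample timestamps are illustrative assumptions, not the SAM schema or measured values.

```python
from datetime import datetime
from statistics import mean


def minutes_between(earlier, later):
    """Elapsed time between two datetimes, in minutes."""
    return (later - earlier).total_seconds() / 60.0


def summarize_times(jobs):
    """Return (min, avg, max) waiting and processing times in minutes.

    Each job is a dict with hypothetical keys 'submitted', 'started'
    and 'finished' holding datetime objects.
    """
    waits = [minutes_between(j["submitted"], j["started"]) for j in jobs]
    runs = [minutes_between(j["started"], j["finished"]) for j in jobs]
    return {
        "waiting": (min(waits), mean(waits), max(waits)),
        "processing": (min(runs), mean(runs), max(runs)),
    }


# Two made-up jobs, for illustration only.
jobs = [
    {"submitted": datetime(2016, 3, 25, 0, 0),
     "started": datetime(2016, 3, 25, 0, 10),
     "finished": datetime(2016, 3, 25, 1, 10)},
    {"submitted": datetime(2016, 3, 25, 0, 0),
     "started": datetime(2016, 3, 25, 0, 30),
     "finished": datetime(2016, 3, 25, 2, 30)},
]
stats = summarize_times(jobs)
```

The same function would work unchanged on timestamps taken from the Condor log, as long as they are parsed into `datetime` objects first.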

Site       Max mem (MB)  Days jobs succeeded (**)
BNL        3000          7
Caltech    3500          2
Clemson    1500          6
Cornell    2500          4
FZU        2500          6
Hyak_CE    1500          7
Michigan   2500          7
Nebraska   1500          5
NotreDame  2500          7
OSC        2500          6
Omaha      1500          4
SU-OG      2500          7
UCSD       1500          6
Wisconsin  5000          3

(**) is the number of days on which jobs requiring the indicated maximum memory ran successfully on the site.
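The day counts agree with the daily table if each NA is read as a day without a successful job. A short sketch of that derivation, using four rows copied from the daily table (the variable and function names are illustrative):

```python
# Daily maximum memory (MB) for 2016-03-25 .. 2016-03-31; None stands for NA.
daily_max_mb = {
    "BNL": [3000, 3000, 3000, 3000, 3000, 3000, 3000],
    "Caltech": [None, None, None, None, 3500, 3500, None],
    "Cornell": [None, 2500, 2500, None, 2500, 2500, None],
    "Wisconsin": [None, 5000, 5000, 5000, None, None, None],
}


def summarize_site(values):
    """Return (max memory over the week, number of days with a value)."""
    succeeded = [v for v in values if v is not None]
    return (max(succeeded) if succeeded else None, len(succeeded))


summary = {site: summarize_site(v) for site, v in daily_max_mb.items()}
for site, (max_mb, days) in summary.items():
    print(site, max_mb, days)
```

For these rows the sketch gives 7 days for BNL, 2 for Caltech, 4 for Cornell, and 3 for Wisconsin, matching the table.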