Feature #9957

Improved Offsite Running

Added by Alexander Himmel about 5 years ago. Updated about 5 years ago.

Status: Assigned
Priority: Normal
Start date: 08/28/2015
Due date:
% Done: 0%
Estimated time:
Duration:

Description

Try our NOvA production jobs on various offsite clusters and see what works and what doesn't.

Success.png (44.6 KB) Enrique Arrieta Diaz, 09/22/2015 05:06 PM
Full_job_time.png (48.7 KB) Enrique Arrieta Diaz, 09/22/2015 05:06 PM
Usage.png (38.1 KB) Enrique Arrieta Diaz, 09/22/2015 05:07 PM
Performance.png (42.9 KB) Enrique Arrieta Diaz, 09/22/2015 05:07 PM

History

#1 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Performance.png added

Offsite locations performance

The performance of each offsite site is summarized by a performance score that ranges continuously from 1 to 16, where 1 is the best possible performance. This score takes into account:

  • Success Rate, S: number of completed jobs / number of submitted jobs.
  • job_time, J: the time elapsed between the start of the first job and the end of the last job.
  • idle_time, I: the time elapsed between submission of the jobs and the start of the first job.
  • Average Time Per File, A.

Score = (S+J+I+A)/4.

Each quantity enters the score as a rank among the sites. The highest success rate gets a 1 and the lowest gets a 16. The lowest job time, idle time, and average time per file each get a 1, and the highest get a 16. If two or more sites tie in a position, they are assigned the same number.

Sites with the lowest performance scores are recommended.
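
The scoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script used to make the plots: the site names and numbers are made up, the `SiteStats` and `performance_scores` names are hypothetical, and ties are handled with competition ranking (a site's rank is 1 plus the number of sites that did strictly better), which is one possible reading of "assigned the same number".

```python
from dataclasses import dataclass


@dataclass
class SiteStats:
    """Per-site inputs to the score (hypothetical container, times in seconds)."""
    name: str
    submitted: int
    completed: int
    job_time: float           # start of first job to end of last job
    idle_time: float          # submission to start of first job
    avg_time_per_file: float


def rank(values, higher_is_better=False):
    """Competition ranking: 1 + number of strictly better values.

    Tied values share the same rank. This tie rule is an assumption.
    """
    ranks = []
    for v in values:
        if higher_is_better:
            better = sum(1 for other in values if other > v)
        else:
            better = sum(1 for other in values if other < v)
        ranks.append(1 + better)
    return ranks


def performance_scores(sites):
    """Return {site name: (S + J + I + A) / 4}, where each term is a rank."""
    # Success rate: completed / submitted; a site with no submissions gets 0.
    rates = [s.completed / s.submitted if s.submitted else 0.0 for s in sites]
    s_rank = rank(rates, higher_is_better=True)
    j_rank = rank([s.job_time for s in sites])
    i_rank = rank([s.idle_time for s in sites])
    a_rank = rank([s.avg_time_per_file for s in sites])
    return {
        site.name: (s_r + j_r + i_r + a_r) / 4
        for site, s_r, j_r, i_r, a_r in zip(sites, s_rank, j_rank, i_rank, a_rank)
    }


# Made-up example data for three hypothetical sites.
example_sites = [
    SiteStats("SiteA", 100, 95, 3600.0, 300.0, 130.0),
    SiteStats("SiteB", 100, 80, 5400.0, 300.0, 140.0),
    SiteStats("SiteC", 100, 95, 7200.0, 900.0, 120.0),
]

scores = performance_scores(example_sites)
# Lowest score first, since the lowest scores are the recommended sites.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: {score:.2f}")
```

With three sites the ranks run from 1 to 3; with 16 sites the same code would produce scores between 1 and 16.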

The site named Offsite represents the jobs sent offsite using the --offsite_only option.

Implementing the offsite locations performance measure is a work in progress.

#2 Updated by Enrique Arrieta Diaz about 5 years ago

  • File success.png added

Success Rate

#3 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Full.png added

Full Job Time

#4 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Idle.png added

Idle Time

#5 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Usage.png added

Offsite Locations Share

The figure presents the percentage of jobs that are sent to the various locations when the --offsite_only option is used.

#6 Updated by Enrique Arrieta Diaz about 5 years ago

  • File success.png added

Enrique Arrieta Diaz wrote:

Success Rate

#7 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Full.png added

Enrique Arrieta Diaz wrote:

Full Job Time

#8 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Idle.png added

Enrique Arrieta Diaz wrote:

Idle Time

#9 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Idle.png)

#10 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Idle.png added

Enrique Arrieta Diaz wrote:

Idle Time

#11 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Performance.png added

#12 Updated by Enrique Arrieta Diaz about 5 years ago

  • File Usage.png added

Enrique Arrieta Diaz wrote:

Offsite Locations Share

The figure presents the percentage of jobs that are sent to the various locations when the --offsite_only option is used.

#13 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Performance.png)

#14 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (success.png)

#15 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Full.png)

#16 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Idle.png)

#17 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Usage.png)

#18 Updated by Enrique Arrieta Diaz about 5 years ago

The first test included in the plots used mccheckoutjob.fcl and 2 GB of requested memory. The average time per file was 2 minutes and 12 seconds.

The second and third tests included in the plots used mccheckoutjob.fcl and 2.4 GB of requested memory. The average time per file was 2 minutes and 13 seconds.

The fourth test included in the plots used prod_reco_pidpart_numi_job.fcl and 2 GB of requested memory. The average time per file was 126 minutes.

#19 Updated by Alexander Himmel about 5 years ago

  • Status changed from New to Assigned

#20 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (success.png)

#21 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Full.png)

#22 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Idle.png)

#23 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Performance.png)

#24 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (Usage.png)

#25 Updated by Enrique Arrieta Diaz about 5 years ago

  • File success.png added

Success Rate Plot

#26 Updated by Enrique Arrieta Diaz about 5 years ago

  • File job_times.png added

Job times plot

#27 Updated by Enrique Arrieta Diaz about 5 years ago

  • File share.png added

Share plot

#28 Updated by Enrique Arrieta Diaz about 5 years ago

  • File performance.png added

Performance plot

#29 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (success.png)

#30 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (job_times.png)

#31 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (share.png)

#32 Updated by Enrique Arrieta Diaz about 5 years ago

  • File deleted (performance.png)

#33 Updated by Enrique Arrieta Diaz about 5 years ago

Success plot

#34 Updated by Enrique Arrieta Diaz about 5 years ago

Full Job Times Plot

#35 Updated by Enrique Arrieta Diaz about 5 years ago

Offsite share plot

#36 Updated by Enrique Arrieta Diaz about 5 years ago

Performance Plot


