Feature #11408

Provide option to set CPUs to the number provided by the job manager (instead of the HW one)

Added by Marco Mambelli over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Category: Glidein
Target version:
Start date: 01/12/2016
Due date:
% Done: 0%
Estimated time:
Stakeholders:
Duration:

Description

GLIDEIN_CPUS controls the number of CPUs set in the glidein (NUM_CPUS):
- if not set, the default is 1
- a value >0 sets that number
- 0 or auto: detect the hardware CPUs and use that number

Some job managers may override this (notably HTCondor, if the slot used by the glidein does not cover the whole node or if NUM_CPUS is set in it).
There should be an option to set the CPUs to the number offered by the job manager, either by:
1. changing the behavior of auto (the old behavior could be kept with a new keyword 'hardware'), or
2. adding a new option 'soft' that detects the offered CPUs
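
The value handling described above, with a separate keyword for the offered CPUs (option 2), could be sketched roughly as follows. This is only an illustration, not the code in the repository: the function names, the SLOT_CPUS_HINT variable, the keyword 'slot', and the fallback to the hardware count when nothing is offered are all assumptions made for the example.

```shell
# Hypothetical helper: count logical CPUs listed in a cpuinfo-style file.
detect_hw_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Hypothetical helper: CPUs offered by the job manager, empty if unknown.
# SLOT_CPUS_HINT is a stand-in for whatever the job manager really exposes.
detect_slot_cpus() {
    echo "${SLOT_CPUS_HINT:-}"
}

# Map a GLIDEIN_CPUS value to a CPU count.
# $1: the GLIDEIN_CPUS value, $2: optional cpuinfo path (for testing).
resolve_glidein_cpus() {
    local value="$1" cpuinfo="${2:-/proc/cpuinfo}" offered
    case "$value" in
        "")     echo 1 ;;                       # not set: default is 1
        0|auto) detect_hw_cpus "$cpuinfo" ;;    # hardware detection
        slot)                                   # assumed keyword for offered CPUs
            offered=$(detect_slot_cpus)
            if [ -n "$offered" ]; then
                echo "$offered"
            else
                detect_hw_cpus "$cpuinfo"       # assumed fallback
            fi ;;
        *[!0-9]*)                               # anything else non-numeric
            echo "invalid GLIDEIN_CPUS: $value" >&2
            return 1 ;;
        *)      echo "$value" ;;                # explicit positive number
    esac
}
```

Keeping auto tied to the hardware count and adding a new keyword leaves existing configurations unchanged, which is the main argument for option 2 over option 1.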

History

#1 Updated by Marco Mambelli over 4 years ago

The code is in creation/web_base/glidein_cpus_setup.sh:

elif [ "${GLIDEIN_CPUS}" = "0" ]; then
    # detect the number of cores
    core_proc=`awk -F: '/^physical/ && !ID[$2] { P++; ID[$2]=1 }; /^physical/ { N++ };  END { print N, P }' /proc/cpuinfo`
    cores=`echo "$core_proc" | awk -F' ' '{print $1}'`
    if [ "$cores" = "" ]; then
        # Old style, no multiple cores or hyperthreading
        cores=`grep processor /proc/cpuinfo  | wc -l`
    fi
    GLIDEIN_CPUS="$cores" 
fi

#2 Updated by Marco Mambelli over 4 years ago

  • Status changed from New to Feedback

Added the ability to select 'slot' to automatically detect the CPUs made available by an HTCondor slot.
Code is in v3/11408.
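
One way such slot detection can work is by reading the machine ClassAd that HTCondor exposes to the job through the _CONDOR_MACHINE_AD environment variable and taking its Cpus attribute. The sketch below illustrates that idea only; the function name is hypothetical and this is not necessarily how the code in v3/11408 does it.

```shell
# Hypothetical helper: print the Cpus attribute of an HTCondor machine ad.
# $1: optional path to the ad file; defaults to $_CONDOR_MACHINE_AD.
# Returns non-zero if the file is missing or has no Cpus attribute.
slot_cpus_from_machine_ad() {
    local ad="${1:-${_CONDOR_MACHINE_AD:-}}" cpus
    [ -n "$ad" ] && [ -r "$ad" ] || return 1
    # Match only the exact attribute name, not TotalCpus or CpusUsage.
    cpus=$(awk -F= '/^Cpus[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$ad")
    [ -n "$cpus" ] && echo "$cpus"
}
```

A caller would fall back to hardware detection when the function returns non-zero, for example when the glidein is not running inside an HTCondor slot.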

#3 Updated by Marco Mambelli over 4 years ago

  • Assignee changed from Marco Mambelli to Parag Mhashilkar
  • Target version set to v3_2_13

#4 Updated by Parag Mhashilkar over 4 years ago

  • Assignee changed from Parag Mhashilkar to Marco Mambelli

This is useful. Maybe we can expand this to other commonly used batch systems?

See glidein_sitewms_setup.sh for how to detect the batch system in use.

Please verify the docs before changing the code, or check with some sites (or submit jobs to those sites) to inspect the environment.

Detecting cores from other batch systems:

#5 Updated by Marco Mambelli over 4 years ago

  • Assignee changed from Marco Mambelli to Parag Mhashilkar

Implemented for almost all suggested LRMs.
SGE is the only one missing: I was unable to find a solution or someone with SGE able to help.
Testing was done by emulating the environment variables (except for HTCondor, which was available to test).
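
For illustration, testing by emulating the environment can look like the sketch below. The variable names (SLURM_CPUS_ON_NODE, PBS_NUM_PPN, LSB_DJOB_NUMPROC) are chosen for the example from the Slurm, Torque/PBS and LSF documentation; they are an assumption and not necessarily the ones the merged code reads. The function name is hypothetical as well.

```shell
# Hypothetical helper: print the first numeric CPU count found in a list of
# batch-system environment variables; return non-zero if none is usable.
lrm_slot_cpus() {
    local var val
    for var in SLURM_CPUS_ON_NODE PBS_NUM_PPN LSB_DJOB_NUMPROC; do
        eval "val=\${$var:-}"
        case "$val" in
            ""|*[!0-9]*) continue ;;            # unset or not a plain number
            *) echo "$val"; return 0 ;;
        esac
    done
    return 1
}

# Emulate a batch environment in a subshell so the real one stays untouched.
( unset SLURM_CPUS_ON_NODE PBS_NUM_PPN LSB_DJOB_NUMPROC
  export PBS_NUM_PPN=8
  lrm_slot_cpus )
```

Emulation only checks the parsing logic. It cannot confirm that a site's batch system really sets these variables, which is why checking the docs or real sites was suggested above.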

#6 Updated by Parag Mhashilkar over 4 years ago

  • Assignee changed from Parag Mhashilkar to Marco Mambelli

Looks ok to merge

#7 Updated by Parag Mhashilkar over 4 years ago

  • Status changed from Feedback to Resolved

merged

#8 Updated by Parag Mhashilkar over 4 years ago

  • Status changed from Resolved to Closed

