Fermilab/HTCondor Minutes December 11, 20156

Attendees:
Tony, Marco, Kevin, Steve, Todd T., Zach

Agenda and Notes:

gridmanager/gahp hangs:

Clarification about my START expression from last month (to make one or more CPUs unavailable to regular jobs in a partitionable slot configuration). I did not fully understand Zach's reply.

  • will take offline, lots of details, easy to get bogged down

Mismatch between the Name of resources in condor_status and the RemoteHost in condor_q when using partitionable slots for glideins:

What happens when both NUM_SLOTS and NUM_SLOTS_TYPE_XX are defined and they are inconsistent? Is NUM_SLOTS just ignored? What if it has a different value? Is the order important?

      NUM_SLOTS=1
      NUM_SLOTS_TYPE_1=1
      NUM_SLOTS_TYPE_2=2
  • I'm working on ways to count the number of machines, slots and cores (cpus) in a glidein setting.
    • Partitionable slots make things tricky. I'm thinking about how to treat a partitionable slot whose dynamic slots use only some of its CPUs:
      • count it as Idle if it has some Cpus left
      • count it as Running if it has some running children (dynamic slots)
    • To count the CPUs, is it correct to use the following (in the partitionable slot)? - Yes
      • Idle: Cpus
      • Running: TotalCpus - Cpus
    • Which is the best way to single out partitionable slots? - Either one should be fine
      • PartitionableSlot true
      • SlotType Partitionable
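
The counting rules above can be sketched in a few lines of Python. This is only an illustration, not the code used in the glidein tooling: slot ads are modeled as plain dicts holding the attributes named in these notes (Cpus, TotalCpus, PartitionableSlot, SlotType), and the function names and sample ads are made up for the example. Dynamic slots are skipped on the assumption that their CPUs are already covered by TotalCpus - Cpus in the parent; static slots are not handled.

```python
def is_partitionable(ad):
    """True if the ad describes a partitionable slot (either test should work)."""
    return ad.get("PartitionableSlot") is True or ad.get("SlotType") == "Partitionable"


def summarize(ads):
    """Count idle/running CPUs and slots, looking at partitionable slots only."""
    summary = {"idle_cpus": 0, "running_cpus": 0, "idle_slots": 0, "running_slots": 0}
    for ad in ads:
        if not is_partitionable(ad):
            continue  # assumption: dynamic children are accounted for by the parent
        idle = ad["Cpus"]                       # Idle: Cpus
        running = ad["TotalCpus"] - ad["Cpus"]  # Running: TotalCpus - Cpus
        summary["idle_cpus"] += idle
        summary["running_cpus"] += running
        # A slot can count as both Idle and Running when only some CPUs are used.
        summary["idle_slots"] += 1 if idle > 0 else 0
        summary["running_slots"] += 1 if running > 0 else 0
    return summary


# Hypothetical sample ads: two glideins, one partly used and one fully used.
example_ads = [
    {"Name": "slot1@glidein_a", "PartitionableSlot": True,
     "SlotType": "Partitionable", "Cpus": 3, "TotalCpus": 8},
    {"Name": "slot1_1@glidein_a", "SlotType": "Dynamic", "Cpus": 5, "TotalCpus": 8},
    {"Name": "slot1@glidein_b", "PartitionableSlot": True,
     "SlotType": "Partitionable", "Cpus": 0, "TotalCpus": 4},
]

print(summarize(example_ads))
```

With the sample ads, the first glidein contributes 3 idle and 5 running CPUs and counts as both an Idle and a Running slot; the second contributes 4 running CPUs and counts only as Running.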