Feature #2239

Remove multiple glideins at a time

Added by Parag Mhashilkar almost 8 years ago. Updated over 6 years ago.

Status: New
Priority: Low
Assignee: Parag Mhashilkar
Category: Factory
Target version:
Start date: 12/01/2011
Due date:
% Done: 0%
Estimated time:
Stakeholders:
Duration:

Description

Right now we remove glideins one at a time, which is quite inefficient
for a busy factory. After talking to Dan (see below), I think we can
make this more efficient by removing several glideins with a single
condor_rm call.

On 11/30/11 7:55 PM, Parag Mhashilkar wrote:

Thanks a lot for the detailed explanation.

So it looks like it depends on whether the transaction succeeds: if marking
any job as removed fails, even a single job, the transaction will fail and the
exit status will be non-zero.

So I can still change the removal to handle multiple glideins with one command,
and if it exits with a non-zero status I can revert to removing one job at a time.
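
The fallback described above could be sketched roughly as follows. This is only an illustration in Python, not the factory's actual code: `remove_glideins`, `run_command`, and the `Runner` type are hypothetical names, and the job ids are assumed to be strings in the form condor_rm accepts on its command line. Per Dan's reply further down, an exit status of 0 only says the transaction was committed; ids that did not exist or were already marked for removal are ignored.

```python
import subprocess
from typing import Callable, List, Sequence

# Hypothetical type: a runner takes an argv list and returns the exit status.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run argv and return its exit status (assumes condor_rm is on PATH)."""
    return subprocess.call(list(argv))


def remove_glideins(job_ids: Sequence[str], run: Runner = run_command) -> List[str]:
    """Try one bulk condor_rm; on a non-zero exit, retry one job at a time.

    Returns the job ids whose individual condor_rm call also exited non-zero.
    """
    if not job_ids:
        return []
    # One command for all jobs, e.g. "condor_rm 4 5 6".
    if run(["condor_rm", *job_ids]) == 0:
        return []
    # Bulk call failed: revert to the current one-job-at-a-time behavior.
    return [job for job in job_ids if run(["condor_rm", job]) != 0]
```

Passing the runner in as a parameter is a design choice for this sketch only; it lets the fallback logic be exercised without a running schedd.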

Also, one key factor seems to be "marked for removal" vs. "actually removed".
Marking for removal can still mean the actual removal fails and the job goes
back to the held state. What happens in such scenarios? I am guessing the
behavior would be: one (or all) jobs are marked for removal, condor_rm returns
0, but the actual removal of one of the jobs fails and only that particular job
goes to the held state. Is my assumption correct?

Yes.

--Dan

-Parag

-----Original Message-----
From: Dan Bradley
Sent: Wednesday, November 30, 2011 4:52 PM
To: Parag Mhashilkar
Subject: Re: condor_rm question

On 11/30/11 3:55 PM, Parag Mhashilkar wrote:

Hi Dan,

Right now we remove glideins 1 condorg job at a time with condor_rm. I
am wondering if doing a bulk remove has any side effects. Is condor_rm
transactional?

CASE 1:
condor_rm 4 5 6

CASE 2:
condor_rm -constraint "clusterid>3 && clusterid<7"

CASE 3:
condor_rm -constraint "clusterid==3 || clusterid==5 || clusterid==6"

In the above cases, if removing 5 fails, what happens to the status of 4 and 6?

The setting of job status to 'removed' happens in a transaction, so
either all jobs that can be removed will be marked for removal or none
of them will be. However, jobs that do not exist or which are already
marked for removal will be ignored.

Also, what about the condor_rm exit status? How are constraints with ||
treated when condor can remove the jobs matching some clauses while
others fail?

The exit status is 0 if at least one job was successfully removed. If
there is a communication failure or other failure to commit the
transaction, then no jobs will be successfully removed, so the exit
status will be non-zero. If the transaction is successfully committed
and this is communicated successfully to condor_rm, then the exit
status will be 0 and all matching jobs that existed and were not
already marked for removal will have been marked for removal. There is
no special treatment of constraints containing || clauses. All that
matters is whether the transaction was successfully committed, marking
all matching jobs for removal.

--Dan
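
For reference, a constraint in the style of CASE 3 could be generated from a list of cluster ids along these lines. This is a minimal sketch, not existing factory code: `cluster_constraint` and `condor_rm_argv` are hypothetical helper names, and the attribute is spelled `clusterid` only to match the examples in the thread.

```python
from typing import Iterable, List


def cluster_constraint(cluster_ids: Iterable[int]) -> str:
    """Build a CASE 3 style expression, e.g. "clusterid==3 || clusterid==5"."""
    # De-duplicate and sort so the expression is stable across calls.
    ids = sorted({int(c) for c in cluster_ids})
    if not ids:
        raise ValueError("at least one cluster id is required")
    return " || ".join(f"clusterid=={c}" for c in ids)


def condor_rm_argv(cluster_ids: Iterable[int]) -> List[str]:
    """Return the argv for a single condor_rm -constraint call."""
    return ["condor_rm", "-constraint", cluster_constraint(cluster_ids)]
```

Building the command as an argv list rather than a shell string avoids having to quote the `||` and `==` operators for the shell.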

History

#1 Updated by Parag Mhashilkar almost 8 years ago

  • Assignee set to Parag Mhashilkar

#2 Updated by Parag Mhashilkar over 7 years ago

  • Target version set to v2_7_x

#3 Updated by Burt Holzman over 7 years ago

  • Priority changed from Normal to Low

#4 Updated by Parag Mhashilkar over 6 years ago

  • Target version changed from v2_7_x to v3_x

