Support #2942

frontend and glidein pilot use of multiple schedds in user collector

Added by John Weigand over 7 years ago. Updated over 6 years ago.

Status: Assigned
Priority: Low
Assignee: Parag Mhashilkar
Category: Frontend
Target version:
Start date: 08/30/2012
Due date:
% Done: 0%
Estimated time:
Stakeholders:
Duration:

Description

I discovered this anomaly while testing another bug.

If the frontend has 2 schedds defined in its configuration
and the user collector has 3+ schedds in service,
- the frontend correctly only queries those 2 schedds
- the glidein pilots, however, pull jobs from all 3 schedds
attached to the user collector.

This occurs both with and without schedd shared port in use.

Maybe this is an undocumented feature or a bug.

The potential problems I see are:
1. the frontend is not aware of the total demand for services (glidein pilots).
2. the frontend, querying only 2 of the schedds, potentially sees
no demand while the pilots are merrily pulling jobs from the other schedds.
3. maybe (this I am totally not sure of) the additional schedds are intended
for another purpose and not intended for glideinWMS use.
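A minimal sketch of how an admin might spot this discrepancy, assuming the two schedd name lists have already been obtained by some other means (for example, by reading the frontend configuration and listing the schedds known to the user collector). The function names, the example schedd names, and the case-insensitive name comparison are assumptions made for illustration; nothing here is part of glideinWMS.

```python
from typing import Dict, Iterable, List


def find_unmonitored_schedds(
    frontend_schedds: Iterable[str],
    collector_schedds: Iterable[str],
) -> List[str]:
    """Return schedds present in the collector but absent from the frontend config.

    Names are compared case-insensitively (an assumption of this sketch).
    """
    configured = {name.strip().lower() for name in frontend_schedds}
    return sorted(
        name for name in collector_schedds
        if name.strip().lower() not in configured
    )


def hidden_idle_jobs(idle_by_schedd: Dict[str, int], unmonitored: Iterable[str]) -> int:
    """Sum the idle jobs on schedds the frontend never queries."""
    return sum(idle_by_schedd.get(name, 0) for name in unmonitored)


# Hypothetical names: two schedds in the frontend config, three in the collector.
frontend = ["schedd_1@submit.example.org", "schedd_2@submit.example.org"]
collector = frontend + ["schedd_3@submit.example.org"]

unmonitored = find_unmonitored_schedds(frontend, collector)
idle = {"schedd_3@submit.example.org": 25}
print(unmonitored, hidden_idle_jobs(idle, unmonitored))
```

Any schedd reported here can hand jobs to the pilots while the frontend sees none of its demand.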

If this behavior is ok, then my question would be: what is the purpose of
allowing the frontend to choose specific schedds to query for jobs?

This could be just a documentation issue or code issue.

Just thought I would bring this up for discussion.

John Weigand

History

#1 Updated by Igor Sfiligoi over 7 years ago

  • Tracker changed from Bug to Support
  • Status changed from New to Feedback

This is not really specific to the multiple-schedd (on one node) setup;
it is a generic Condor problem/feature.

If you have multiple schedd nodes, the situation is the same.

The FE admin just has to keep up with the evolution of the Condor pool.

The reason we have to specify the schedds in the FE config is due to the security mechanisms;
there is no way for the FE to securely talk to any schedd unless it has its DN.
And the FE cannot just discover it on its own.

If you can find a smart way to document this, sure go ahead.
But at least in my head, this is to be expected.

#2 Updated by Parag Mhashilkar over 7 years ago

This can result in jobs from the wrong schedds consuming the pool. I wonder if we can make the startd's requirements expression match jobs only from the list of schedds configured in the frontend.
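A rough sketch of what generating such a clause could look like. It assumes the job ad carries its schedd name in an attribute, here called `JobScheddName`, which is a hypothetical name used only for illustration. `stringListMember` is a standard ClassAd function; the helper names, the `enabled` flag, and the validation rules are assumptions of this sketch, not existing glideinWMS code.

```python
from typing import Iterable


def schedd_restriction_clause(
    schedd_names: Iterable[str],
    job_attr: str = "JobScheddName",  # hypothetical job attribute
    enabled: bool = False,            # opt-in: no restriction unless requested
) -> str:
    """Build a ClassAd clause that matches only jobs from the listed schedds."""
    if not enabled:
        return "True"
    names = sorted({name.strip() for name in schedd_names if name.strip()})
    if not names:
        return "True"
    for name in names:
        # Reject characters that would break the quoted, comma-separated list.
        if any(ch in name for ch in ',"') or any(ch.isspace() for ch in name):
            raise ValueError("unsupported character in schedd name: %r" % name)
    return 'stringListMember(TARGET.%s, "%s")' % (job_attr, ",".join(names))


def restrict_start(start_expr: str, clause: str) -> str:
    """AND the clause onto an existing startd START expression."""
    return "(%s) && (%s)" % (start_expr, clause)


# Hypothetical schedd names for illustration.
clause = schedd_restriction_clause(
    ["schedd_2@submit.example.org", "schedd_1@submit.example.org"],
    enabled=True,
)
print(restrict_start("True", clause))
```

The clause is opt-in in this sketch, so the startd expression stays unchanged unless the restriction is explicitly requested.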

#3 Updated by Burt Holzman about 7 years ago

  • Assignee set to Igor Sfiligoi

Eh, never mind -- I saw the wrong view of this ticket. Reassigning to Parag to see if he can think of an implementation for the startd requirements.

#4 Updated by Burt Holzman about 7 years ago

  • Assignee changed from Igor Sfiligoi to Parag Mhashilkar

#5 Updated by Burt Holzman about 7 years ago

  • Priority changed from Normal to Low

#6 Updated by Parag Mhashilkar about 7 years ago

  • Status changed from Feedback to Assigned

#7 Updated by Igor Sfiligoi about 7 years ago

I think essentially 100% of the time when we have a discrepancy, it is a configuration problem.
I.e. I am not aware of any use case where FE admins do this on purpose.
Although I can see possible use cases, at least in theory.

However, adding even more cruft to the startd expression will make the negotiation slower,
which we definitely don't want!

If you still think this is useful, please make sure it can be turned off.
(and I would even argue it should be turned off by default, and only enabled if requested)

My 2c,
Igor

#8 Updated by Parag Mhashilkar over 6 years ago

  • Target version changed from v2_7_x to v3_x

