Bug #5241
Frontend credential selection plugin ProxyUserMapWRecycling seems to be broken
Description
On Jan 22, 2014, at 2:34 PM, Igor Sfiligoi wrote:
BTW: We were using ProxyUserMapWRecycling.
I switched to ProxyAll now, and things seem to work as expected there.
So this is less urgent than I originally thought it was.
But the bug is real, and should be addressed.
Let me know if I should create a ticket for it, or if you are doing it.
Thanks,
Igor
On 01/22/2014 12:06 PM, Igor Sfiligoi wrote:
Hi Parag and Burt.
I think we have significant problems with the v3 FE for CMS AnaOps.
Looks like it is setting max_run to a ridiculously low number.
I think the bug is in
git blame glideinFrontendPlugins.py
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 394) # Out of the max_run glideins,
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 395) # Allocate proportionally out of the total jobs
3da1255c (Parag Mhashilkar 2013-05-30 14:23:22 -0500 396) if (params_obj is not None):
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 397) this_max=self.num_user_jobs[user]*params_obj.max_run_glideins/self.total_jobs
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 398) this_idle=self.num_user_jobs[user]*params_obj.min_nr_glideins/self.total_jobs
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 399) if (this_max<=0):
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 400) this_max=1
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 401) if (this_idle<=0):
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 402) this_idle=1
d5fb79af (Doug Strain 2012-01-11 17:08:44 -0600 403) cel['proxy'].add_usage_details(this_idle,this_max)
I.e., since CMS has a large number of users but only 10 pilot proxies, the above logic does not make sense.
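A minimal sketch of the arithmetic in the blamed lines, to illustrate the suspicion above. The function name per_user_share, the workload, and the values of MAX_RUN, MIN_IDLE, and N_CREDENTIALS are hypothetical. It is also an assumption that each credential ends up carrying a single user's share; the actual plugin may behave differently.

```python
def per_user_share(num_user_jobs, total_jobs, max_run_glideins, min_nr_glideins):
    """Proportional share for one user, floored at 1 as in the blamed lines."""
    # "//" reproduces what "/" did for ints under Python 2.
    this_max = num_user_jobs * max_run_glideins // total_jobs
    this_idle = num_user_jobs * min_nr_glideins // total_jobs
    # Same floor as the original code: never request less than 1.
    return max(this_idle, 1), max(this_max, 1)


# Hypothetical workload: 500 users with 10 jobs each.
jobs_per_user = {"user%03d" % i: 10 for i in range(500)}
total_jobs = sum(jobs_per_user.values())

MAX_RUN = 10000     # hypothetical frontend-wide max_run_glideins
MIN_IDLE = 100      # hypothetical frontend-wide idle setting
N_CREDENTIALS = 10  # assumption: far fewer credentials than users

idle, max_run = per_user_share(
    jobs_per_user["user000"], total_jobs, MAX_RUN, MIN_IDLE
)

# Assumption: each credential carries only one user's share.
total_requested_max = N_CREDENTIALS * max_run
print(idle, max_run, total_requested_max)
```

With these made-up numbers, each share is 20 running glideins, so 10 credentials would together request 200 instead of the configured 10000. The shortfall grows with the ratio of users to credentials.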
Can you please have a quick look at it and either confirm or tell me I am completely off?
I have never had a look at that code before.
Thanks,
Igor
-------- Original Message --------
Subject: Re: [Cms-wms-support] Upgrading the AnaOps FE to v3
Date: Wed, 22 Jan 2014 11:50:53 -0800
From: Igor Sfiligoi <sfiligoi@fnal.gov>
To: Jeff Dost <jdost@ucsd.edu>
Uhm... looks like there are two values now:
[1148] frontend@glidein-frontend ~$ condor_status -any 'CMS_T2_US_Florida_iogw1@v3_0@SDSC@UCSD-v6_0.main' -l |grep -i request |grep -v Expr
GlideClientMonitorGlideinsRequestIdle = 20
GlideClientMonitorGlideinsRequestMaxRun = 744
GlideFactoryMonitorRequestedIdle = 10
GlideFactoryMonitorRequestedMaxGlideins = 90
And, one makes sense, and the other doesn't!
Digging deeper now.
Igor
Here is the FE log
[2014-01-22 11:42:49,337] INFO: Jobs in schedd queues                                | Glideins          | Request
[2014-01-22 11:42:49,338] INFO: Idle (match   eff   old  uniq )   Run ( here   max ) | Total  Idle   Run | Idle MaxRun Down Factory
[2014-01-22 11:42:49,418] INFO:   634(16799   632   627     0)    4533(   74 10000)  |    76     2    74 |   20    744 Up   CMS_T2_US_Florida_iogw1@v3_0@SDSC@gfactory-1.t2.ucsd.edu
On 01/22/2014 11:39 AM, Jeff Dost wrote:
Ok, I found more info,
I see the following in the factory info log:
[2014-01-22 11:28:18,705] INFO: Additional idle glideins not needed, have met request max_glideins limits 4, not submitting
And look here: after the upgrade, Max requested (the red line) dropped from roughly #running all the way down to 4!
Igor, correct me if I am wrong, but if I understand correctly, the frontend should be generating this "max requested" value, so I believe it is a problem on the frontend side.
Jeff
History
#1 Updated by Marco Mambelli almost 6 years ago
- Assignee changed from Parag Mhashilkar to Marco Mambelli
#2 Updated by Burt Holzman almost 6 years ago
- Priority changed from Normal to Low
#3 Updated by Parag Mhashilkar over 5 years ago
- Target version changed from v3_2_4 to v3_2_5
#4 Updated by Parag Mhashilkar over 5 years ago
- Target version changed from v3_2_5 to v3_2_6
#5 Updated by Burt Holzman over 5 years ago
- Target version changed from v3_2_6 to v3_2_x
#6 Updated by Parag Mhashilkar almost 5 years ago
- Target version changed from v3_2_x to v3_2_9
#7 Updated by Parag Mhashilkar almost 5 years ago
- Stakeholders updated
#8 Updated by Brian Bockelman almost 5 years ago
- Stakeholders updated
I'm not sure why this is marked as CMS & OSG.
OSG has never used this plugin. CMS quit using this plugin a few months ago.
Removing the OSG stakeholder.
#9 Updated by Parag Mhashilkar almost 5 years ago
- Target version changed from v3_2_9 to v3_2_x
#10 Updated by Parag Mhashilkar about 4 years ago
- Target version changed from v3_2_x to v3_2_13
#11 Updated by Parag Mhashilkar almost 4 years ago
- Target version changed from v3_2_13 to v3_2_14
#12 Updated by Parag Mhashilkar over 3 years ago
- Target version changed from v3_2_14 to v3_2_15
#13 Updated by Parag Mhashilkar over 3 years ago
- Target version changed from v3_2_15 to v3_2_16
#14 Updated by Parag Mhashilkar over 3 years ago
- Target version changed from v3_2_16 to v3_x
#15 Updated by Parag Mhashilkar over 3 years ago
- Target version changed from v3_x to v3_2_x
#16 Updated by Marco Mambelli over 1 year ago
- Target version changed from v3_2_x to v3_4_x
#17 Updated by Marco Mambelli over 1 year ago
- Target version changed from v3_4_x to v3_5_x
#18 Updated by Marco Mambelli 2 months ago
- Target version changed from v3_5_x to v3_6_x