Bug #2631

Frontend in downtime affecting other frontends?

Added by Burt Holzman almost 8 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Assignee: Douglas Strain
Category: -
Target version:
Start date: 04/05/2012
Due date:
% Done: 0%
Estimated time:
First Occurred:
Occurs In:
Stakeholders:
Duration:

Description

Igor reports:

I think we just hit a new factory bug!
One of the FEs for this entry is requesting glideins, but getting none!

In particular
entry: CMS_T3_US_Omaha_tusker
fe: EngageVO-submit3-v1_0.main

and later:

We found the culprit;
we had NEBio frontend in downtime, but for whatever reason, the factory stopped submitting for Engage as well!

We removed the downtime for NEBiogrid, and things seem to work now (at least for Engage).

So, we have a workaround for now, but this needs to be fixed.

igor.txt (23.4 KB): Igor's description of problem. Added by Burt Holzman, 04/05/2012 03:25 PM

Related issues

Related to GlideinWMS - Bug #3614: Fix for Bug 2631 broke v2 protocol in v3 frontend (Closed, 03/22/2013)

History

#1 Updated by Douglas Strain over 7 years ago

  • Status changed from Assigned to Feedback
  • Assignee changed from Douglas Strain to Parag Mhashilkar

Currently, the code puts the whole entry into downtime as soon as it finds a security class / frontend that is in downtime.
After that, frontends processed later cannot get any glideins (reqidle is set to zero somewhere).

This exactly matches Igor's description of the problem.

I have corrected this issue by not setting the entry into downtime and instead doing a "continue", as if it were a bad proxy. I think this is the proper solution, but maybe we should also hook this up with the new code that provides feedback to the frontend (add a new ticket for that? not sure if it's been merged).
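
To illustrate the difference, here is a minimal sketch of the two behaviours. It is not the actual factory code: the function names, the (frontend, reqidle) request tuples and the `frontends_in_downtime` set are all hypothetical, chosen only to show how an entry-wide flag leaks into later iterations while a per-frontend "continue" does not.

```python
# Hypothetical sketch, not the real GlideinWMS factory loop.
# requests: list of (frontend_name, requested_idle) in processing order.

def granted_idle_before_fix(requests, frontends_in_downtime):
    """One frontend in downtime flips an entry-wide flag."""
    entry_in_downtime = False
    granted = {}
    for frontend, req_idle in requests:
        if frontend in frontends_in_downtime:
            # The flag is never reset, so it affects every later frontend.
            entry_in_downtime = True
        granted[frontend] = 0 if entry_in_downtime else req_idle
    return granted


def granted_idle_after_fix(requests, frontends_in_downtime):
    """Skip only the frontend that is in downtime, like a bad proxy."""
    granted = {}
    for frontend, req_idle in requests:
        if frontend in frontends_in_downtime:
            granted[frontend] = 0
            continue  # the entry itself stays up for the other frontends
        granted[frontend] = req_idle
    return granted


requests = [("NEBiogrid", 5), ("EngageVO-submit3-v1_0.main", 10)]
down = {"NEBiogrid"}

before = granted_idle_before_fix(requests, down)
after = granted_idle_after_fix(requests, down)
print(before)
print(after)
```

In this sketch the Engage frontend gets nothing before the fix only because it is processed after the frontend in downtime, which matches the symptom Igor saw.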

I have sent this to Parag for review. Krista is also aware of the code, as she rewrote a bunch of broken things a while back.

#2 Updated by Parag Mhashilkar over 7 years ago

  • Status changed from Feedback to Assigned
  • Assignee changed from Parag Mhashilkar to Douglas Strain

Commented on the code changes separately.

#3 Updated by Parag Mhashilkar over 7 years ago

  • Target version changed from v2_7_x to v2_6_2

#4 Updated by Douglas Strain over 7 years ago

  • Status changed from Assigned to Resolved

Got rid of a redundant if statement and merged into branch_v2plus master

#5 Updated by Parag Mhashilkar about 7 years ago

  • Status changed from Resolved to Closed

