Support #21939

glidein_off problems in FIFE

Added by Marco Mambelli 7 months ago. Updated 20 days ago.

Status: New
Priority: High
Category: -
Target version:
Start date: 02/20/2019
Due date:
% Done: 0%
Estimated time:
Stakeholders:
Duration:

Description

glidein_off does not seem to work correctly. The original report:

Following up on yours and Joe's conversation, I tried to use glidein_off recently to kill a glidein at Colorado from the FIFE FE machine, gpfrontend01.  The first couple of error messages helped me add necessary arguments, but the one I'm getting now is not really telling me what I'll need to do to resolve it (or if there's some other issue).  Here is my command, and the resultant message:

[rexbatch@gpfrontend01 ~]$ glidein_off -n slot1@glidein_30707_429132676@lnxfarm321.colorado.edu -d /var/lib/gwms-frontend/vofrontend -g fermilab

ERROR: Exception msg Error executing htcondor query with constraint (GLIDECLIENT_Name=?="gpfrontend01-fnal-gov_gWMSFrontend.fermilab")&&((Name=?="slot1@glidein_30707_429132676@lnxfarm321.colorado.edu")) and format_list [('GLIDEIN_MASTER_NAME', 's'), ('GLIDEIN_COLLECTOR_NAME', 's'), ('Name', 's')]: Error querying pool default using python bindings: Failed communication with collector.

So it looks like some collector issue?  

Thanks,
Shreyas

FIFE gave Marco Mascheroni access to the gpfrontend01.fnal.gov host.
There is a ServiceNow ticket:
https://fermi.service-now.com/nav_to.do?uri=%2Fsc_req_item.do%3Fsys_id%3Dadb11dd7dbb26b480bc630dc7c9619c2%26sysparm_stack%3Dsc_req_item_list.do%3Fsysparm_query%3Dactive%3Dtrue

It is important to at least follow up on the support ticket, understand what happened, and give FIFE some feedback.
The GlideinWMS work can then be scheduled at a lower priority if the impact is limited.
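
The error above reports a failed query against the default pool, so one possible first step is to run the same query by hand with condor_status and see whether the collector answers. A minimal sketch, assuming Python 3 and that condor_status is on the PATH of gpfrontend01; the helper names build_constraint and condor_status_command are hypothetical and are not part of glidein_off. It only prints the command, it does not contact any collector:

```python
import shlex

# Attributes listed in the format_list of the error message.
ATTRS = ["GLIDEIN_MASTER_NAME", "GLIDEIN_COLLECTOR_NAME", "Name"]


def build_constraint(client_name, slot_name):
    """Rebuild the ClassAd constraint shown in the error message."""
    return '(GLIDECLIENT_Name=?="{0}")&&((Name=?="{1}"))'.format(
        client_name, slot_name
    )


def condor_status_command(constraint, attrs, pool=None):
    """Return a shell-quoted condor_status command line.

    With pool=None the default collector from the local HTCondor
    configuration is queried, which matches "pool default" in the error.
    """
    cmd = ["condor_status"]
    if pool:
        cmd += ["-pool", pool]
    cmd += ["-constraint", constraint, "-af"] + list(attrs)
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    constraint = build_constraint(
        "gpfrontend01-fnal-gov_gWMSFrontend.fermilab",
        "slot1@glidein_30707_429132676@lnxfarm321.colorado.edu",
    )
    print(condor_status_command(constraint, ATTRS))
```

If the printed command also fails to reach the collector, the problem is probably in the pool configuration or connectivity rather than in glidein_off itself. If it succeeds, passing the collector host explicitly through the pool argument may help compare what glidein_off resolves as the default pool.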

History

#1 Updated by Marco Mascheroni 4 months ago

  • Target version changed from v3_5 to v3_5_1

#2 Updated by Marco Mascheroni 20 days ago

  • Target version changed from v3_5_1 to v3_5_2

