Bug #2864

configuration bug for shared_port

Added by Parag Mhashilkar over 7 years ago. Updated about 7 years ago.

Status: Closed
Priority: Normal
Assignee: John Weigand
Category: -
Start date: 08/03/2012
% Done: 0%

Description

Hello gWMS team,

I am currently tasked with setting up a local instance of glideinWMS at the Karlsruhe Institute of Technology with the help of Marian Zvada. While troubleshooting the User Schedd node, we ran into a problem with the shared_port feature for the schedds that may be a bug in the installation procedure. Apparently, the default configuration the installer creates to tell the schedds where to find the shared_port_ad file is incorrect.

The condor/condor_local/schedd_jobs#/log/MasterLog reveals that the schedds are looking for the shared_port_ad file in the wrong location.
...
07/31/12 23:45:05 SharedPortEndpoint: failed to open /data/srv/condor/7.6.7/condor_local/schedd_jobs1/lock/shared_port_ad: No such file or directory
07/31/12 23:45:05 SharedPortEndpoint: did not successfully find SharedPortServer address. Will retry in 60s.
...

Searching the installation folder revealed the shared_port_ad to be located in condor/condor_local/log/shared_port_ad instead.
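
The check described above can be sketched as a small shell helper. This is only an illustration: the function names find_shared_port_ad and check_shared_port_ad are hypothetical and not part of glideinWMS or Condor, and the install root passed in is whatever directory holds condor_local on your system.

```shell
#!/bin/sh
# Hypothetical helper: list every shared_port_ad file below an install root.
find_shared_port_ad() {
    root="$1"
    find "$root" -type f -name shared_port_ad 2>/dev/null
}

# Hypothetical helper: report whether the path a schedd expects actually exists.
# Prints "found: PATH" and returns 0, or prints "missing: PATH" and returns 1.
check_shared_port_ad() {
    expected="$1"
    if [ -f "$expected" ]; then
        echo "found: $expected"
    else
        echo "missing: $expected"
        return 1
    fi
}
```

For example, calling check_shared_port_ad with the path from the MasterLog message and then find_shared_port_ad on the condor directory would show the mismatch between the expected and the actual location.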

We fixed the issue by manually adjusting condor/new_schedd_setup.sh as described for the 2.5.1 version in the advanced condor configuration at

http://www.uscms.org/SoftwareComputing/Grid/WMS/glideinWMS/doc.prd/components/condor.html

i.e. adding the _CONDOR_USE_SHARED_PORT, _CONDOR_SHARED_PORT_DAEMON_AD_FILE and _CONDOR_DAEMON_SOCKET_DIR environment variables.
Note that we are using version 2.5.7. The 2.6 changelog does not mention any installer changes related to the shared_port feature, so the issue presumably applies to that version as well.

Sincerely,
Max Fischer

History

#1 Updated by Parag Mhashilkar over 7 years ago

  • Assignee set to John Weigand

#2 Updated by Parag Mhashilkar over 7 years ago

  • Target version changed from v2_6_1 to v2_7_x

#3 Updated by John Weigand over 7 years ago

  • Status changed from New to Feedback
  • Assignee changed from John Weigand to Igor Sfiligoi

Branch: branch_v2plus_2864
Commit: 696ff9b

Added these lines to the new_schedd_setup.sh script when shared port
is requested:
#-- condor shared port attributes --
export _CONDOR_USE_SHARED_PORT=True
export _CONDOR_SHARED_PORT_DAEMON_AD_FILE=$LD/log/shared_port_ad
export _CONDOR_DAEMON_SOCKET_DIR=$LD/log/daemon_sock

Should be good now.

John Weigand

#4 Updated by Igor Sfiligoi over 7 years ago

  • Assignee changed from Igor Sfiligoi to John Weigand

Looks reasonable.

But I think it would be even better if we always add the path lines in the file.
This way, the admins can enable the shared_port later on, if so desired.

Committed: 5ae300d
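
A minimal sketch of that idea, not the committed code: the two path variables are always exported, while _CONDOR_USE_SHARED_PORT is only set on request. The function name setup_shared_port_env and its arguments are hypothetical, and the first argument is assumed to play the role of $LD from the lines quoted in #3.

```shell
#!/bin/sh
# Hypothetical sketch; the real new_schedd_setup.sh may be structured differently.
# $1: directory assumed to correspond to $LD in the script
# $2: "True" to switch the shared port feature on (defaults to "False")
setup_shared_port_env() {
    ld_dir="$1"
    use_shared_port="${2:-False}"

    # Path settings are exported unconditionally, so the feature can be
    # enabled later without having to work out the locations again.
    export _CONDOR_SHARED_PORT_DAEMON_AD_FILE="$ld_dir/log/shared_port_ad"
    export _CONDOR_DAEMON_SOCKET_DIR="$ld_dir/log/daemon_sock"

    # The feature itself is only enabled when explicitly requested.
    if [ "$use_shared_port" = "True" ]; then
        export _CONDOR_USE_SHARED_PORT=True
    fi
}
```

With this layout, an admin who later wants shared_port would only need to set the one remaining variable.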

#5 Updated by John Weigand over 7 years ago

  • Assignee changed from John Weigand to Igor Sfiligoi

Branch: branch_v2plus_2864
Commit: 1778eae

1. Fixed typo in GLOBAL_LOG variable from 'lib' to 'log'.
2. Re-arranged the order of some lines to make them more specific
to shared port use
3. Updated doc and example

Look good to you Igor?
If so, I will mark it as resolved and merge it into v2plus.

John Weigand

#6 Updated by Igor Sfiligoi over 7 years ago

  • Assignee changed from Igor Sfiligoi to John Weigand

Looks good.

Please merge into v2plus.

#7 Updated by John Weigand over 7 years ago

  • Status changed from Feedback to Resolved
  • Target version changed from v2_7_x to v2_6_2

Merged into branch_v2plus

Files affected:
doc/components/condor.html
doc/example-config/multi_schedd/new_schedd_setup.sh
install/glideinWMS_install

John Weigand

#8 Updated by Parag Mhashilkar about 7 years ago

  • Status changed from Resolved to Closed

