Bug #25000

HTCondor 8.8.10 uses shared_port_port

Added by Marco Mambelli about 1 month ago. Updated 21 days ago.

Status: Closed
Priority: Normal
Category: -
Target version:
Start date: 09/22/2020
Due date:
% Done: 0%
Estimated time:
First Occurred:
Occurs In:
Stakeholders:
Duration:

Description

Due to a bug, previous HTCondor versions ignored SHARED_PORT_PORT.
Starting with 8.8.10, HTCondor honors it, and it defaults to $(COLLECTOR_PORT).
This leaves the Factory configuration inconsistent, even though everything seems to work.

# condor_config_val -dump | grep -i shared
AUTO_INCLUDE_SHARED_PORT_IN_DAEMON_LIST = true
COLLECTOR.USE_SHARED_PORT = False
COLLECTOR_USES_SHARED_PORT = False
DAEMON_LIST = MASTER,  COLLECTOR, NEGOTIATOR, SCHEDD, SHARED_PORT, SCHEDDGLIDEINS2, SCHEDDGLIDEINS3, SCHEDDGLIDEINS4, SCHEDDGLIDEINS5
DAGMAN_USE_SHARED_PORT = false
HAD_USE_SHARED_PORT = false
MASTER.USE_SHARED_PORT = true
MAX_SHARED_PORT_LOG = $(MAX_DEFAULT_LOG)
REPLICATION_USE_SHARED_PORT = $(HAD_USE_SHARED_PORT)
SCHEDD.USE_SHARED_PORT = true
SHADOW.USE_SHARED_PORT = true
SHARED_PORT = $(LIBEXEC)/condor_shared_port
SHARED_PORT_ARGS = -p 9615
SHARED_PORT_DAEMON_AD_FILE = $(LOG)/shared_port_ad
SHARED_PORT_DEBUG =
SHARED_PORT_DEFAULT_ID =
SHARED_PORT_LOG = $(LOG)/SharedPortLog
SHARED_PORT_MAX_FILE_DESCRIPTORS = 4096
SHARED_PORT_MAX_WORKERS = 1000
SHARED_PORT_PORT = $(COLLECTOR_PORT)
USE_SHARED_PORT = true
# grep -i shared /etc/condor/config.d/*
/etc/condor/config.d/00_gwms_factory_general.config:#--  With glideins, there is nothing shared
/etc/condor/config.d/00_gwms_factory_general.config:#-- LOCK will be redefined by secondary daemons, so shared files must refer to log
/etc/condor/config.d/00_gwms_factory_general.config:SHARED_PORT_DAEMON_AD_FILE = $(LOG)/shared_port_ad
/etc/condor/config.d/01_gwms_factory_collectors.config:#-- Collectors are behind shared port starting in HTCondor 8.4
/etc/condor/config.d/01_gwms_factory_collectors.config:# Disable the use of shared port by collector
/etc/condor/config.d/01_gwms_factory_collectors.config:COLLECTOR_USES_SHARED_PORT=False
/etc/condor/config.d/01_gwms_factory_collectors.config:# In HTCondor 8.6 this seems to be needed as well (otherwise the collector uses shared port)
/etc/condor/config.d/01_gwms_factory_collectors.config:COLLECTOR.USE_SHARED_PORT=False
/etc/condor/config.d/02_gwms_factory_schedds.config:#--  Enable shared_port_daemon
/etc/condor/config.d/02_gwms_factory_schedds.config:MASTER.USE_SHARED_PORT = True
/etc/condor/config.d/02_gwms_factory_schedds.config:SHADOW.USE_SHARED_PORT = True
/etc/condor/config.d/02_gwms_factory_schedds.config:SCHEDD.USE_SHARED_PORT = True
/etc/condor/config.d/02_gwms_factory_schedds.config:SHARED_PORT_MAX_WORKERS = 1000
/etc/condor/config.d/02_gwms_factory_schedds.config:SHARED_PORT_ARGS = -p 9615
/etc/condor/config.d/02_gwms_factory_schedds.config:DAEMON_LIST = $(DAEMON_LIST), SHARED_PORT

And from /var/log/condor/SharedPortLog:

09/21/20 18:01:08    /etc/condor/condor_config.local
09/21/20 18:01:08 config Macros = 178, Sorted = 178, StringBytes = 7776, TablesBytes = 6512
09/21/20 18:01:08 CLASSAD_CACHING is ENABLED
09/21/20 18:01:08 Daemon Log is logging: D_ALWAYS D_ERROR
09/21/20 18:01:08 Daemoncore: Listening at <0.0.0.0:9615> on TCP (ReliSock).
09/21/20 18:01:08 DaemonCore: command socket at <131.225.154.184:9615?addrs=131.225.154.184-9615&noUDP>
09/21/20 18:01:08 DaemonCore: private command socket at <131.225.154.184:9615?addrs=131.225.154.184-9615>
09/21/20 18:01:08 main_init() called
09/21/20 18:01:08 About to update statistics in shared_port daemon ad file at /var/log/condor/shared_port_ad :

As the snippets show, the shared_port daemon starts on port 9615, as specified by -p in SHARED_PORT_ARGS, but anyone querying condor_config_val for SHARED_PORT_PORT gets the wrong port, 9618.

This affects any version running with HTCondor 8.8.10 or later.
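
The mismatch described above can be detected mechanically. The following is a minimal sketch, not part of the fix: it parses the text of a `condor_config_val -dump` and compares the port passed with -p in SHARED_PORT_ARGS against the expanded value of SHARED_PORT_PORT. The function names are hypothetical, the macro expansion only handles simple $(NAME) references, and the fallback COLLECTOR_PORT of 9618 is an assumption taken from the port reported in this ticket.

```python
import re

def parse_config_dump(text):
    """Parse 'NAME = value' lines of a config dump into a dict (keys upper-cased)."""
    config = {}
    for line in text.splitlines():
        if line.startswith("#") or "=" not in line:
            continue  # skip shell prompts, comments, and non-assignment lines
        name, _, value = line.partition("=")
        config[name.strip().upper()] = value.strip()
    return config

def port_from_args(args):
    """Return the port given with -p in SHARED_PORT_ARGS, or None if absent."""
    match = re.search(r"(?:^|\s)-p\s+(\d+)", args)
    return int(match.group(1)) if match else None

def expand(value, config, defaults, max_passes=10):
    """Expand simple $(NAME) references; unknown names are left untouched."""
    def lookup(m):
        name = m.group(1).upper()
        return config.get(name, defaults.get(name, m.group(0)))
    for _ in range(max_passes):  # bounded, so self-references cannot loop forever
        expanded = re.sub(r"\$\((\w+)\)", lookup, value)
        if expanded == value:
            break
        value = expanded
    return value

def shared_port_mismatch(dump_text, defaults=None):
    """Return (started_port, advertised_port, mismatch) for a config dump."""
    # Assumption: COLLECTOR_PORT falls back to 9618 when the dump does not define it.
    defaults = {"COLLECTOR_PORT": "9618"} if defaults is None else defaults
    config = parse_config_dump(dump_text)
    started = port_from_args(config.get("SHARED_PORT_ARGS", ""))
    raw = expand(config.get("SHARED_PORT_PORT", ""), config, defaults)
    advertised = int(raw) if raw.isdigit() else None
    mismatch = started is not None and advertised is not None and started != advertised
    return started, advertised, mismatch

if __name__ == "__main__":
    sample = "SHARED_PORT_ARGS = -p 9615\nSHARED_PORT_PORT = $(COLLECTOR_PORT)\n"
    print(shared_port_mismatch(sample))
```

On a Factory host, the saved output of `condor_config_val -dump` would take the place of the sample string.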

History

#1 Updated by Marco Mambelli about 1 month ago

  • Assignee set to Marco Mambelli

#2 Updated by Marco Mambelli about 1 month ago

  • Assignee changed from Marco Mambelli to Marco Mascheroni
  • Status changed from New to Feedback

Changes are in v36/25000.
Opened a ticket to consider moving the Factory collector to shared port. [#25001]

#3 Updated by Marco Mascheroni 22 days ago

  • Status changed from Feedback to Accepted

#4 Updated by Marco Mascheroni 22 days ago

  • Assignee changed from Marco Mascheroni to Marco Mambelli

#5 Updated by Marco Mambelli 22 days ago

  • Status changed from Accepted to Resolved

#6 Updated by Marco Mambelli 21 days ago

  • Status changed from Resolved to Closed

