Bug #24516

Confusing verification script

Added by Marco Mambelli 4 months ago. Updated 11 days ago.

Mirica installed a new factory 3.6.2 from scratch.
Condor was not starting and she got this error.
The error is misleading at best.

[root@fermicloud044 condor]# systemctl status gwms-factory.service
● gwms-factory.service - GWMS Factory Service
   Loaded: loaded (/usr/lib/systemd/system/gwms-factory.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2020-06-04 18:09:21 CDT; 16s ago
  Process: 4898 ExecStart=/usr/sbin/gwms-factory start --check_35_ready (code=exited, status=150)
Jun 04 18:09:21 gwms-factory[4898]: RET = main()  # capital letters used because pylint considers this a constant
Jun 04 18:09:21 gwms-factory[4898]: File "/usr/bin/fact_chown_check", line 52, in main
Jun 04 18:09:21 gwms-factory[4898]: coll_query = htcondor.Collector().locateAll(htcondor.DaemonTypes.Schedd)
Jun 04 18:09:21 gwms-factory[4898]: IOError: Failed communication with collector.
Jun 04 18:09:21 gwms-factory[4898]: The Factory is not ready for 3.5.x. Please run /usr/bin/fact_chown_check --verbo...tails.
Jun 04 18:09:21 systemd[1]: gwms-factory.service: control process exited, code=exited status=150
Jun 04 18:09:21 gwms-factory[4898]: To disable this check remove the --check_35_ready option from the gwms-factory.s...AILED]
Jun 04 18:09:21 systemd[1]: Failed to start GWMS Factory Service.
Jun 04 18:09:21 systemd[1]: Unit gwms-factory.service entered failed state.
Jun 04 18:09:21 systemd[1]: gwms-factory.service failed.
Hint: Some lines were ellipsized, use -l to show in full.
[root@fermicloud044 condor]#

A brand new install should never get this error.
In addition, fact_chown_check does not appear to be documented (I was unable to find any documentation), and when run as root it printed:

Directory /var/log/gwms-factory/client/user_frontend/glidein_gfactory_instance is owned by user with id 43680, while the user running this process is 0
Please, make sure to run the fact_chown script. More details at

And when run as gfactory:
-bash-4.2$ fact_chown_check
Traceback (most recent call last):
  File "/usr/kerberos/bin/fact_chown_check", line 100, in <module>
    RET = main()  # capital letters used because pylint considers this a constant
  File "/usr/kerberos/bin/fact_chown_check", line 52, in main
    coll_query = htcondor.Collector().locateAll(htcondor.DaemonTypes.Schedd)
IOError: Failed communication with collector.

The stack trace is confusing. A message saying that fact_chown_check needs Condor to be running would be more operator-friendly.
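
One possible way to get an operator-friendly message is to catch the IOError around the collector query and print a short explanation instead of the traceback. The following is only a minimal sketch, not the actual fact_chown_check code: the names FRIENDLY_MESSAGE, query_schedds and check, the message wording, and the return code 1 are all hypothetical. The collector query is passed in as a callable so that the sketch can run without HTCondor installed. It also assumes the failure surfaces as IOError (or a subclass), as in the traceback above.

```python
import sys

# Hypothetical wording; the real script may prefer a different message.
FRIENDLY_MESSAGE = (
    "Unable to contact the HTCondor collector. "
    "Make sure HTCondor is running before running this check."
)


def query_schedds(locate_all):
    """Call locate_all() and return (schedd_ads, error_message).

    locate_all is any zero-argument callable that performs the collector
    query. On IOError the ads are None and error_message explains the
    failure, keeping the original error text for reference.
    """
    try:
        return locate_all(), None
    except IOError as err:
        return None, "%s (%s)" % (FRIENDLY_MESSAGE, err)


def check(locate_all):
    """Return 0 if the collector answered, 1 (hypothetical code) otherwise."""
    _ads, error = query_schedds(locate_all)
    if error is not None:
        sys.stderr.write(error + "\n")
        return 1
    return 0
```

In the real script the callable would wrap the query shown in the traceback, for example `check(lambda: htcondor.Collector().locateAll(htcondor.DaemonTypes.Schedd))`, after `import htcondor`.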


#1 Updated by Marco Mambelli 4 months ago

  • Description updated (diff)

#2 Updated by Marco Mambelli 11 days ago

  • Target version changed from v3_6_4 to v3_6_5
