MINOS CONDOR AND GLIDEINWMS ADMINISTRATION
- Table of contents
- MINOS CONDOR AND GLIDEINWMS ADMINISTRATION
- HOWTO set priority factors
- Stopping(Draining) the Queues
- Resuming the Queues
- Checking/Clearing Held Glideins
Minos uses a local condor pool both to run batch slots on minos50/51/52/53
and as a gateway to the larger FermiGrid and OSG via glideinWMS.
The CS/SCD/SCS/GCSO FermiGrid group supports these systems.
Minos has limited administrative access for operations.
- Minos25 is the submit machine running the condor scheduler (schedd). Minos54 runs glideinWMS 2.5.2 and its accompanying condor daemons to enable batch submission to the grid, usually to the fermigrid-osg general purpose pool, sometimes to the fermigrid cdf pool. Both machines are currently at condor 7.4.4. Minos50-53 are attached as a local batch pool running 32 slots.
- here is the crontab for gfactory@minos54
[gfactory@minos54 ~]$ crontab -l
MAILTO='firstname.lastname@example.org'
@reboot /home/gfactory/start_glideinWMS.sh
55 5 * * * /home/gfactory/refresh_cert
- Minos54: log in as 'gfactory'; the start_glideinWMS.sh and stop_glideinWMS.sh scripts are in the home directory
- the web server on minos54 has to be running as well; run sudo /etc/init.d/httpd start if necessary
- Minos25, minos50-53: log in and run sudo /etc/init.d/condor start (or stop)
Stopping(Draining) the Queues
- to prevent new jobs from starting, run condor_off -peaceful. Running jobs will continue until they finish, but the node will not accept new jobs.
[dbox@minos25 ~]$ . /opt/condor/condor.sh
[dbox@minos25 ~]$ sudo /opt/condor/sbin/condor_off -peaceful
- run the stop_factory.sh script as uid gfactory on minos54.
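While a peaceful drain is in progress it can help to see how many jobs are still running. The following is a minimal sketch, not part of the Minos tooling: running_jobs is a hypothetical helper name, and it assumes the condor environment has already been set up with . /opt/condor/condor.sh on minos25 and that the schedd is still answering condor_q. It relies only on the standard JobStatus job attribute, where 2 means Running.

```shell
# Sketch only: count the jobs the local schedd still reports as running.
# running_jobs is a hypothetical helper name, not an existing Minos script.
# JobStatus == 2 means Running in the condor job ClassAd.
running_jobs() {
    # Print one line per running job, then count the lines.
    # tr strips the padding that some wc implementations add.
    condor_q -constraint 'JobStatus == 2' -format '%d\n' ClusterId \
        | wc -l | tr -d ' '
}
```

Calling running_jobs on minos25 prints a single number; when it reaches 0, no running jobs are left in the queue.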
Resuming the Queues
- run stop_glideinWMS.sh on minos54 (as uid gfactory)
- sudo /etc/init.d/condor stop on minos25 (as a user in sudoers)
- run start_glideinWMS.sh on minos54 (as uid gfactory)
- sudo /etc/init.d/condor start on minos25 (as a user in sudoers)
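The four steps above can be sketched as a single script run from minos25. This is an illustration under assumptions, not an existing Minos script: it assumes the caller is in sudoers on minos25, can ssh to gfactory@minos54, and that the glideinWMS scripts sit directly in the gfactory home directory. The names run, restart_queues, and DRY_RUN are hypothetical. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: the restart sequence from the list above, run from minos25.
# DRY_RUN=1 (the default) prints each command instead of executing it;
# set DRY_RUN=0 to actually run the commands.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print or execute a single command.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Hypothetical helper: perform the steps in the documented order,
# stopping at the first step that fails.
restart_queues() {
    run ssh gfactory@minos54 ./stop_glideinWMS.sh &&
    run sudo /etc/init.d/condor stop &&
    run ssh gfactory@minos54 ./start_glideinWMS.sh &&
    run sudo /etc/init.d/condor start
}

restart_queues
```

Chaining the steps with && keeps a failed stop from being followed by a start, which seems the safer default for a sequence that spans two machines.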
Checking/Clearing Held Glideins
- Glideins report to their own condor collector (called the WMSCollector) on minos54. A regular condor_q from minos25 will not see them. To use condor tools on the WMSCollector:
- ssh gfactory@minos54
- source working/v2_5_2/wmscollectorcondor/condor.sh
- condor_q, condor_status -any, etc
- this is where you can condor_rm a misbehaving glidein
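As a rough sketch of the check described above, the helpers below list held glideins and print the matching condor_rm commands for review. They are meant to be run as gfactory on minos54 after sourcing working/v2_5_2/wmscollectorcondor/condor.sh. The function names held_glideins and print_rm_commands are hypothetical; the only condor facts relied on are the -constraint and -format options of condor_q and the JobStatus value 5 for Held.

```shell
# Sketch only: helpers for inspecting held glideins on the WMSCollector.
# held_glideins and print_rm_commands are hypothetical names, not part
# of glideinWMS or condor.

# Print one "cluster.proc" id per held job (JobStatus == 5 means Held).
held_glideins() {
    condor_q -constraint 'JobStatus == 5' \
        -format '%d.' ClusterId -format '%d\n' ProcId
}

# Print a condor_rm command for each held glidein. Nothing is removed
# here; the printed lines are meant to be reviewed and run by hand.
print_rm_commands() {
    held_glideins | while read -r id; do
        echo "condor_rm $id"
    done
}
```

Printing the commands rather than running them leaves the decision about which glidein is actually misbehaving with the operator.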
Minos Batch/Condor requirements per FIFE survey request of May 2013