Support #18869

Review Factory and Frontend tools, especially glidein_off and manual_glidein_submit.py

Added by Marco Mambelli almost 2 years ago. Updated about 1 year ago.

Status:
Closed
Priority:
High
Category:
-
Target version:
Start date:
02/01/2018
Due date:
% Done:

0%

Estimated time:
Stakeholders:

CMS, Factory Ops

Duration:

Description

Several tools have not been maintained in years.
They should be tested and, if still useful, updated as needed.
glidein_off does not seem to work correctly, and CMS attempted to use it.
The HTCondor team is interested in using manual_glidein_submit.py.
This ticket is for:
1. reviewing and updating glidein_off and manual_glidein_submit.py
2. if possible, surveying/testing the other tools (at least the ones recently requested by operators and the ones that are easy to test)


Related issues

Related to GlideinWMS - Feature #19946: Factory Operations suggestions summary | New | 05/14/2018

History

#1 Updated by Parag Mhashilkar over 1 year ago

  • Assignee set to Marco Mascheroni

#2 Updated by Marco Mambelli over 1 year ago

  • Stakeholders updated (diff)

#3 Updated by Parag Mhashilkar over 1 year ago

  • Stakeholders updated (diff)

#4 Updated by Marco Mambelli over 1 year ago

  • Target version changed from v3_2_22 to v3_2_23

#5 Updated by Marco Mambelli over 1 year ago

  • Target version changed from v3_2_23 to v3_4_0

#6 Updated by Marco Mascheroni over 1 year ago

This is a summary of what I found out when discussing which tools are used with factory ops:

  • entry_q, entry_rm and entry_ls are regularly used. The first two have a small customization to make them work without having to cd into the log directory. See the commit associated with the branch.
  • All the cat_* commands work, are routinely used by operations, and should be maintained in the future.
  • analyze_entries, analyze_frontend, and analyze_queues are commands that generate reports that are sent to frontend admins and factory operators. Are these reports actually used? I asked Diego and he does not care about the frontend report. The queues report is occasionally used by factory ops to check rundiff problems, while analyze_entries is the main report operators are supposed to check daily to find out if there are problems with sites. Jeff brought up something unclear related to multicore, but we have not been able to pinpoint the issue yet. Maybe I should spawn a separate issue? Anyway, I would say we should keep the three commands, and keep analyze_entries well maintained (should we write unit tests?). The other two are not that important, so I'd keep them around but only modify them if asked by the stakeholders.
  • These commands are not used at all and should probably be removed: find_ids_not_published (finds all the factory configuration entries that contain a ref id that is not published in the information system), find_matching_ids (finds all the configuration file entries whose ref ids match a published information system id but whose content differs), find_new_entries (finds all entries for a given information system that are not contained in the configuration), find_missing_ids (finds all the entries in the Factory configuration file that do not contain an information system id), and find_partial_matching_ids (finds all configuration file entries that have a partial match to the information system id)
    infosys_lib.py
    These are (failed) attempts to automate the creation of entries in the configuration. It turned out that adding an entry is a much more manual process.
  • find_startdLogs (prints out the StartdLogs for a certain date) and find_logs (finds the logs for a certain date) are broken. At a quick glance the fix seems easy, but I am not sure they are worth maintaining. Difficult to tell without knowing exactly what they do :)
  • manual_glidein_submit.py: Edgar tried to use it but it is broken. IMHO it is worth fixing and maintaining. It can simplify life during operations.
  • proxy_info: useless script now that factory operators have sudo access. I'd remove it.
  • configGUI: this program implements a GUI for the configuration of the glideinWMS factory (author Robert Chen). I did not try it, but I'd just remove it.
  • The 2to3 conversion scripts are hopefully deprecated and can be removed
  • gwms-logcat.sh: a wrapper on top of cat_* commands, not used by factory ops but maybe useful?
  • A series of tools in ./tools (not ./factory/tools) look promising, but I have not had a chance to test them yet (I could not figure out how they work, or they are broken)
  • There are some tools used by factory ops that maybe can be moved upstream (like Gget_wns, a tool that can be concatenated to cat_XMLResult to get the list of worker nodes affected, for example, by validation error)

I'd like to remove the tools that are not used, is it ok for everybody?
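
For reference, the comparisons that the find_* scripts listed above were meant to perform amount to simple set operations between the factory configuration and the information system. The sketch below is only an illustration, not the code of those scripts: the data layout (a dict mapping entry names to ref ids) and all entry and id values are hypothetical.

```python
# Hypothetical data: entry name -> ref id taken from the factory configuration
# (None when the entry carries no information system id).
config_entries = {
    "entry_a": "site-a-ce1",
    "entry_b": "site-b-ce1",
    "entry_c": None,
}

# Hypothetical set of ids currently published by the information system.
published_ids = {"site-a-ce1", "site-d-ce1"}


def ids_not_published(config, published):
    """Entries whose ref id is set but absent from the information system."""
    return sorted(name for name, ref in config.items()
                  if ref is not None and ref not in published)


def missing_ids(config):
    """Entries that carry no information system id at all."""
    return sorted(name for name, ref in config.items() if ref is None)


def new_entries(config, published):
    """Published ids that no configuration entry refers to."""
    configured = {ref for ref in config.values() if ref is not None}
    return sorted(published - configured)


print(ids_not_published(config_entries, published_ids))  # ['entry_b']
print(missing_ids(config_entries))                       # ['entry_c']
print(new_entries(config_entries, published_ids))        # ['site-d-ce1']
```

The matching and partial-matching variants would additionally compare the content of each entry, which is where the manual judgment mentioned above comes in.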

#7 Updated by Marco Mambelli over 1 year ago

  • Target version changed from v3_4_0 to v3_4_1

#8 Updated by Marco Mambelli over 1 year ago

  • Priority changed from Normal to High

#9 Updated by Marco Mascheroni over 1 year ago

  • Status changed from New to Feedback
  • Assignee changed from Marco Mascheroni to Marco Mambelli

#10 Updated by Marco Mambelli over 1 year ago

  • Assignee changed from Marco Mambelli to Marco Mascheroni

#11 Updated by Marco Mascheroni over 1 year ago

  • Aligned entry_* versions to the ones used in production factories
  • Removed: configGUI.py, convert_factory_2to3.sh, convert_factory_2to3.xslt, convert_factory_rrds_2to3.sh, find_ids_not_published.py, find_matching_ids.py, find_missing_ids.py, find_new_entries.py, find_partial_matching_ids.py, infosys_lib.py, proxy_info, convert_frontend_2to3.sh, convert_frontend_2to3.xslt
  • Completely rewrote manual_glidein_submit.py (it no longer uses an .ini file but takes the necessary information from the glideclient classad)

#12 Updated by Marco Mascheroni over 1 year ago

  • Status changed from Feedback to Resolved

#13 Updated by Marco Mascheroni over 1 year ago

  • Related to Feature #19946: Factory Operations suggestions summary added

#14 Updated by Marco Mambelli about 1 year ago

  • Status changed from Resolved to Closed

