Feature #22848

Improve Glidein monitoring and troubleshooting

Added by Marco Mambelli 4 months ago. Updated 16 days ago.

Enhance the monitoring of Glideins in GlideinWMS. Glideins send information back to the Factories that is essential for troubleshooting problems and understanding the performance of the system. Easy access to this information allows service operators to efficiently manage hundreds of thousands of resources on the Grid and in the Cloud. Secure access avoids disclosing experiment information to unauthorized users.
This activity includes devising the best framework to export and present, securely and efficiently, the log files and statistics provided by the Glideins, and developing a secure user interface and a REST API to present the information.

Projects within the scope of this position include participation in the following activities.
  • Evaluation, system integration and development using open source Web frameworks
    • Should only log files be served?
    • What is a fast and efficient storage for the log files? Should they be compressed on the fly?
    • What should happen on the client side and what on the server side (e.g. decompression and HTCondor log extraction)?
    • Who can access the logs? How are users authenticated?
    • Plan a RESTful API
  • Web development related to HTML, CSS, and JavaScript
  • Developments related to distributed computing software for Grids, Clouds and Supercomputers
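One possible answer to the compression and client/server questions above is sketched below: compress each log once at storage time, then send the stored bytes unchanged to clients that accept gzip and decompress on the server only for the others. This is a minimal illustration using only the Python standard library. The function names store_log and serve_log are hypothetical and are not part of GlideinWMS, and the Accept-Encoding handling is simplified (q-values are ignored).

```python
import gzip


def store_log(text: str) -> bytes:
    """Compress a Glidein log once, at storage time (hypothetical helper)."""
    return gzip.compress(text.encode("utf-8"))


def serve_log(stored: bytes, accept_encoding: str = ""):
    """Return (headers, body) for a stored, gzip-compressed log.

    If the client lists gzip in Accept-Encoding, the stored bytes are sent
    as they are and the client decompresses them. Otherwise the server
    decompresses before sending. Simplification: q-values are ignored.
    """
    # Reduce "gzip;q=0.8, deflate" to the set {"gzip", "deflate"}.
    encodings = {
        item.split(";")[0].strip().lower()
        for item in accept_encoding.split(",")
    }
    headers = {"Content-Type": "text/plain; charset=utf-8"}
    if "gzip" in encodings:
        headers["Content-Encoding"] = "gzip"
        return headers, stored
    return headers, gzip.decompress(stored)
```

With this split, the server does no per-request compression work for browsers and most HTTP clients, which generally advertise gzip support. HTCondor log extraction could then happen on either side, which the sketch leaves open.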
SOA (State of the Art) survey.
Factory operations already has some efforts under way that should be included or coordinated with. ATLAS provides some similar features in its PanDA monitoring (see also the attached files):
tmp190630-ATLAS PanDA Monitoring-logs.pdf (57.1 KB) Marco Mambelli, 07/01/2019 12:14 AM
tmp190630-ATLAS PanDA jobs.pdf (4 MB) Marco Mambelli, 07/01/2019 12:14 AM
tmp190630-PanDA job 4401159482.pdf (354 KB) Marco Mambelli, 07/01/2019 12:14 AM

Related issues

Related to GlideinWMS - Milestone #22673: Summer interns 2019 (New, 06/03/2019 to 09/30/2019)

Blocked by GlideinWMS - Feature #22866: Create dynamic pages serving the Glideins stdout, stderr and included content (New, 07/04/2019)


#1 Updated by Marco Mambelli 4 months ago

#2 Updated by Marco Mambelli 4 months ago

  • Blocked by Feature #22866: Create dynamic pages serving the Glideins stdout, stderr and included content added

#3 Updated by Marco Mambelli 16 days ago

  • Target version changed from v3_5_x to v3_6_1
