Feature #21741

Improve monitoring stats in glidefactory and glidefactorystatus classads

Added by Marco Mambelli 7 months ago. Updated 4 days ago.

glidefactory and glidefactorystatus ClassAds contain monitoring information coming from the Factory schedd (condor_q query).
This is stored as:
GlideinMonitorStatus... (in GFC, the glidefactoryclient ClassAd, for the specific client)
GlideinMonitorTotalStatus... (in GF, the glidefactory ClassAd, summary for the entry)

When there is interaction w/ clients, the partial stats are calculated via subquery and the total is calculated by summing the partial stats.
When there is no interaction w/ clients, the total is calculated from the list of jobs returned by condor_q for the entry (condorQ), see [#21525]

It may be convenient to calculate all the partial and total stats by running once through the list of all the glideins in the entry, doing all the calculations in a single pass.
This way all the monitoring info will be fresh and evaluated the same way.
Furthermore, the current method may leave some stale info if only some clients are interacting w/ one entry.
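
A minimal sketch of the single-pass idea, not the Factory code. It assumes the glidein job ClassAds are available as plain dicts and that the client name is stored in the job itself; the attribute name `ClientName` and the function `single_pass_stats` are placeholders invented for illustration. Only the `JobStatus` codes (1 Idle, 2 Running, 5 Held) are standard HTCondor values.

```python
from collections import Counter, defaultdict

# Standard HTCondor JobStatus codes: 1 = Idle, 2 = Running, 5 = Held
STATUS_NAMES = {1: "Idle", 2: "Running", 5: "Held"}


def single_pass_stats(jobs, client_attr="ClientName"):
    """Return (per_client, total) status counts from one pass over the jobs.

    jobs: iterable of dicts standing in for glidein job ClassAds.
    client_attr: hypothetical attribute holding the client name.
    """
    per_client = defaultdict(Counter)
    total = Counter()
    for job in jobs:
        status = STATUS_NAMES.get(job.get("JobStatus"), "Other")
        client = job.get(client_attr, "unknown")
        # Partial and total counters are updated from the same job record,
        # so they are always consistent with each other.
        per_client[client][status] += 1
        total[status] += 1
    return dict(per_client), total


jobs = [
    {"ClientName": "fe_a", "JobStatus": 1},
    {"ClientName": "fe_a", "JobStatus": 2},
    {"ClientName": "fe_b", "JobStatus": 2},
    {"JobStatus": 5},  # no client attribute: counted under "unknown"
]
per_client, total = single_pass_stats(jobs)
```

Because every client seen in the job list gets an entry, clients that are not currently interacting w/ the entry would still get fresh partial stats.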

Some considerations before implementing:
- consider if the client name is all in the job (glidein) classad, without the need to check glidefactoryclient classads
- consider if the information is used within the same process
- evaluate the use of parallel workers
- think about the memory footprint
- do a benchmark to compare performance:
  - trigger 1000 or more glideins, store the list of classads (will be useful also for unittests)
  - calculate the stats w/ subqueries + total
  - calculate all the stats in the new way
  - compare memory usage and time
- evaluate the checks on the client names
- pay attention to the 2 stats dictionaries: client_stats (w/ client_int_name) and qc_stats (w/ client_log_name)
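
The benchmark could be sketched roughly as follows. This is an illustration on synthetic data, not a measurement of the Factory: the job dicts, the `ClientName` attribute, and all function names are hypothetical, and a real run would load the stored list of ClassAds instead of calling `make_jobs`. Time is taken w/ `time.perf_counter` and peak memory w/ `tracemalloc`, both from the Python standard library.

```python
import random
import time
import tracemalloc
from collections import Counter, defaultdict


def make_jobs(n, clients, seed=0):
    """Build n synthetic job records (stand-ins for stored glidein ClassAds)."""
    rng = random.Random(seed)
    return [
        {"ClientName": rng.choice(clients), "JobStatus": rng.choice([1, 2, 5])}
        for _ in range(n)
    ]


def stats_subqueries(jobs, clients):
    """Current style: one filtered pass per client, then sum for the total."""
    per_client = {
        c: Counter(j["JobStatus"] for j in jobs if j["ClientName"] == c)
        for c in clients
    }
    total = Counter()
    for counts in per_client.values():
        total.update(counts)
    return per_client, total


def stats_single_pass(jobs):
    """Proposed style: partial and total counts in one pass over the jobs."""
    per_client = defaultdict(Counter)
    total = Counter()
    for j in jobs:
        per_client[j["ClientName"]][j["JobStatus"]] += 1
        total[j["JobStatus"]] += 1
    return dict(per_client), total


def measure(func, *args):
    """Return (result, elapsed seconds, peak traced bytes) for one call."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


clients = ["fe_%d" % i for i in range(20)]
jobs = make_jobs(1000, clients)
old, old_time, old_peak = measure(stats_subqueries, jobs, clients)
new, new_time, new_peak = measure(stats_single_pass, jobs)
print("subqueries:  %.6f s, peak %d bytes" % (old_time, old_peak))
print("single pass: %.6f s, peak %d bytes" % (new_time, new_peak))
```

Note that `tracemalloc` slows execution down, so timings taken this way are only comparable to each other; for absolute numbers the time and memory measurements are better done in separate runs.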

Related issues

Related to glideinWMS - Feature #22163: Check if there are load changes in Factory and solve TODOs added in #21880 (New, 2019-03-19)


#1 Updated by Marco Mambelli 5 months ago

Check changes done in [#21880] and scheduled for [#21741]

#2 Updated by Marco Mambelli 5 months ago

  • Related to Feature #22163: Check if there are load changes in Factory and solve TODOs added in #21880 added

#3 Updated by Marco Mambelli 4 months ago

  • Target version changed from v3_5 to v3_5_1

#4 Updated by Marco Mambelli 4 days ago

  • Target version changed from v3_5_1 to v3_5_2
