Factory monitoring to display sites with issues
We need an easy way to identify the sites that have held glideins. Currently, a single site can have multiple entries in the factory config. We need a mechanism to aggregate all the held glideins across the entries associated with a site and, if the number is above a certain threshold, flag the site in the monitoring.
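A minimal sketch of the aggregation described above, assuming each entry can be reduced to an (entry name, site name, held glidein count) tuple. The function name `flag_sites`, the tuple layout, and the threshold value are hypothetical and are not taken from the factory code.

```python
from collections import defaultdict


def flag_sites(entries, threshold):
    """Return {site: total held glideins} for sites above the threshold.

    `entries` is an iterable of (entry_name, site_name, held_count) tuples.
    This layout is an assumption made for illustration only.
    """
    held_per_site = defaultdict(int)
    for _entry_name, site_name, held_count in entries:
        # Sum held glideins over every entry that maps to the same site.
        held_per_site[site_name] += held_count
    # Keep only the sites whose aggregated count exceeds the threshold.
    return {site: held for site, held in held_per_site.items() if held > threshold}


# Hypothetical example: two entries point at SiteA, one at SiteB.
example_entries = [
    ("entry_a1", "SiteA", 3),
    ("entry_a2", "SiteA", 4),
    ("entry_b1", "SiteB", 2),
]
flagged = flag_sites(example_entries, threshold=5)
```

With these made-up numbers only SiteA is flagged, because neither of its entries exceeds the threshold on its own but their sum does.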
#2 Updated by Igor Sfiligoi over 7 years ago
I guess this is related to one of my requests... so let me correct it, as it is not what I meant. The above description mixes two problems:
- Aggregation of information from multiple entries going into a single (logical) site;
for this one I would like to have the full-blown "standard monitoring" web pages
- The current impossibility of monitoring the scale of problems we have in the factory;
for this one I would like to have a counter that tells me how many of the entries are problematic/broken,
and an entry hitting the held limit is just one of the conditions I would like to have monitored this way
I leave it up to Parag to decide how to properly split the issue into two separate tickets.
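The counter of problematic entries requested in the second item could look roughly like the sketch below. It assumes each entry is a dict with `held` and `held_limit` keys and that each "problem" condition is a plain predicate. All of these names are hypothetical, and hitting the held limit is shown as just one condition among possibly several.

```python
def hit_held_limit(entry):
    """One possible condition: the entry has reached its held limit."""
    return entry["held"] >= entry["held_limit"]


def count_problematic(entries, conditions):
    """Count entries for which at least one condition holds.

    `entries` is a list of dicts and `conditions` a list of predicates.
    Both shapes are assumptions made for illustration.
    """
    return sum(1 for entry in entries if any(cond(entry) for cond in conditions))


# Hypothetical example data, not real factory output.
example_entries = [
    {"name": "entry_1", "held": 10, "held_limit": 10},
    {"name": "entry_2", "held": 1, "held_limit": 10},
    {"name": "entry_3", "held": 12, "held_limit": 10},
]
problematic = count_problematic(example_entries, [hit_held_limit])
```

Further conditions can be added to the list without changing the counter itself.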