analyze_entries reporting on stale data
Bug report from Daniel Klein:
Hi glideinWMS team,
I've discovered what appears to be a bug in the analyze_entries tool. The gist is that analyze_entries continues to report validation failures for an entry that was put in downtime. The full narrative goes as follows:
A particular site was consistently giving us 100% validation failures, and the site admins told us the problem wouldn't be fixed for a long while, so I put the site in downtime with no end date. I've verified the site is in downtime 4 different ways:
1) It appears in the glideinWMS.downtimes file.
2) When I run: factory_startup statusdown -entry OSG_US_Buffalo_u2-grid, it reports the entry as "down".
3) The web monitoring for the site hasn't updated since 4/16 (the date I put it in downtime).
4) In the factory directory for this entry, no file has been updated since 4/16.
However, this entry continues to appear in the daily analyze_entries summary email we receive from the factory. When we run analyze_entries manually on the factory, we do indeed see:
$ analyze_entries -x 24 -s waste
OSG_US_Buffalo_u2-grid 100% 100% 100% | 100% 0% 100% 100% | 48 48 | 144
This exact same line has appeared in the analyze_entries output since the downtime date. By "exact same" I mean 100% identical: the numbers displayed for this entry haven't changed since I put it down. We suspect that analyze_entries is simply reading the 24 most recent hours of logs that are available, rather than the logs written during the last 24 hours of wall-clock time, so it continues to report the state of this entry for the 24 hours prior to my putting it in downtime.
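
To make the suspected behavior concrete, here is a minimal Python sketch of the two ways a "last 24 hours" window could be chosen. This is not the analyze_entries code: the function names, the hourly sample records, and the year in the timestamps are all hypothetical, chosen only to illustrate the hypothesis above.

```python
from datetime import datetime, timedelta

def window_from_latest(timestamps, hours):
    """Anchor the window at the newest record available (the suspected behavior)."""
    if not timestamps:
        return []
    cutoff = max(timestamps) - timedelta(hours=hours)
    return [t for t in timestamps if t >= cutoff]

def window_from_now(timestamps, hours, now):
    """Anchor the window at the current time (what a downtime-aware report would need)."""
    cutoff = now - timedelta(hours=hours)
    return [t for t in timestamps if cutoff <= t <= now]

# Hypothetical data: one record per hour for the 48 hours before the
# downtime began, and nothing afterwards. The year is an arbitrary placeholder.
downtime_start = datetime(2000, 4, 16, 12, 0)
records = [downtime_start - timedelta(hours=h) for h in range(1, 49)]

# Pretend the report is generated ten days into the downtime.
now = downtime_start + timedelta(days=10)

stale = window_from_latest(records, 24)    # still finds pre-downtime records
fresh = window_from_now(records, 24, now)  # finds nothing, as expected

print(len(stale), len(fresh))
```

If the tool behaves like window_from_latest, an entry whose logs stop at the downtime date would keep producing the same report indefinitely, which matches what we observe. Anchoring the window at the current time would instead return no records for that entry.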