Blackhole detection counts all failed jobs, not recently failed jobs...
Add a recently_failed list, plus code that checks each job's end date to
decide whether the job failed in the last hour. The length of that list is
then used to decide whether there is a job black hole.
That way, jobs that failed a while ago but that we are only now finding out
about are not counted. This is what happens when SAM failed last night and
things are just getting running again.
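
A rough sketch of the check described above, for illustration only. The job
dictionary layout ("status", "end_date"), the function names, and the
threshold value are assumptions, not the actual code in this change; only the
recently_failed list and the one-hour window come from the description.

```python
from datetime import datetime, timedelta

# Assumed values: the one-hour window is from the description above,
# the threshold of 5 is purely illustrative.
RECENT_WINDOW = timedelta(hours=1)
BLACKHOLE_THRESHOLD = 5


def recently_failed_jobs(jobs, now, window=RECENT_WINDOW):
    """Return failed jobs whose end date lies within `window` before `now`."""
    recently_failed = []
    for job in jobs:
        if job.get("status") != "failed":
            continue
        end = job.get("end_date")
        # Jobs with no end date, or that ended before the window, are skipped,
        # so old failures reported late do not count.
        if end is not None and now - window <= end <= now:
            recently_failed.append(job)
    return recently_failed


def is_black_hole(jobs, now, threshold=BLACKHOLE_THRESHOLD):
    """Flag a black hole based on the number of recent failures only."""
    return len(recently_failed_jobs(jobs, now)) >= threshold


if __name__ == "__main__":
    now = datetime(2020, 1, 1, 12, 0)
    jobs = [
        # Failed overnight, but only reported now: should be ignored.
        {"status": "failed", "end_date": now - timedelta(hours=12)},
        # Failed ten minutes ago: should be counted.
        {"status": "failed", "end_date": now - timedelta(minutes=10)},
    ]
    print(len(recently_failed_jobs(jobs, now)))
```

Passing `now` in explicitly, rather than calling datetime.now() inside the
function, keeps the check easy to test against a backlog of old failures.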