Support an option for a wall clock time limit for the EmptyEvent source
The HPC resources used by Mu2e use whole-node scheduling: we get the whole node for the requested time or until we are finished with it, whichever is earlier. On a typical KNL node we run something like 32 processes with 8 threads each, or some other variation that adds up to 256 threads.
Our big CPU driver is stage 1 MC jobs that use the EmptyEvent source. Currently we submit these jobs requesting a fixed number of events in each job. There is a big dispersion of execution times for jobs with a fixed number of events. Suppose we submit jobs that we expect will have a mean duration of 4 hours and a tail to 8 hours. On a typical node the first process might end after 3 hours and the last after 6 hours, so that one early finisher alone has wasted 1/64 of the allocation (1 of 32 processes idle for half of the overall time).
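The 1/64 figure can be checked with a short calculation. The sketch below is only an illustration of the arithmetic; the function name idleFraction is made up here, and it assumes the node is held until the slowest process exits.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Fraction of a whole-node allocation left idle when the processes on the
// node finish at different times. Assumption: the node is charged until the
// slowest process exits, and every process occupies an equal share of it.
double idleFraction(const std::vector<double>& endHours) {
  assert(!endHours.empty());
  const double nodeHours = *std::max_element(endHours.begin(), endHours.end());
  double idleHours = 0.0;
  for (double t : endHours) {
    idleHours += nodeHours - t;  // time this process slot sits unused
  }
  return idleHours / (nodeHours * static_cast<double>(endHours.size()));
}
```

For 32 processes where one ends at 3 hours and the other 31 at 6 hours, this gives 3 / (32 * 6) = 1/64. Passing a realistic spread of end times for all 32 processes shows how quickly the idle fraction grows.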
Over an ensemble of jobs, I bet this averages to about 25% of the total available cycles. Chris Jones has told me that CMS is getting pushback from the HPC centers about this.
For any job that uses EmptyEvent, we can choose a different strategy. We can tell the job to run for a fixed time. For example, we might submit jobs with a time limit of 6 hours and tell art to stop processing events after 5:30 or 5:45. We are not charged for the time that is left over after the last process exits, so it's not critical to do a detailed optimization of this backoff.
We request that art provide an option on EmptyEvent to tell the job to run until it has used a fixed amount of wall clock time. If it is too expensive to check the elapsed wall clock time after every event, then please provide an option to check the elapsed time every N events. We prefer that it be a configuration error to provide both a maximum wall clock time and a maximum number of events. For jobs that use EmptyEvent this would mean only a modest change in our workflow management and bookkeeping.
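To make the request concrete, here is a minimal sketch of a time-limited event loop that reads the clock only every N events. It is not art code: runUntilTimeLimit, checkInterval, and the injectable now callback are all hypothetical names chosen for this illustration.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>

using Clock = std::chrono::steady_clock;

// Hypothetical helper, not part of art: keep generating events until `limit`
// of wall clock time has elapsed, reading the clock only once every
// `checkInterval` events. `now` is injectable so the logic can be exercised
// with a fake clock. Returns the number of events processed.
std::size_t runUntilTimeLimit(
    std::chrono::seconds limit,
    std::size_t checkInterval,
    const std::function<void()>& processEvent,
    const std::function<Clock::time_point()>& now = [] { return Clock::now(); }) {
  assert(checkInterval > 0);
  const auto start = now();
  std::size_t nEvents = 0;
  while (true) {
    // Only pay for a clock read on every checkInterval-th event.
    if (nEvents % checkInterval == 0 && now() - start >= limit) {
      break;
    }
    processEvent();  // an event that has started always runs to completion
    ++nEvents;
  }
  return nEvents;
}
```

With a larger checkInterval the clock is read less often, at the cost of overshooting the limit by up to checkInterval - 1 additional events.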
At this time we are not interested in this feature for RootInput, since that would require a very intrusive change in our workflow management and we do not currently need it. It's possible that we might request this feature in RootInput at a later date, but I hope not.
#5 Updated by Kyle Knoepfel 3 months ago
- % Done changed from 0 to 100
- Assignee set to Kyle Knoepfel
- Status changed from Accepted to Resolved
- Category set to Infrastructure
- SSI Package art added
This feature has been implemented with commit art:13eea62b. An additional parameter called maxTime has been added to the EmptyEvent configuration. Per stakeholder discussion, the maxTime parameter cannot be used with either maxEvents or maxSubRuns. Attempting to do so will result in a job-ending exception throw.
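The mutual-exclusion rule can be sketched as a check at configuration time. This is an illustration only, not the code from the commit: SourceLimits and validate are hypothetical stand-ins, and the exception type is an assumption (art uses its own exception classes).

```cpp
#include <optional>
#include <stdexcept>

// Hypothetical stand-in for the source's limit settings; not art's
// configuration type. An empty optional means "parameter not specified".
struct SourceLimits {
  std::optional<long> maxEvents;
  std::optional<long> maxSubRuns;
  std::optional<double> maxTimeSeconds;
};

// Reject configurations that combine a wall clock limit with a count limit.
// Assumption: std::invalid_argument stands in for the framework's exception.
void validate(const SourceLimits& limits) {
  if (limits.maxTimeSeconds && (limits.maxEvents || limits.maxSubRuns)) {
    throw std::invalid_argument(
        "maxTime cannot be combined with an event or subrun count limit");
  }
}
```

Failing at configuration time means a conflicting job stops before any allocation time is spent generating events.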
The maxTime value corresponds to the maximum number of seconds the EmptyEvent source is allowed to construct new events, subruns, or runs. The clock begins at EmptyEvent construction time, not at the beginning of event processing. Although this difference may not be significant in most cases, it is one to be aware of, especially since the time report at the end of each art job corresponds to the execution of the event loop.
Note also that although new events will not be created after the maxTime value has been exceeded, the processing of the last event will continue uninhibited. This means that jobs will likely take longer than the maxTime value specified.
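Because the last event runs to completion, the maxTime value should sit below the scheduler's limit by some backoff. The sketch below shows one way to pick it. The helper chooseMaxTime and its parameters are hypothetical, and the estimates for the longest event and the margin are assumptions that each workflow would have to supply.

```cpp
#include <chrono>
#include <stdexcept>

// Hypothetical helper: pick a maxTime (in seconds) below the scheduler's
// wall clock limit. Assumptions: `longestEvent` bounds how long the final
// event may keep running after the limit is reached, and `margin` covers
// everything outside event generation (startup, output closing, and so on).
std::chrono::seconds chooseMaxTime(std::chrono::seconds jobLimit,
                                   std::chrono::seconds longestEvent,
                                   std::chrono::seconds margin) {
  const auto backoff = longestEvent + margin;
  if (backoff >= jobLimit) {
    throw std::invalid_argument("backoff leaves no time to generate events");
  }
  return jobLimit - backoff;
}
```

For a 6 hour job limit with an assumed 20 minute longest event and a 10 minute margin, this yields 19800 seconds, which matches the 5:30 stop time suggested in the original request.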