Feature #8593: Improve flexibility of output file handling
Limit output file sizes?
Earlier this week one of our people accidentally stuck an extra 0 onto the number of events to do in a grid job. The result was that he blew out the disk space on a few worker nodes, crashing his jobs and some MINERVA jobs (and possibly other people's). At low priority, could the output modules and the TFileService together maintain an approximate running total of the size of all framework-managed output files and shut down gracefully when a given limit is exceeded? My suggestion is that the default limit be set to what is right for Fermigrid; I would allow anyone to change the limit up or down, but the default should be to play nice with the grid.
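To make the request concrete, here is a minimal sketch of the bookkeeping being asked for: keep a list of output files, sum their on-disk sizes, and signal when a limit is passed. It is written in Python for brevity and is not art code; the names `OutputSizeMonitor`, `OutputSizeLimitExceeded`, `track` and `check` are hypothetical, and the total is only approximate because buffered data that has not been flushed is not counted.

```python
import os


class OutputSizeLimitExceeded(RuntimeError):
    """Raised when the tracked output files together exceed the limit."""


class OutputSizeMonitor:
    """Hypothetical helper: approximate running total of output file sizes."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.paths = []

    def track(self, path):
        # Register an output file (it may not exist on disk yet).
        self.paths.append(path)

    def total_bytes(self):
        # On-disk sizes only; unflushed buffers are not included,
        # so this is an approximation of the true output size.
        return sum(os.path.getsize(p) for p in self.paths if os.path.exists(p))

    def check(self):
        # Return the current total, or raise if the limit is exceeded.
        total = self.total_bytes()
        if total > self.limit_bytes:
            raise OutputSizeLimitExceeded(
                f"output size {total} B exceeds limit {self.limit_bytes} B"
            )
        return total
```

An event loop could call `check()` every so many events and, on `OutputSizeLimitExceeded`, close its files and exit cleanly rather than running until the disk fills. The limit is a constructor argument, so a site default could be overridden up or down per job.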
#7 Updated by Christopher Backhouse over 4 years ago
I would certainly be very upset if a long-running interactive job suddenly announced that its output had become too big and quit.
I would argue that such things should be handled by the batch system. Why can't it enforce disk quotas in some way? Either by polling or preferably using something like cgroups.
Certainly the option could be provided and enabled by people's grid scripts, but I don't think any kind of limit should be enabled by default.
#13 Updated by Kyle Knoepfel over 3 years ago
With art:version:"Arcturus", users will be able to switch to a new art/ROOT output file when a specified file-size is exceeded. That feature is a little different from this one, in that here, the request is to end the process when a specified maximum file-size has been reached. However, we'd like to know if the feature as implemented in art:version:"Arcturus" would adequately address the use case you have discussed.
[N.B. Monitoring the file-size of a TFileService output file is not currently part of the art:version:"Arcturus" feature set, and it is not obvious to me right now how (if?) the framework should keep a running tally of the combination of TFileService output file sizes. However, enabling independent monitoring of TFileService and output-module files is feasible, even if not in time for art:version:"Arcturus".]
#15 Updated by Kyle Knoepfel over 3 years ago
- Status changed from Feedback to Resolved
- % Done changed from 0 to 100
We will close this issue for now, and if the soon-to-be-released file-handling features do not adequately address situations similar to those that were reported, then please submit a new feature request.