Feature #1470

Feature #8593: Improve flexibility of output file handling

Limit output file sizes?

Added by Rob Kutschke over 8 years ago. Updated over 3 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
I/O
Target version:
Start date:
07/18/2011
Due date:
09/30/2013
% Done:

100%

Estimated time:
40.00 h
Scope:
Internal
Experiment:
-
SSI Package:
art
Duration: 806

Description

Earlier this week one of our people accidentally added an extra 0 to the number of events to process in a grid job. As a result he filled the disk on a few worker nodes, crashing his own jobs and some MINERVA jobs (and possibly other people's). At low priority, could the output modules and the TFileService together maintain an approximate running total of the size of all framework-managed output files and shut down gracefully when a given limit is exceeded? My suggestion is that the default limit be set to whatever is right for Fermigrid; I would allow anyone to change the limit up or down, but the default should be to play nice with the grid.
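The bookkeeping requested here could look roughly like the following. This is only a sketch: `OutputSizeBudget` and its members are hypothetical names, not part of art or TFileService, and it assumes each output writer can report its current file size after a write.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical helper, not part of art: keeps an approximate running total
// of the bytes written to all framework-managed output files.
class OutputSizeBudget {
public:
  explicit OutputSizeBudget(std::uint64_t limitBytes) : limit_{limitBytes} {}

  // Record the latest known size of one file (output module or TFileService).
  void update(std::string const& fileName, std::uint64_t currentBytes) {
    std::uint64_t& stored = sizes_[fileName]; // 0 for a file not seen before
    total_ -= stored;                         // drop the stale contribution
    stored = currentBytes;
    total_ += stored;                         // add the fresh one
  }

  std::uint64_t total() const { return total_; }

  // True once the combined size exceeds the limit. The caller would then
  // request a graceful shutdown: finish the current event and close files.
  bool exceeded() const { return total_ > limit_; }

private:
  std::uint64_t limit_;
  std::uint64_t total_{0};
  std::map<std::string, std::uint64_t> sizes_;
};
```

Each writer would call `update` after flushing, and the event loop would check `exceeded()` between events, so the job stops at an event boundary rather than partway through a write.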

History

#1 Updated by Walter E Brown about 8 years ago

  • Category set to I/O
  • Status changed from New to Accepted
  • Priority changed from Low to Normal

#2 Updated by Christopher Green over 6 years ago

  • Due date set to 09/30/2013
  • Target version set to 1.09.00
  • Estimated time set to 40.00 h
  • Scope set to Internal
  • Experiment - added
  • SSI Package art added

#3 Updated by Christopher Green almost 6 years ago

  • Target version changed from 1.09.00 to 521

#4 Updated by Marc Paterno over 4 years ago

  • Parent task set to #8593

#5 Updated by Christopher Green over 4 years ago

  • Target version changed from 521 to 1.18.00

#6 Updated by Christopher Green over 4 years ago

  • Blocked by deleted (Milestone #896: Design and specify maxEvents for output streams)

#7 Updated by Christopher Backhouse over 4 years ago

I would certainly be very upset if a long-running interactive job suddenly announced that its output had become too big and quit.

I would argue that such things should be handled by the batch system. Why can't it enforce disk quotas in some way? Either by polling or preferably using something like cgroups.

Certainly the option could be provided and enabled by people's grid scripts, but I don't think any kind of limit should be enabled by default.

#8 Updated by Christopher Green about 4 years ago

  • Target version changed from 1.18.00 to 834

#9 Updated by Marc Paterno about 4 years ago

  • Target version changed from 834 to 3.12.06

#10 Updated by Christopher Green about 4 years ago

  • Target version changed from 3.12.06 to 521

#11 Updated by Kyle Knoepfel almost 4 years ago

  • Assignee set to Kyle Knoepfel
  • Target version changed from 521 to 2.01.00

#12 Updated by Kyle Knoepfel over 3 years ago

  • Status changed from Accepted to Assigned

#13 Updated by Kyle Knoepfel over 3 years ago

With art version "Arcturus", users will be able to switch to a new art/ROOT output file when a specified file size is exceeded. That feature is a little different from this one, in that here, the request is to end the process when a specified maximum file size has been reached. However, we'd like to know if the feature as implemented in art version "Arcturus" would adequately address the use case you have discussed.

[N.B. Monitoring the file size of a TFileService output file is not currently part of the art version "Arcturus" feature set, and it is not obvious to me right now how (if?) the framework should keep a running tally of the combination of art/ROOT and TFileService output file sizes. However, enabling independent monitoring of TFileService and output-module files is feasible, even if not in time for art version "Arcturus".]
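To make the distinction concrete, here is a sketch of the two behaviours as a per-file policy. All names are hypothetical and do not reflect the actual art implementation or its configuration. Checking each file independently, as below, corresponds to the independent monitoring mentioned above, not to a combined tally.

```cpp
#include <cstdint>

// Hypothetical names throughout; none of these come from art.
enum class SizePolicy { SwitchFile, EndJob };           // what the user asked for
enum class SizeAction { Continue, SwitchFile, EndJob }; // what to do right now

// Decide what to do after a write, given the current size of one output file.
SizeAction checkFileSize(std::uint64_t fileBytes,
                         std::uint64_t maxBytes,
                         SizePolicy policy) {
  if (fileBytes <= maxBytes) {
    return SizeAction::Continue; // still under the limit
  }
  // Over the limit: either roll over to a new file or end the job.
  return policy == SizePolicy::SwitchFile ? SizeAction::SwitchFile
                                          : SizeAction::EndJob;
}
```

With `SizePolicy::SwitchFile` the job keeps running and total disk use is still unbounded, which is why file switching alone only partly covers the original request.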

#14 Updated by Kyle Knoepfel over 3 years ago

  • Status changed from Assigned to Feedback

#15 Updated by Kyle Knoepfel over 3 years ago

  • Status changed from Feedback to Resolved
  • % Done changed from 0 to 100

We will close this issue for now. If the soon-to-be-released file-handling features do not adequately address situations similar to those reported here, please submit a new feature request.

#16 Updated by Kyle Knoepfel over 3 years ago

  • Status changed from Resolved to Closed

