Feature #22347

When we create tarfiles, compress them in place rather than making a tempfile and compressing it

Added by Shreyas Bhat about 2 years ago. Updated about 1 year ago.

JobSub Client
Liang Li from gm2 noted that in the gpvm $TMPDIRs (usually /tmp), they're limited to 2 GB of space. Thus, if they try to create larger tarballs using the jobsub_client tardir:// feature, they can fill up tmp before the tarball is actually created.

This is because in client/, we create a temp file using tempfile.mktemp(), put the contents of the intended tarball in there, compress the tarball using gzip -n, and then move that file into place. Can we skip making the tempfile, and instead write directly to the final tarfile compressed using something like:

tar = tarfile.open(..., 'w:gz')
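A minimal sketch of that idea, assuming the client only needs to archive one directory. The function name `create_compressed_tarball` and its parameters are hypothetical and are not taken from the jobsub_client code:

```python
import os
import tarfile


def create_compressed_tarball(source_dir, tar_path):
    """Write source_dir straight into a gzip-compressed tarball at tar_path.

    Hypothetical helper for illustration only.
    """
    # Mode 'w:gz' compresses while writing, so no uncompressed
    # intermediate tarball is ever created in $TMPDIR.
    with tarfile.open(tar_path, 'w:gz') as tar:
        # arcname stores members under the directory's base name
        # instead of its full path on disk.
        arcname = os.path.basename(os.path.normpath(source_dir))
        tar.add(source_dir, arcname=arcname)
    return tar_path
```

With this approach the only disk space used is for the final compressed file, and that file can live wherever `tar_path` points.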

Here is the original message from Liang:

        I do have one comment, tardir:// option seems to first tar the
designated directory and then compress it, all of which is done at $TMPDIR
(normally "/tmp"), this is actually a problem for gm2 VM --- for some
reason, /tmp space is limited to merely 2GB for all VMs (Adam and I are
starting another discussion about that). This has caused problems when /tmp
is filled up. Apparently /tmp can be easily filled up when tardir:// option
is used (as I explained above, a tar ball is *first* created and then
*compressed*). Of course, one can simply relocate $TMPDIR to circumvent
that. But I just thought that it might be more convenient (and probably more
efficient) for tardir:// option to act like a "tar cfz" command (which
creates tar ball and compresses it at the same time).


#1 Updated by Shreyas Bhat about 2 years ago

I wonder if we do this because we need to use gzip -n to ensure we ignore timestamps...
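If the timestamp is the concern, one possible approach is to wrap the output in `gzip.GzipFile` with `mtime=0` and an empty `filename`, which keeps the original name and timestamp out of the gzip header much as `gzip -n` does. This is only a sketch. The function name is hypothetical, and it does not touch the per-file modification times stored inside the tar archive itself:

```python
import gzip
import os
import tarfile


def create_tarball_without_gzip_timestamp(source_dir, tar_path):
    """Stream a tarball through gzip with no name or timestamp in the header.

    Hypothetical helper for illustration only.
    """
    with open(tar_path, 'wb') as raw:
        # filename='' leaves the FNAME field out of the gzip header and
        # mtime=0 writes a zero timestamp, similar to `gzip -n`.
        with gzip.GzipFile(filename='', fileobj=raw, mode='wb', mtime=0) as gz:
            # Plain 'w' mode: the tar stream is compressed by the
            # GzipFile wrapper as it is written, with no tempfile.
            with tarfile.open(fileobj=gz, mode='w') as tar:
                arcname = os.path.basename(os.path.normpath(source_dir))
                tar.add(source_dir, arcname=arcname)
    return tar_path
```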

#2 Updated by Shreyas Bhat about 2 years ago

Dennis and I discussed how this could be done. We decided to add a new flag, "--tar_output_dir", for use in conjunction with the tardir:// URI (passed to either the -f or the --tar_file_name flag). It would let users specify the directory where the tarball is created and then compressed. The value would be passed to the tempfile call (tempfile.mktemp(), as described in the issue text) in the code where we actually create the tarball (create_tar or something like that?).
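A rough sketch of how such a flag might be wired up, assuming an argparse-style parser. The helper names `build_parser` and `make_tar_scratch_path` are hypothetical, and the sketch uses `tempfile.mkstemp()` in place of `tempfile.mktemp()` because it creates the file atomically. The real client code may be organized quite differently:

```python
import argparse
import os
import tempfile


def build_parser():
    """Hypothetical parser exposing only the proposed flag."""
    parser = argparse.ArgumentParser()
    parser.add_argument(
        '--tar_output_dir',
        default=None,
        help='directory in which tardir:// tarballs are created and compressed',
    )
    return parser


def make_tar_scratch_path(tar_output_dir=None):
    """Return the path of a new scratch file for the tarball.

    With tar_output_dir=None, tempfile falls back to its default
    directory (normally $TMPDIR), which matches the current behaviour.
    """
    fd, path = tempfile.mkstemp(suffix='.tar', dir=tar_output_dir)
    os.close(fd)  # only the path is needed; the caller reopens the file
    return path
```

A user short on /tmp space could then pass `--tar_output_dir` pointing at a larger scratch area, and the tarball would be built there.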

Poll uboone and gm2 to see if this name/behavior works for them.

#3 Updated by Shreyas Bhat about 2 years ago

  • Assignee changed from Parag Mhashilkar to Shreyas Bhat

#4 Updated by Dennis Box almost 2 years ago

  • Target version set to v1.3.1

#5 Updated by Shreyas Bhat over 1 year ago

Started work on this. I think I have a working model. Waiting for the downtime scheduled for today to be complete before testing on my dev machine.

#6 Updated by Dennis Box about 1 year ago

  • Target version changed from v1.3.1 to v1.3.2
