Bug #20659

Extracting bzipped tarballs is very slow

Added by Dave Dykstra over 1 year ago. Updated 12 months ago.

Status: Closed
Priority: Normal
Category: -
Start date: 08/21/2018
% Done: 0%

Description

Jobsub's tar file feature now uses bzip2 compression. I noticed that extracting the tar files was taking a lot of CPU time in the test cvmfs publishing, and heard that decompressing bzip2 is about 5 times slower than decompressing gzip. Dennis helped me test this by creating an approximately 1 GB gzipped tar file with exactly the same contents as a previously used bzip2 tar file. The gzipped file extracted about 8.5 times faster: 16 seconds instead of 137 seconds. So I suggest switching to gzip.
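A comparison like the one above can be roughly reproduced with Python's standard library. This is only a sketch: the helper names `time_decompress` and `compare` are made up for illustration, it times in-memory decompression rather than a real `tar` extraction, and the ratio it reports depends on the payload and the machine, so it will not necessarily match the 137 s vs. 16 s figures.

```python
import bz2
import gzip
import time


def time_decompress(decompress, blob, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        decompress(blob)
        best = min(best, time.perf_counter() - start)
    return best


def compare(payload):
    """Compress `payload` with gzip and bzip2, then time decompression of each."""
    gz_blob = gzip.compress(payload)
    bz_blob = bz2.compress(payload)
    # Both codecs must round-trip to the same bytes before timing means anything.
    assert gzip.decompress(gz_blob) == payload
    assert bz2.decompress(bz_blob) == payload
    return {
        "gzip_seconds": time_decompress(gzip.decompress, gz_blob),
        "bzip2_seconds": time_decompress(bz2.decompress, bz_blob),
    }
```

Calling `compare(open("some_payload", "rb").read())` on representative tarball contents (the file name here is a placeholder) gives the two timings side by side.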

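Dennis originally chose bzip2 because with gzip the hash of the file changes each time it is created (by default gzip stores the original file name and timestamp in its header), but the gzip -n option avoids that by omitting them.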
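The effect of the header timestamp on the hash can be illustrated in Python, whose `gzip.GzipFile` accepts an explicit `mtime`. This is a sketch of the idea, not Jobsub's actual code: `gzip_bytes` and `sha256_hex` are hypothetical helpers, and passing `filename=""` with `mtime=0` is assumed here to be a reasonable stand-in for what `gzip -n` does on the command line.

```python
import gzip
import hashlib
import io


def gzip_bytes(data, mtime=0):
    """Gzip `data` in memory with no stored file name and a fixed header timestamp."""
    buf = io.BytesIO()
    # filename="" keeps the name out of the header; mtime fixes the timestamp field.
    with gzip.GzipFile(filename="", mode="wb", fileobj=buf, mtime=mtime) as gz:
        gz.write(data)
    return buf.getvalue()


def sha256_hex(blob):
    """Return the SHA-256 hex digest of `blob`."""
    return hashlib.sha256(blob).hexdigest()
```

With the timestamp pinned, compressing the same input twice yields the same digest, while two different timestamps yield different digests even though the decompressed contents are identical.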

Speeding up the cvmfs publishing itself is not a high priority, since that is just a test at this point, but this change should also save significant time decompressing these tar files on worker nodes.

History

#1 Updated by Dennis Box over 1 year ago

  • Assignee set to Dennis Box
  • Target version set to v1.2.9

#2 Updated by Dennis Box about 1 year ago

  • Status changed from New to Feedback
  • Assignee changed from Dennis Box to Shreyas Bhat

#3 Updated by Shreyas Bhat about 1 year ago

  • Status changed from Feedback to Accepted

Looks good.

#4 Updated by Shreyas Bhat about 1 year ago

  • Assignee changed from Shreyas Bhat to Dennis Box

#5 Updated by Dennis Box about 1 year ago

  • Status changed from Accepted to Resolved

Merged to branch branch_1.2.9_rc0.

#6 Updated by Dennis Box 12 months ago

  • Status changed from Resolved to Closed

