
I have many files that I want to store in a single archive file. My first approach was to store them in a gzipped tarball. The problem is that I have to rewrite the whole archive whenever a single file is added. I could drop the gzip compression, but adding a file would still be expensive.

What other archive format would you suggest that allows fast append operations?

Benedikt Waldvogel
  • Are you able to write a container yourself, or do you need a well-known format so that other people/systems can handle the resulting file? – k_b Jun 07 '10 at 23:34
  • I try to avoid writing my own container. One reason is that people should be able to open the file, yes. I would also expect that writing my own container means more work and would initially be a lot more buggy. – Benedikt Waldvogel Jun 08 '10 at 08:26

2 Answers


The ZIP file format was designed to allow appends without a total re-write and is ubiquitous, even on Unix.
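
To illustrate appending without a rewrite, here is a minimal Python sketch using the standard `zipfile` module in append mode. The helper name `append_to_zip` and the entry names are made up for this example, and it is only one possible way to do this.

```python
import os
import tempfile
import zipfile


def append_to_zip(archive_path, arcname, data):
    """Append one entry to a ZIP archive, creating the archive if needed.

    Hypothetical helper for illustration only.
    """
    # Mode "a" keeps the existing entries in place, adds the new entry,
    # and writes an updated central directory when the archive is closed.
    with zipfile.ZipFile(archive_path, mode="a",
                         compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(arcname, data)


# Build a small archive with two separate append operations.
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "demo.zip")
append_to_zip(archive, "first.txt", b"one")
append_to_zip(archive, "second.txt", b"two")

# Read the archive back to confirm both entries are present.
with zipfile.ZipFile(archive) as zf:
    names = zf.namelist()
    first = zf.read("first.txt")
```

Each call only touches the tail of the file, so the cost of an append should depend on the size of the new entry and the directory, not on the data already stored.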

msw
  • The question http://stackoverflow.com/questions/2223434/appending-files-to-a-zip-file-with-java is highly related. I'm not sure whether there's any (Java) implementation that allows appends without a total rewrite. – Benedikt Waldvogel Jun 08 '10 at 09:16

The ZIP and TAR formats (and the old AR format) allow a file to be appended without a full rewrite. However:

  • The Java archive classes DO NOT support this mode of operation.
  • Appending a file that is already in the archive is likely to leave multiple copies of that file in the archive.
  • The ZIP and AR formats have a directory that needs to be rewritten following a file append operation. The standard utilities take precautions when rewriting the directory, but in theory a failed append could leave you with an archive whose directory is missing or corrupted.
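
The duplicate-copy behaviour can be seen with a short Python sketch, assuming an uncompressed TAR file and the standard `tarfile` module. The helper name `append_to_tar` and the entry name `config.txt` are invented for this example.

```python
import io
import os
import tarfile
import tempfile


def append_to_tar(archive_path, arcname, data):
    """Append one member to an uncompressed TAR archive.

    Hypothetical helper for illustration only.
    """
    info = tarfile.TarInfo(name=arcname)
    info.size = len(data)
    # Mode "a" works only for uncompressed archives; the new member is
    # written after the existing ones rather than replacing anything.
    with tarfile.open(archive_path, mode="a") as tf:
        tf.addfile(info, io.BytesIO(data))


workdir = tempfile.mkdtemp()
tar_path = os.path.join(workdir, "demo.tar")

# Append the same name twice with different content.
append_to_tar(tar_path, "config.txt", b"v1")
append_to_tar(tar_path, "config.txt", b"v2")

with tarfile.open(tar_path) as tf:
    tar_names = tf.getnames()  # both copies are still listed
    # Looking a member up by name yields its last occurrence.
    latest = tf.extractfile("config.txt").read()
```

Both copies stay in the archive and take up space; readers that look members up by name typically see the most recently appended one.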
Stephen C