I was wondering if it was possible to extract multiple files simultaneously from a TZipFile. I'm using Delphi 11.

I've had a bit of a play around with no luck. I was thinking of something along the lines of the following, which doesn't work.

var
  z : TZipFile;
begin
  z := TZipFile.Create;
  z.Open('e:\temp\temp.zip', TZipMode.zmRead);
  TParallel.For(1, 0, z.FileCount-1,
    procedure(i : integer)
    begin
      z.Extract(i, 'e:\temp\Unzip');
    end
  );
  z.Free;
end;

Update: I made this short video on multi-threaded extraction https://youtu.be/wa7i1bmYgq4. Here is some demo code for testing purposes.

const
  ZipTo = 'E:\Temp\ZipTest';
  ZipFile = 'e:\temp\temp.zip';

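// Requires System.Zip, System.Threading and System.Diagnostics in the uses clause.
procedure TForm32.Button3Click(Sender: TObject);
begin
  // Open the archive once, only to read the number of entries.
  var FileCount: Integer;
  var z := TZipFile.Create;
  try
    z.Open(ZipFile, TZipMode.zmRead);
    FileCount := z.FileCount;
  finally
    z.Free;
  end;

  var sw := TStopwatch.StartNew;
  // The upper bound of TParallel.For is inclusive, so the last index is FileCount - 1.
  TParallel.For(1, 0, FileCount - 1,
    procedure (i : integer)
    begin
      // Each iteration opens its own TZipFile; no instance is shared between threads.
      var z := TZipFile.Create;
      try
        z.Open(ZipFile, TZipMode.zmRead);
        z.Extract(i, ZipTo);
      finally
        z.Free;
      end;
    end
  );
  ShowMessage(sw.ElapsedMilliseconds.ToString);
end;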

procedure TForm32.Button4Click(Sender: TObject);
begin
  var sw := TStopwatch.StartNew;
  TZipFile.ExtractZipFile(ZipFile, ZipTo);
  ShowMessage(sw.ElapsedMilliseconds.ToString);
end;

Single-threaded: 3,783ms

Multithreaded: 1,591ms (approx 2.4x improvement)

You can clear the disk cache using CacheSet from Sysinternals; it seems to make a difference of about 100 ms on my machine. The zip file that I used was about 1 GB in size and contained six video files, so I was using six threads (the loop tries to extract every file at once). About four threads looks optimal for my machine, but that will depend greatly on disk speed and on the type of content in the zip file.
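If you want to try a fixed number of threads rather than one per file, the loop can be given its own thread pool. The following is only a sketch under assumptions: `ExtractCapped` is a hypothetical helper name, and I am assuming that `TThreadPool.SetMinWorkerThreads` / `SetMaxWorkerThreads` accept the requested limits (they return `False` when they do not, which is why the result is checked, and the minimum is lowered first so the maximum is not rejected for being below it).

```pascal
program CappedParallelUnzip;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Zip, System.Threading;

// Hypothetical helper (not from the original post): extracts every entry of
// ZipPath into DestDir, asking the pool to use at most MaxThreads workers.
procedure ExtractCapped(ZipPath, DestDir: string; MaxThreads: Integer);
var
  Pool: TThreadPool;
  Probe: TZipFile;
  Count: Integer;
begin
  // Read the entry count once, up front.
  Probe := TZipFile.Create;
  try
    Probe.Open(ZipPath, TZipMode.zmRead);
    Count := Probe.FileCount;
  finally
    Probe.Free;
  end;
  if Count = 0 then
    Exit;

  Pool := TThreadPool.Create;
  try
    // Assumption: the pool may reject limits it considers invalid.
    if not (Pool.SetMinWorkerThreads(1) and Pool.SetMaxWorkerThreads(MaxThreads)) then
      Writeln('Thread limits were rejected; using the pool defaults.');

    TParallel.For(1, 0, Count - 1,
      procedure(i: Integer)
      var
        Zip: TZipFile;
      begin
        // One TZipFile per iteration, exactly as in the demo code above.
        Zip := TZipFile.Create;
        try
          Zip.Open(ZipPath, TZipMode.zmRead);
          Zip.Extract(i, DestDir);
        finally
          Zip.Free;
        end;
      end,
      Pool);
  finally
    Pool.Free;
  end;
end;

begin
  // Usage: CappedParallelUnzip <zip file> <target folder>
  if ParamCount = 2 then
    ExtractCapped(ParamStr(1), ParamStr(2), 4);
end.
```

Timing this with 2, 4 and 6 as the last argument should show where the curve flattens on a given machine; I have not measured this variant myself.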

Alister
  • Look at [this QA](https://stackoverflow.com/q/1632470/2292722) for inspiration. – Tom Brunberg Aug 24 '22 at 06:16
  • I don't see why this would not be possible, since each file in a ZIP archive is compressed separately. But I'm afraid you probably won't be able to do it with the Zlib libraries that ship with Delphi. – SilverWarior Aug 24 '22 at 06:33
  • What would be the benefit of this? Your program is disk bound. – David Heffernan Aug 24 '22 at 07:54
  • Mechanically this only makes sense when the ZIP file is either fully cached or the storage is not a magnetic disk (which cannot access different portions in parallel). Technically the ZIP file can have various folders and the destination path must exist before extracting one file - that could lead to race conditions when trying to create folders. – AmigoJack Aug 24 '22 at 08:02
  • This would probably work if you use multiple instances of TZip. But I'm not sure whether that will give you any performance gains. You might want to have a look at the 7zip support using the JEDI JCL instead, since this has built-in multithreading support. – dummzeuch Aug 24 '22 at 08:14
  • I'm pretty sure you need a different instance of the zip file, as its internals will be used to do the extract. There should be no problem with multiple TZipFiles reading the same source file, but I haven't checked the open modes in the code. – Rob Lambden Aug 24 '22 at 08:35
  • https://www.youtube.com/watch?v=wa7i1bmYgq4 ;-) – Delphi Coder Aug 29 '22 at 06:21
  • @DavidHeffernan, I got about a 2.5x speed up by using multiple threads, so mostly I/O bound. Probably with a SATA drive it would be fully limited by the disk. – Alister Aug 30 '22 at 07:11
  • @Alister Was the ZIP file flushed from the disk cache? – David Heffernan Aug 30 '22 at 13:23
  • @DavidHeffernan The cache is probably affecting things. The zip file and content were just over 1GB (each), and the drive can handle 7GB/s, so two seconds to extract is not unreasonable with multiple threads, vs five seconds for a single thread. – Alister Sep 03 '22 at 19:13

1 Answer

I was wondering if it was possible to extract multiple files simultaneously from a TZipFile.

No. TZipFile does not support parallel extraction of the compressed files.

The most notable issue preventing parallel extraction comes from the FStream and FFileStream fields. Delphi streams don't support parallel access, and even if they did, you wouldn't get any speedup in extraction, because any operation in progress on a stream would have to block other operations until it completes.

Streams are stateful instances, and every operation on a stream changes its internal state (the current stream position). Even read operations change that state, which is what makes streams thread-unsafe, and that unsafety carries over to the TZipFile class as well.

Of course, there may be other sources of thread unsafety in the TZipFile class, but once you have one thing that is not safe and cannot be easily changed (fixed), there is no point in searching for other potential issues.
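To make the blocking argument concrete, here is a minimal sketch of what sharing a single instance would require. It is an illustration only: `ExtractShared` is a hypothetical name, and the sketch assumes that serializing every `Extract` call is enough to protect the shared stream position. Every call has to hold a lock for its whole duration, so the iterations run one after another and the loop is parallel in name only.

```pascal
program SerializedUnzip;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Zip, System.Threading;

// Hypothetical helper: one shared TZipFile, every Extract guarded by a lock.
procedure ExtractShared(ZipPath, DestDir: string);
var
  Zip: TZipFile;
begin
  Zip := TZipFile.Create;
  try
    Zip.Open(ZipPath, TZipMode.zmRead);
    if Zip.FileCount > 0 then
      TParallel.For(1, 0, Zip.FileCount - 1,
        procedure(i: Integer)
        begin
          // Only one thread at a time may touch the shared stream, because
          // each Extract moves its position. The other workers wait here.
          TMonitor.Enter(Zip);
          try
            Zip.Extract(i, DestDir);
          finally
            TMonitor.Exit(Zip);
          end;
        end);
  finally
    Zip.Free;
  end;
end;

begin
  // Usage: SerializedUnzip <zip file> <target folder>
  if ParamCount = 2 then
    ExtractShared(ParamStr(1), ParamStr(2));
end.
```

With the lock in place the result should be correct, but the elapsed time should be no better than a plain sequential loop, which is the point made above.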

Dalija Prasnikar