Good day. I've created a custom wizard installer for my website project. My goal is to minimize the work our clients have to do during installation.

I'm trying to extract a 7z file that contains millions of tiny files (each about 200 bits in size). I'm using SharpCompress for the extraction, but it takes hours to finish, which is unacceptable for the user.

I don't care about the compression ratio. What I need is to reduce the time it takes to extract these millions of tiny files.

My question is: what is the fastest way to extract millions of tiny files? Alternatively, is there any method of packing and unpacking the files that gives the highest unpacking speed?

I'm extracting the 7z file with this code:

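using SharpCompress.Archives;          // WriteToDirectory extension method
using SharpCompress.Archives.SevenZip; // SevenZipArchive
using SharpCompress.Common;            // ExtractionOptions

// Open the archive and extract every entry, keeping the folder structure.
using (SevenZipArchive zipArchive = SevenZipArchive.Open(source7z))
{
    zipArchive.WriteToDirectory(destination7z,
        new ExtractionOptions { Overwrite = true, ExtractFullPath = true });
}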

But the extraction is very slow for tiny files.
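For comparison, here is a minimal sketch of the same extraction done through SharpCompress's reader API, which walks the archive once from start to end. As far as I understand, extracting entries one at a time from a solid 7z archive can force the decoder to re-read earlier data for each entry, so a single sequential pass may be faster. Whether `WriteToDirectory` on the archive already does this internally depends on the SharpCompress version, so treat the gain as an assumption to measure, not a guarantee. The class and method names (`SevenZipExtractor`, `ExtractAll`) are made up for illustration.

```csharp
using System.IO;
using SharpCompress.Archives.SevenZip;
using SharpCompress.Common;
using SharpCompress.Readers;

public static class SevenZipExtractor
{
    // Hypothetical helper: extracts the whole archive in one sequential pass
    // instead of handling each entry independently.
    public static void ExtractAll(string source7z, string destinationDir)
    {
        Directory.CreateDirectory(destinationDir);

        using (SevenZipArchive archive = SevenZipArchive.Open(source7z))
        using (IReader reader = archive.ExtractAllEntries())
        {
            // Writes every entry the reader yields, keeping the folder structure.
            reader.WriteAllToDirectory(destinationDir,
                new ExtractionOptions { Overwrite = true, ExtractFullPath = true });
        }
    }
}
```

Usage would be `SevenZipExtractor.ExtractAll(source7z, destination7z);`. Even with this change, the cost of creating millions of files on disk remains.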

Theodor Zoulias
Noryn Basaya
  • How long does it take if you use plain C# code to write files of approximately that size and number? – Lei Yang Feb 17 '22 at 01:28
  • Millions of files will take a long time to create no matter what compression format you are using. The filesystem overhead of creating the metadata will take longer than actually writing the data to the disk. ([Similar discussion](https://arstechnica.com/civis/viewtopic.php?f=15&t=37212)) – Moshe Katz Feb 17 '22 at 01:28
  • @LeiYang I tested it. The tiny files total over 1 GB, and they take an hour and a half to extract. The average extraction speed is about 100 kb/s, which is very bad. – Noryn Basaya Feb 17 '22 at 01:36
  • I wonder why you need so many small files. It sounds like a database, so why not use a real database, which could be more efficient? And if the file contents are auto-generated, I'd rather generate them at runtime. – Lei Yang Feb 17 '22 at 01:40
  • @MosheKatz Thanks for the information. I compared the extraction time of WinRAR with that of the code above using SharpCompress. SharpCompress takes 1 hour to finish, as I mentioned above, but extracting with the WinRAR application takes only 20 minutes. Hmm. – Noryn Basaya Feb 17 '22 at 01:49
  • @LeiYang Nope, it's not a database. It's generated by the Moodle framework, so I don't have any control over it. – Noryn Basaya Feb 17 '22 at 01:51
  • @MosheKatz WinRAR is probably written in C/C++, so it's not strange that it is 3 times faster than .NET. – Lei Yang Feb 17 '22 at 02:02
  • @MosheKatz I see. Thanks for that. – Noryn Basaya Feb 17 '22 at 02:03
  • An option is to just include the *"standalone console version"* available from the 7zip website, and simply invoke the .exe – NPras Feb 17 '22 at 05:41
  • @NPras that won't help much because in this case you're IO bound and not CPU bound – phuclv Feb 17 '22 at 06:52
  • @phuclv that may not be the case, based on their comment that WinRAR did it 3x faster on the same computer. – NPras Feb 17 '22 at 21:53
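A rough sketch of NPras's suggestion, assuming the standalone console executable (`7za.exe`) ships next to the installer. It uses the `x` command (extract with full paths), `-o` (output directory) and `-y` (answer yes to all prompts, so existing files are overwritten). The class name `SevenZipCli`, its methods, and the paths in the usage line are hypothetical. Whether this actually beats the in-process extraction is not established in this thread: if the job is I/O bound, as phuclv suspects, the gain may be small.

```csharp
using System;
using System.Diagnostics;

public static class SevenZipCli
{
    // Builds the argument string for "7za x".
    // Assumes neither path ends with a backslash, which would escape the closing quote.
    public static string BuildArguments(string archivePath, string destinationDir)
    {
        return $"x \"{archivePath}\" -o\"{destinationDir}\" -y";
    }

    // Runs the console extractor and returns its exit code (0 means no error).
    public static int Extract(string sevenZipExe, string archivePath, string destinationDir)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = sevenZipExe,
            Arguments = BuildArguments(archivePath, destinationDir),
            UseShellExecute = false, // start the .exe directly, without the shell
            CreateNoWindow = true    // keep the installer UI free of a console window
        };

        using (Process process = Process.Start(startInfo))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

It could be called as `int code = SevenZipCli.Extract(@"tools\7za.exe", source7z, destination7z);` and the installer would report a failure when `code` is not 0.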

0 Answers