
I am fairly new to C# and coding in general so some of this might be going about things the wrong way. The program I wrote works and compresses the file as expected, but if the source is rather large, the program appears (to Windows) to hang. I feel like I should be using a Thread but I am not sure that will help.

I would use a progress bar but the 'new' (.net 4.5) library for zipfile from System.IO.Compression which replaced Ionic.Zip.ZipFile does not have a method to report progress? Is there a way around this? Should I be using a Thread? or DoWork?

The trouble is that neither the user nor the system is getting feedback on what the program is doing.

I am not sure I am asking the question the right way. Below is the code that is working, but again, will appear to hang the system.

    private void beginBackup_Click(object sender, EventArgs e)
    {
        try
        {
            long timeTicks = DateTime.Now.Ticks;
            string zipName = "bak" + timeTicks + ".zip";
            MessageBox.Show("This Will take a bit, there is no status bar :(");
            ZipFile.CreateFromDirectory(Properties.Settings.Default.source,
                  Properties.Settings.Default.destination + "\\" + zipName);
            MessageBox.Show("Done!");
            this.Close();
        }
        catch (IOException err)
        {
            // Concatenate the source; MessageBox.Show does not format "{0}" placeholders.
            MessageBox.Show("Something went wrong" + System.Environment.NewLine
                + "IOException source: " + err.Source);
        }
    }

The important line being:

        ZipFile.CreateFromDirectory(Properties.Settings.Default.source,
              Properties.Settings.Default.destination + "\\" + zipName);

EDIT

`ZipFile.CreateFromDirectory()` is not walking the directory, so there is nothing to increment? It would simply start and finish with no reporting, unless I am mistaken?

Using a method like this:

    while (!completed)
    {
        // your code here to do something
        for (int i = 1; i <= 100; i++)
        {
            percentCompletedSoFar = i;
            var t = new Task(() => WriteToProgressFile(i));
            t.Start();
            await t;
            if (progress != null)
            {
                progress.Report(percentCompletedSoFar);
            }
            completed = i == 100;
        }
    }

The code in the for loop would only run once, as ZipFile would still hang the program, and then the progress bar would immediately go from 0 to 100?

DSMTurboAWD
  • I have a complete example [here](http://stackoverflow.com/questions/41370300/how-can-i-use-two-progressbar-controls-to-display-each-file-download-progress-an/41370508#41370508) that you can follow. – CodingYoshi Feb 24 '17 at 04:11
  • Possible duplicate of [How can i use two progressBar controls to display each file download progress and also overall progress of all the files download?](http://stackoverflow.com/questions/41370300/how-can-i-use-two-progressbar-controls-to-display-each-file-download-progress-an) – CodingYoshi Feb 24 '17 at 04:12
  • Can the code there be used to report status of `ZipFile.CreateFromDirectory` since it is not 'walking' the directory? so there is nothing to iterate over? – DSMTurboAWD Feb 24 '17 at 04:17
  • Yes it can be used for any kind of progress. It's a pattern I have there. In your case you may want to use a circular bar because you cannot really know how long it will take. Or you can show a progress bar based on the Avg time and change progress every x seconds. Or you can do based on the number of bytes – CodingYoshi Feb 24 '17 at 04:24

3 Answers

> I would use a progress bar but the 'new' (.net 4.5) library for zipfile from System.IO.Compression which replaced Ionic.Zip.ZipFile does not have a method to report progress? Is there a way around this? Should I be using a Thread? or DoWork?

You really have two issues here:

  1. The .NET version of the ZipFile class does not include progress reporting.
  2. The CreateFromDirectory() method blocks until the entire archive has been created.

I am not that familiar with the Ionic/DotNetZip library, but browsing the docs, I don't see any asynchronous methods for creating an archive from a directory. So #2 would be an issue regardless. The easiest way to solve it is to run the work in a background thread, e.g. using Task.Run().
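A minimal sketch of the `Task.Run()` idea, written as a console program so it can run on its own. `BackgroundZip` and `ZipInBackgroundAsync` are hypothetical names introduced here for illustration, and the temporary directory layout is an assumption.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

static class BackgroundZip
{
    // Hypothetical helper: runs the blocking CreateFromDirectory call on a
    // thread-pool thread so the calling (UI) thread is not blocked.
    public static Task ZipInBackgroundAsync(string sourceDir, string zipPath)
    {
        return Task.Run(() => ZipFile.CreateFromDirectory(sourceDir, zipPath));
    }

    static void Main()
    {
        // Build a small throwaway directory so the sketch is self-contained.
        string root = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N"));
        string sourceDir = Path.Combine(root, "source");
        string zipPath = Path.Combine(root, "backup.zip");
        Directory.CreateDirectory(sourceDir);
        File.WriteAllText(Path.Combine(sourceDir, "a.txt"), "hello");

        // A console program has no UI thread to keep free, so it simply waits.
        // In a WinForms click handler marked async, this would instead be:
        //     await BackgroundZip.ZipInBackgroundAsync(source, destination);
        ZipInBackgroundAsync(sourceDir, zipPath).GetAwaiter().GetResult();

        Console.WriteLine(File.Exists(zipPath) ? "archive created" : "archive missing");
        Directory.Delete(root, true);
    }
}
```

With `await` in the click handler, the window keeps repainting while the archive is written, although this alone still gives no progress information.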

As for issue #1, I would not characterize the .NET ZipFile class as having replaced the Ionic library. Yes, it's new. But .NET already had .zip archive support in previous versions, just not a convenience class like ZipFile. And neither the earlier support for .zip archives nor ZipFile provides progress reporting "out-of-the-box". So neither really replaces the Ionic DLL per se.

So IMHO, if you were using the Ionic DLL and it worked for you, the best solution is to just keep using it.

If you really don't want to use it, your options are limited. The .NET ZipFile just doesn't do what you want. There are some hacky things you could do, to work around the lack of feature. For writing an archive, you could estimate the compressed size, then monitor the file size as it's being written and compute an estimated progress based on that (i.e. poll the file size in a separate async task, every second or so). For extracting an archive, you could monitor the files being generated, and compute progress that way.
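The file-size polling workaround for writing an archive could be sketched roughly as follows. `PolledZip`, `CreateWithPolledProgressAsync`, `CallbackProgress`, and the `estimatedZipBytes` parameter are all hypothetical names introduced here; the estimate is the caller's guess, and the length the file system reports for a file still being written may lag behind, so treat the numbers as approximate.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

// Hypothetical synchronous reporter: calls the handler directly from Report.
sealed class CallbackProgress : IProgress<double>
{
    private readonly Action<double> _handler;
    public CallbackProgress(Action<double> handler) { _handler = handler; }
    public void Report(double value) { _handler(value); }
}

static class PolledZip
{
    // estimatedZipBytes is an assumption supplied by the caller, so the
    // reported fraction is only as good as that guess.
    public static async Task CreateWithPolledProgressAsync(
        string sourceDir, string zipPath, long estimatedZipBytes,
        IProgress<double> progress, int pollMilliseconds = 500)
    {
        Task zipTask = Task.Run(() => ZipFile.CreateFromDirectory(sourceDir, zipPath));

        while (!zipTask.IsCompleted)
        {
            // Wake up when the poll interval elapses or the zip task ends.
            await Task.WhenAny(zipTask, Task.Delay(pollMilliseconds)).ConfigureAwait(false);

            var info = new FileInfo(zipPath);
            if (info.Exists && estimatedZipBytes > 0)
            {
                // Cap below 100% because the estimate may be too small.
                progress.Report(Math.Min(0.99, (double)info.Length / estimatedZipBytes));
            }
        }

        await zipTask.ConfigureAwait(false); // rethrows any exception from the zip
        progress.Report(1.0);
    }

    static void Main(string[] args)
    {
        // args[0]: source directory, args[1]: archive to create,
        // args[2]: estimated archive size in bytes (a guess by the caller).
        // Note: the handler runs on thread-pool threads, not a UI thread.
        var progress = new CallbackProgress(p => Console.WriteLine($"{p:P0} (estimated)"));
        CreateWithPolledProgressAsync(args[0], args[1], long.Parse(args[2]), progress)
            .GetAwaiter().GetResult();
    }
}
```

In a GUI program, passing a `Progress<double>` created on the UI thread instead would marshal the reports back to that thread.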

But at the end of the day, that sort of approach is far from ideal.

Another option is to monitor the progress by using the lower-level ZipArchive-based features, writing the archive yourself explicitly and tracking the bytes as they are read from the source file. To do this, you can write a Stream implementation that wraps the real input stream, and which provides progress reporting as the bytes are read.

Here's a simple example of what that Stream might look like (note comment about this being for illustration purposes…it really would be better to delegate all the virtual methods, not just the two you're required to):

Note: in the course of looking for existing questions related to this one, I found one that is essentially a duplicate, except that it's asking for a VB.NET answer instead of C#. It also asked for progress updates while extracting from an archive, in addition to creating one. So I adapted my answer here, for VB.NET, adding the extraction method, and tweaking the implementation a little. I've updated the answer below to incorporate those changes.

StreamWithProgress.cs

class StreamWithProgress : Stream
{
    // NOTE: for illustration purposes. For production code, one would want to
    // override *all* of the virtual methods, delegating to the base _stream object,
    // to ensure performance optimizations in the base _stream object aren't
    // bypassed.

    private readonly Stream _stream;
    private readonly IProgress<int> _readProgress;
    private readonly IProgress<int> _writeProgress;

    public StreamWithProgress(Stream stream, IProgress<int> readProgress, IProgress<int> writeProgress)
    {
        _stream = stream;
        _readProgress = readProgress;
        _writeProgress = writeProgress;
    }

    public override bool CanRead { get { return _stream.CanRead; } }
    public override bool CanSeek { get { return _stream.CanSeek; } }
    public override bool CanWrite { get { return _stream.CanWrite; } }
    public override long Length { get { return _stream.Length; } }
    public override long Position
    {
        get { return _stream.Position; }
        set { _stream.Position = value; }
    }

    public override void Flush() { _stream.Flush(); }
    public override long Seek(long offset, SeekOrigin origin) { return _stream.Seek(offset, origin); }
    public override void SetLength(long value) { _stream.SetLength(value); }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int bytesRead = _stream.Read(buffer, offset, count);

        _readProgress?.Report(bytesRead);
        return bytesRead;
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _stream.Write(buffer, offset, count);
        _writeProgress?.Report(count);
    }
}

With that in hand, it's relatively simple to handle the archive creation explicitly, using that Stream to monitor the progress:

ZipFileWithProgress.cs

static class ZipFileWithProgress
{
    public static void CreateFromDirectory(string sourceDirectoryName, string destinationArchiveFileName, IProgress<double> progress)
    {
        sourceDirectoryName = Path.GetFullPath(sourceDirectoryName);

        FileInfo[] sourceFiles =
            new DirectoryInfo(sourceDirectoryName).GetFiles("*", SearchOption.AllDirectories);
        double totalBytes = sourceFiles.Sum(f => f.Length);
        long currentBytes = 0;

        using (ZipArchive archive = ZipFile.Open(destinationArchiveFileName, ZipArchiveMode.Create))
        {
            foreach (FileInfo file in sourceFiles)
            {
                // NOTE: naive method to get sub-path from file name, relative to
                // input directory. Production code should be more robust than this.
                // Either use Path class or similar to parse directory separators and
                // reconstruct output file name, or change this entire method to be
                // recursive so that it can follow the sub-directories and include them
                // in the entry name as they are processed.
                string entryName = file.FullName.Substring(sourceDirectoryName.Length + 1);
                ZipArchiveEntry entry = archive.CreateEntry(entryName);

                entry.LastWriteTime = file.LastWriteTime;

                using (Stream inputStream = File.OpenRead(file.FullName))
                using (Stream outputStream = entry.Open())
                {
                    Stream progressStream = new StreamWithProgress(inputStream,
                        new BasicProgress<int>(i =>
                        {
                            currentBytes += i;
                            progress.Report(currentBytes / totalBytes);
                        }), null);

                    progressStream.CopyTo(outputStream);
                }
            }
        }
    }

    public static void ExtractToDirectory(string sourceArchiveFileName, string destinationDirectoryName, IProgress<double> progress)
    {
        using (ZipArchive archive = ZipFile.OpenRead(sourceArchiveFileName))
        {
            double totalBytes = archive.Entries.Sum(e => e.Length);
            long currentBytes = 0;

            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                string fileName = Path.Combine(destinationDirectoryName, entry.FullName);

                Directory.CreateDirectory(Path.GetDirectoryName(fileName));
                using (Stream inputStream = entry.Open())
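                // File.Create truncates an existing file; File.OpenWrite would
                // leave stale trailing bytes if the old file was longer.
                using (Stream outputStream = File.Create(fileName))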
                {
                    Stream progressStream = new StreamWithProgress(outputStream, null,
                        new BasicProgress<int>(i =>
                        {
                            currentBytes += i;
                            progress.Report(currentBytes / totalBytes);
                        }));

                    inputStream.CopyTo(progressStream);
                }

                File.SetLastWriteTime(fileName, entry.LastWriteTime.LocalDateTime);
            }
        }
    }
}

Notes:

  • This uses a class called BasicProgress<T> (see below). I tested the code in a console program, and the built-in Progress<T> class will use the thread pool to execute the ProgressChanged event handlers, which in turn can lead to out-of-order progress reports. The BasicProgress<T> simply calls the handler directly, avoiding that issue. In a GUI program using Progress<T>, the execution of the event handlers would be dispatched to the UI thread in order. IMHO, one should still use the synchronous BasicProgress<T> in a library, but the client code for a UI program would be fine using Progress<T> (indeed, that would probably be preferable, since it handles the cross-thread dispatching on your behalf there).
  • This tallies the sum of the file lengths before doing any work. Of course, this incurs a slight start-up cost. For some scenarios, it might be sufficient to just report total bytes processed, and let the client code worry about whether there's a need to do that initial tally or not.
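The ordering guarantee of a synchronous reporter can be sketched in a few lines. `SyncProgress<T>` and `ProgressOrderDemo` are hypothetical names introduced here; `SyncProgress<T>` stands in for the same idea as `BasicProgress<T>`, with `Report` made public for brevity.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical synchronous reporter: the handler runs on the calling thread,
// before Report returns, so values are always seen in the order reported.
sealed class SyncProgress<T> : IProgress<T>
{
    private readonly Action<T> _handler;
    public SyncProgress(Action<T> handler) { _handler = handler; }
    public void Report(T value) { _handler(value); }
}

static class ProgressOrderDemo
{
    // Reports 1..count and returns the values in the order the handler saw them.
    public static List<int> ReportSynchronously(int count)
    {
        var seen = new List<int>();
        IProgress<int> progress = new SyncProgress<int>(seen.Add);

        for (int i = 1; i <= count; i++)
        {
            progress.Report(i); // the handler has already run when this returns
        }

        return seen;
    }

    static void Main()
    {
        List<int> seen = ReportSynchronously(5);
        Console.WriteLine(string.Join(", ", seen)); // prints "1, 2, 3, 4, 5"
    }
}
```

Swapping in the built-in `Progress<int>` in a console program gives no such guarantee, because without a synchronization context its handlers are posted to the thread pool and the list may not be filled, or may be filled out of order, by the time the loop ends.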

BasicProgress.cs

class BasicProgress<T> : IProgress<T>
{
    private readonly Action<T> _handler;

    public BasicProgress(Action<T> handler)
    {
        _handler = handler;
    }

    void IProgress<T>.Report(T value)
    {
        _handler(value);
    }
}

And of course, a little program to test it all:

Program.cs

class Program
{
    static void Main(string[] args)
    {
        string sourceDirectory = args[0],
            archive = args[1],
            archiveDirectory = Path.GetDirectoryName(Path.GetFullPath(archive)),
            unpackDirectoryName = Guid.NewGuid().ToString();

        File.Delete(archive);
        ZipFileWithProgress.CreateFromDirectory(sourceDirectory, archive,
            new BasicProgress<double>(p => Console.WriteLine($"{p:P2} archiving complete")));

        ZipFileWithProgress.ExtractToDirectory(archive, unpackDirectoryName,
            new BasicProgress<double>(p => Console.WriteLine($"{p:P0} extracting complete")));
    }
}
Peter Duniho
  • This would be my only worry. I am fine (I think) to use the Ionic zipfile, it does take some coaxing to recognize the library though I am not sure why. But that at some point it would not work. I recall reading something about the cautions of using custom libraries. Though what you have pointed out and written above most certainly helps immensely, thank you. – DSMTurboAWD Feb 24 '17 at 14:19
  • _" it does take some coaxing to recognize the library"_ -- sorry, not sure what you mean by "coaxing". Due to the lack of a true official specification, you may find .zip archives "in the wild" that are readable only by a subset of implementations, sometimes only the implementation that wrote them. But that's an issue even with the .NET implementations. For awhile, the .NET implementation couldn't handle archived data larger than 8GB, and it still can't handle archives where entry names have characters that would be invalid on the Windows file systems. – Peter Duniho Feb 24 '17 at 19:15
  • Custom libraries can be helpful. My main reason for avoiding them is that they usually have less-broad distribution and thus real-world testing than something like .NET, and of course the other reason being, if the framework I _have_ to use already has the functionality I need, adding another dependency is inconvenient and less desirable. But there's nothing inherently wrong with using third-party libraries when necessary. – Peter Duniho Feb 24 '17 at 19:15
  • _Coaxing_ is probably not the most applicable term here, and I apologize as I am still learning a lot of the vernacular to describe problems/issues. For example, when I am using Ionic, there are sub-classes that Visual Studio will complain about saying they are not in the reference, or that I need the .dll, which is easy enough but in this case when doing so, it still would not recognize the subclass ZipFile, even though it was referenced correctly as far as I could tell, which is to say, I am not sure it was Visual Studio being odd, my lack of knowledge, the custom class, or all the above – DSMTurboAWD Feb 24 '17 at 21:03
  • Missing references are always one of three things: the DLL hasn't been added as a reference, you are trying to use an "unqualified" (i.e. with only the type name and not its namespace) type name without the necessary `using` directive at the beginning of your .cs file, or you've done all that but have the wrong DLL version and the type's just not in there. The last one almost never is the problem. So double-check the first two. :) – Peter Duniho Feb 24 '17 at 21:39
  • Note: if your zip file contains folders, you need to handle them. Extend the loop inside ZipFileWithProgress.ExtractToDirectory like this: `foreach (ZipArchiveEntry entry in archive.Entries) { if (string.IsNullOrEmpty(entry.Name)) { string dirName = Path.Combine(destinationDirectoryName, entry.FullName.TrimEnd('/')); if (!Directory.Exists(dirName)) Directory.CreateDirectory(dirName); continue; }` – Markus Doerig Dec 01 '19 at 12:44

I think the following is worth sharing: zip the files individually rather than the whole folder, while retaining each file's relative path:

    void CompressFolder(string folder, string targetFilename)
    {
        string[] allFilesToZip = Directory.GetFiles(folder, "*.*", System.IO.SearchOption.AllDirectories);

        // You can use the size as the progress total size
        int size = allFilesToZip.Length;

        // You can use the progress to notify the current progress.
        int progress = 0;

        // To have relative paths in the zip.
        string pathToRemove = folder + "\\";

        using (ZipArchive zip = ZipFile.Open(targetFilename, ZipArchiveMode.Create))
        {
            // Go over all files and zip them.
            foreach (var file in allFilesToZip)
            {
                String fileRelativePath = file.Replace(pathToRemove, "");

                // It is not mentioned in MS documentation, but the name can be
                // a relative path with the file name, this will create a zip 
                // with folders and not only with files.
                zip.CreateEntryFromFile(file, fileRelativePath);
                progress++;

                // ---------------------------
                // TBD: Notify about progress.
                // ---------------------------
            }
        }
    }

Notes:

  • You can use FileInfo fileInfo = new FileInfo(file); with fileInfo.Length to report progress by the size of the files rather than by the number of files. Sometimes this is more realistic. For that you will also need to total the size of the folder in advance.
  • This solution worked for me.
  • I did not notice any performance difference between zipping the entire directory and zipping each file in the directory, though I did not measure it.
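A sketch of the size-weighted variant from the first note, assuming progress is reported as a fraction between 0 and 1 after each file. `WeightedZip` and `CallbackProgress` are hypothetical names introduced here, and the relative-path handling is still deliberately simple.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;

// Hypothetical synchronous reporter: calls the handler directly from Report.
sealed class CallbackProgress : IProgress<double>
{
    private readonly Action<double> _handler;
    public CallbackProgress(Action<double> handler) { _handler = handler; }
    public void Report(double value) { _handler(value); }
}

static class WeightedZip
{
    public static void CompressFolder(string folder, string targetFilename, IProgress<double> progress)
    {
        // Normalize the folder so the relative-path calculation is predictable.
        string root = Path.GetFullPath(folder)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);
        FileInfo[] files = new DirectoryInfo(root).GetFiles("*", SearchOption.AllDirectories);

        // Total the folder size in advance so each file can be weighted by its length.
        long totalBytes = files.Sum(f => f.Length);
        long doneBytes = 0;

        using (ZipArchive zip = ZipFile.Open(targetFilename, ZipArchiveMode.Create))
        {
            foreach (FileInfo file in files)
            {
                // Entry names inside a zip archive use '/' as the separator.
                string entryName = file.FullName.Substring(root.Length + 1)
                    .Replace(Path.DirectorySeparatorChar, '/');
                zip.CreateEntryFromFile(file.FullName, entryName);

                doneBytes += file.Length;
                if (progress != null && totalBytes > 0)
                {
                    progress.Report((double)doneBytes / totalBytes);
                }
            }
        }
    }

    static void Main(string[] args)
    {
        // args[0]: folder to compress, args[1]: archive to create.
        CompressFolder(args[0], args[1],
            new CallbackProgress(p => Console.WriteLine($"{p:P0} of bytes archived")));
    }
}
```

Progress still moves in per-file steps here, so one very large file shows up as a single jump; finer granularity needs a stream wrapper that counts bytes as they are copied.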
Juv

Building on this question about implementing a progress bar for an HttpClient file download: it was done there with an extension method on Stream, CopyToAsync, that can be reused in this situation:

public static class StreamExtensions
{
    public static async Task CopyToAsync(this Stream source, Stream destination, int bufferSize, IProgress<long> progress = null, CancellationToken cancellationToken = default)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));
        if (!source.CanRead)
            throw new ArgumentException("Has to be readable", nameof(source));
        if (destination == null)
            throw new ArgumentNullException(nameof(destination));
        if (!destination.CanWrite)
            throw new ArgumentException("Has to be writable", nameof(destination));
        if (bufferSize <= 0) // a zero-length buffer would copy nothing
            throw new ArgumentOutOfRangeException(nameof(bufferSize));

        var buffer = new byte[bufferSize];
        long totalBytesRead = 0;
        int bytesRead;
        while ((bytesRead = await source.ReadAsync(buffer, 0, buffer.Length, cancellationToken).ConfigureAwait(false)) != 0)
        {
            await destination.WriteAsync(buffer, 0, bytesRead, cancellationToken).ConfigureAwait(false);
            totalBytesRead += bytesRead;
            progress?.Report(totalBytesRead);
        }
    }
}

public static class ZipHelpers
{
    public static async Task ExtractToDirectoryAsync(string pathZip, string pathDestination, IProgress<float> progress, CancellationToken cancellationToken = default)
    {
        using (ZipArchive archive = ZipFile.OpenRead(pathZip))
        {
            long totalLength = archive.Entries.Sum(entry => entry.Length);
            long currentProgression = 0;
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Check if entry is a folder
                string filePath = Path.Combine(pathDestination, entry.FullName);
                if (entry.FullName.EndsWith('/') || entry.FullName.EndsWith('\\'))
                {
                    Directory.CreateDirectory(filePath);
                    continue;
                }

                // Create folder anyway since a folder may not have an entry
                Directory.CreateDirectory(Path.GetDirectoryName(filePath));
                using (var file = new FileStream(filePath, FileMode.Create, FileAccess.Write, FileShare.None))
                using (var entryStream = entry.Open())
                {
                    var relativeProgress = new Progress<long>(fileProgressBytes => progress.Report((float)(fileProgressBytes + currentProgression) / totalLength));
                    await entryStream.CopyToAsync(file, 81920, relativeProgress, cancellationToken);
                }
                currentProgression += entry.Length;
            }
        }
    }
}

You can also easily modify this to report relative progress for each file by using a struct as the progress value:

public struct ZipProgression
{
    public string CurrentFile { get; set; }

    public long CurrentProgression { get; set; }
    public long CurrentTotal { get; set; }
    public long GlobalProgression { get; set; }
    public long GlobalTotal { get; set; }
}

public static async Task ExtractToDirectoryAsync(string pathZip, string pathDestination, IProgress<ZipProgression> progress, CancellationToken cancellationToken = default)
{
    ...
                var relativeProgress = new Progress<long>(fileProgressBytes => progress.Report(new ZipProgression()
                {
                    CurrentFile = entry.FullName,
                    CurrentProgression = fileProgressBytes,
                    CurrentTotal = entry.Length,
                    GlobalProgression = fileProgressBytes + currentProgression,
                    GlobalTotal = totalLength
                }));
    ...
}
Poulpynator