118

Is there any way in .NET (C#) to extract data from a zip file without decompressing the complete file?

Ideally, I want to extract a file from the start of a zip archive, assuming the compression tool wrote the files in a deterministic order.

ΩmegaMan
AwkwardCoder

7 Answers

147

With .NET Framework 4.5 or later (using ZipArchive):

using (ZipArchive zip = ZipFile.Open(zipfile, ZipArchiveMode.Read))
    foreach (ZipArchiveEntry entry in zip.Entries)
        if(entry.Name == "myfile")
            entry.ExtractToFile("myfile");

This finds the entry named "myfile" in the zip file and extracts it to disk; the other entries are not decompressed.
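If only the beginning of an entry is needed, the entry stream can be read partially instead of being extracted in full. The following is a minimal sketch under that assumption, not a definitive implementation: `ZipPeek` and `ReadStart` are hypothetical names chosen for illustration, while `ZipArchive.GetEntry` and `ZipArchiveEntry.Open` are the framework calls doing the work. `GetEntry` looks the entry up by its full path inside the archive.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class ZipPeek
{
    // Reads up to maxBytes from the start of one entry, looked up by its
    // full path inside the archive (for example "dir/myfile").
    public static byte[] ReadStart(Stream zipStream, string entryPath, int maxBytes)
    {
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.GetEntry(entryPath);
            if (entry == null)
                throw new FileNotFoundException("Entry not found in archive.", entryPath);

            using (Stream data = entry.Open())
            {
                var buffer = new byte[maxBytes];
                int total = 0, read;
                // Read may return fewer bytes than requested, so loop until
                // the buffer is full or the entry ends.
                while (total < maxBytes &&
                       (read = data.Read(buffer, total, maxBytes - total)) > 0)
                {
                    total += read;
                }
                Array.Resize(ref buffer, total);
                return buffer;
            }
        }
    }
}
```

For a file on disk, something like `ZipPeek.ReadStart(File.OpenRead(zipfile), "dir/myfile", 1024)` would return at most the first 1024 decompressed bytes of that entry.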

Sinatr
  • 44
    One can also use entry.Open() to just get the stream (if the contents should be read but not written to a file). – anre Apr 09 '14 at 17:26
  • 21
    references: `System.IO.Compression.dll` and `System.IO.Compression.FileSystem.dll` – yzorg Mar 26 '16 at 19:16
  • Is there a way to use this to get an exact file path within the zip? – Cullub Aug 23 '23 at 18:19
  • 1
    @Cullub, see [entry.FullName](https://learn.microsoft.com/en-us/dotnet/api/system.io.compression.ziparchiveentry.fullname) . – Sinatr Aug 24 '23 at 08:02
87

DotNetZip is your friend here.

As easy as:

using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
  ZipEntry e = zip["MyReport.doc"];
  e.Extract(OutputStream);
}

(you can also extract to a file or other destinations).

Reading the zip file's table of contents is as easy as:

bool header = true;  // print the column headings only once
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
  foreach (ZipEntry e in zip)
  {
    if (header)
    {
      System.Console.WriteLine("Zipfile: {0}", zip.Name);
      if ((zip.Comment != null) && (zip.Comment != "")) 
        System.Console.WriteLine("Comment: {0}", zip.Comment);
      System.Console.WriteLine("\n{1,-22} {2,8}  {3,5}   {4,8}  {5,3} {0}",
                               "Filename", "Modified", "Size", "Ratio", "Packed", "pw?");
      System.Console.WriteLine(new System.String('-', 72));
      header = false;
    }
    System.Console.WriteLine("{1,-22} {2,8} {3,5:F0}%   {4,8}  {5,3} {0}",
                             e.FileName,
                             e.LastModified.ToString("yyyy-MM-dd HH:mm:ss"),
                             e.UncompressedSize,
                             e.CompressionRatio,
                             e.CompressedSize,
                             (e.UsesEncryption) ? "Y" : "N");

  }
}

Edited to note: DotNetZip used to live at CodePlex, which has since been shut down. The old archive is still available at CodePlex. It looks like the code has migrated to GitHub:


Nicholas Carey
  • 11
    +1. Behind the scenes, what DotNetZip does in the constructor is seek to the "directory" inside the zipfile, and then read it and populate the list of entries. At that point, if your app calls Extract() on one entry, DotNetZip seeks to the proper place in the zip file, and decompresses the data for just that entry. – Cheeso May 11 '11 at 19:57
19

Something like this will list and extract the files one by one, if you want to use SharpZipLib:

// requires: using ICSharpCode.SharpZipLib.Zip;
// Open the archive once and dispose it when done.
using (var zipfile = new ZipFile(@"C:\Users\Javi\Desktop\myzip.zip"))
{
    foreach (ZipEntry item in zipfile)
    {
        if (!item.IsFile) continue;  // skip directory entries
        Console.WriteLine(item.Name);
        using (StreamReader s = new StreamReader(zipfile.GetInputStream(item)))
        {
            // stream with the decompressed contents of this entry
            Console.WriteLine(s.ReadToEnd());
        }
    }
}

Based on this example: content inside zip file

Community
Javi
16

Here is how a UTF-8 text file can be read from a zip archive into a string variable (.NET Framework 4.5 and up):

// requires: using System; using System.Linq; using System.Text;
string zipFileFullPath = "{{TypeYourZipFileFullPathHere}}";
string targetFileName = "{{TypeYourTargetFileNameHere}}";
string text;
using (var archive = System.IO.Compression.ZipFile.OpenRead(zipFileFullPath))
{
    // First entry whose file name matches; throws if there is no such entry.
    var entry = archive.Entries.First(x => x.Name.Equals(targetFileName,
                                          StringComparison.InvariantCulture));
    using (var reader = new System.IO.StreamReader(entry.Open(), Encoding.UTF8))
        text = reader.ReadToEnd();
}
ShamilS
4

The following code reads a specific file into a byte array (a fragment from the body of an async method that returns a byte array):

using ZipArchive zipArchive = ZipFile.OpenRead(zipFilePath);
foreach (ZipArchiveEntry zipArchiveEntry in zipArchive.Entries)
{
    if (zipArchiveEntry.Name.Equals(fileName, StringComparison.OrdinalIgnoreCase))
    {
        // Dispose the entry stream as well as the buffer.
        using Stream stream = zipArchiveEntry.Open();
        using MemoryStream memoryStream = new MemoryStream();
        await stream.CopyToAsync(memoryStream);
        return memoryStream.ToArray();
    }
}
return null;  // no entry with that name
Khaled Gomaa
0

Zip files have a table of contents. Every zip utility should be able to query just the TOC. Alternatively, you can use a command-line program such as 7-Zip (`7z l`) to print the table of contents and redirect it to a text file.
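The same listing can be done from C# without a command-line tool. The sketch below assumes the built-in `System.IO.Compression.ZipArchive` class and a seekable stream; `ZipToc` and `List` are hypothetical names used only for illustration. It enumerates the entries and reports their sizes without opening any entry's data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, for illustration only.
public static class ZipToc
{
    // Returns one tab-separated line per entry:
    // full path, uncompressed size, compressed size.
    // No entry stream is opened, so no file data is decompressed.
    public static List<string> List(Stream zipStream)
    {
        var lines = new List<string>();
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                lines.Add($"{entry.FullName}\t{entry.Length}\t{entry.CompressedLength}");
            }
        }
        return lines;
    }
}
```

Calling `ZipToc.List(File.OpenRead(path))` and writing the result with `File.WriteAllLines` would give a text file comparable to the redirected command-line output.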

umilmi81
0

In that case you will need to parse the zip local file header entries. Each file stored in a zip archive is preceded by a Local File Header, which (normally) contains enough information for decompression. In general, you can parse these headers directly from the stream, select the file you need, copy its header plus compressed data to another file, and call unzip on that part (if you don't want to deal with the whole zip decompression code or library).
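The header parsing described above can be sketched in C# roughly as follows. This is a simplified illustration, not a complete zip reader: `LocalHeaderReader` and `ReadEntry` are hypothetical names, and the code assumes an unencrypted, non-ZIP64 entry that is either stored or deflated, with its sizes written in the header itself rather than in a trailing data descriptor. It also decodes the file name as UTF-8, which is only guaranteed to be right for ASCII names.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, for illustration only.
public static class LocalHeaderReader
{
    // Reads the entry whose Local File Header starts at the stream's current
    // position and returns its name and decompressed bytes.
    public static (string Name, byte[] Data) ReadEntry(Stream zip)
    {
        var reader = new BinaryReader(zip, Encoding.UTF8, leaveOpen: true);

        if (reader.ReadUInt32() != 0x04034b50)   // "PK\x03\x04", little-endian
            throw new InvalidDataException("Not positioned at a local file header.");

        reader.ReadUInt16();                      // version needed to extract
        ushort flags = reader.ReadUInt16();       // general purpose bit flag
        ushort method = reader.ReadUInt16();      // 0 = stored, 8 = deflate
        reader.ReadUInt32();                      // last modified time and date
        reader.ReadUInt32();                      // CRC-32 (not verified here)
        uint compressedSize = reader.ReadUInt32();
        reader.ReadUInt32();                      // uncompressed size
        ushort nameLength = reader.ReadUInt16();
        ushort extraLength = reader.ReadUInt16();

        // Bit 3 set means the sizes above are zero and follow the data instead.
        if ((flags & 0x0008) != 0)
            throw new NotSupportedException("Sizes are stored in a data descriptor.");

        string name = Encoding.UTF8.GetString(reader.ReadBytes(nameLength));
        reader.ReadBytes(extraLength);            // skip the extra field
        byte[] raw = reader.ReadBytes((int)compressedSize);

        if (method == 0) return (name, raw);      // stored: data is uncompressed
        if (method != 8)
            throw new NotSupportedException("Unsupported compression method " + method);

        // Zip uses raw deflate data, which DeflateStream can inflate directly.
        using (var inflater = new DeflateStream(new MemoryStream(raw), CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            inflater.CopyTo(output);
            return (name, output.ToArray());
        }
    }
}
```

Because the stream is left positioned just after the entry's data, calling `ReadEntry` again would read the next entry, as long as each one satisfies the assumptions listed above.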

Nickolay Olshevsky