10

I have some code that downloads gzipped files and decompresses them. The problem is that it doesn't decompress the whole file: it reads the first 4096 bytes, then about 500 more, and then stops.

Byte[] buffer = new Byte[4096];
int count = 0;
FileStream fileInput = new FileStream("input.gzip", FileMode.Open, FileAccess.Read, FileShare.Read);
FileStream fileOutput = new FileStream("output.dat", FileMode.Create, FileAccess.Write, FileShare.None);
GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress, true);

// Read from gzip stream
while ((count = gzipStream.Read(buffer, 0, buffer.Length)) > 0)
{
    // Write to output file
    fileOutput.Write(buffer, 0, count);
}

// Close the streams
...

I've checked the downloaded file; it's 13MB when compressed, and contains one XML file. I've manually decompressed the XML file, and the content is all there. But when I do it with this code, it only outputs the very beginning of the XML file.

Anyone have any ideas why this might be happening?

Edgar

4 Answers

4

EDIT

Try not leaving the underlying stream open (pass false for the leaveOpen argument):

GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress, false);

or

GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress);
David Neale
1

The same thing happened to me. In my case it read only up to 6 lines and then reached the end of the file. I realized that although the extension was .gz, the file had been compressed with another algorithm that GZipStream doesn't support. So I used the SevenZipSharp library and it worked. This is my code:

using (var input = File.OpenRead(lstFiles[0]))
{
    using (var ds = new SevenZipExtractor(input))
    {
        //ds.ExtractionFinished += DsOnExtractionFinished;

        var mem = new MemoryStream();
        ds.ExtractFile(0, mem);

        using (var sr = new StreamReader(mem))
        {
            var iCount = 0;
            String line;
            mem.Position = 0;
            while ((line = sr.ReadLine()) != null && iCount < 100)
            {
                iCount++;
                LstOutput.Items.Add(line);
            }

        }
    }
} 
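
One quick way to test the "wrong algorithm behind a .gz extension" theory is to look at the first two bytes of the file: a gzip member starts with the magic bytes 0x1F 0x8B (RFC 1952). The following is only a minimal sketch; the class name GzipProbe and the method LooksLikeGzip are hypothetical names, not part of any library.

```csharp
using System;
using System.IO;

public static class GzipProbe
{
    // Returns true if the stream starts with the gzip magic bytes 0x1F 0x8B.
    // Consumes the first two bytes of the stream.
    public static bool LooksLikeGzip(Stream stream)
    {
        int first = stream.ReadByte();
        int second = stream.ReadByte();
        return first == 0x1F && second == 0x8B;
    }

    // Convenience overload (hypothetical helper) that opens a file and checks its header.
    public static bool LooksLikeGzip(string path)
    {
        using (FileStream input = File.OpenRead(path))
        {
            return LooksLikeGzip(input);
        }
    }
}
```

If GzipProbe.LooksLikeGzip(path) returns false, the file is probably not gzip data at all, and a different extractor is worth trying. A true result only confirms the header, not that the rest of the file is valid.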
rudolf_franek
UUHHIVS
1

I ended up using a gzip executable to do the decompression instead of a GZipStream. GZipStream can't handle the file for some reason, but the executable can.

Edgar
  • Could you post your final version please. I would like to see exactly how you use gzip executable. – ManInMoon Nov 19 '14 at 11:07
  • Sorry, I don't have access to the code anymore. It used the Process class to call the gzip executable. This might help: http://www.dotnetperls.com/7-zip – Edgar Nov 20 '14 at 07:42
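
Since the original code is no longer available, here is a rough sketch of what calling a gzip executable through the Process class could look like. It is not the author's code. It assumes a gzip binary that accepts the standard -d (decompress) and -c (write to stdout) options; the class name GzipProcess, its methods, and the gzipExePath parameter are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class GzipProcess
{
    // Builds the start info for: gzip -d -c "<inputPath>"
    // The decompressed bytes are written to the process's standard output.
    public static ProcessStartInfo BuildStartInfo(string gzipExePath, string inputPath)
    {
        return new ProcessStartInfo
        {
            FileName = gzipExePath,
            Arguments = "-d -c \"" + inputPath + "\"",
            UseShellExecute = false,        // required for stream redirection
            RedirectStandardOutput = true,
            CreateNoWindow = true
        };
    }

    // Runs gzip and copies its standard output into outputPath.
    public static void Decompress(string gzipExePath, string inputPath, string outputPath)
    {
        using (Process process = Process.Start(BuildStartInfo(gzipExePath, inputPath)))
        using (FileStream output = File.Create(outputPath))
        {
            // Copy the raw bytes; going through a StreamReader would corrupt binary data.
            process.StandardOutput.BaseStream.CopyTo(output);
            process.WaitForExit();
            if (process.ExitCode != 0)
            {
                throw new InvalidOperationException("gzip exited with code " + process.ExitCode);
            }
        }
    }
}
```

A call might look like GzipProcess.Decompress("gzip", "input.gzip", "output.dat"), assuming gzip is on the PATH.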
0

Are you calling Close or Flush on fileOutput? (Or just wrap it in a using, which is recommended practice.) If you don't, the file might not be flushed to disk when your program ends.
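
As a sketch of the using approach, the loop from the question could be wrapped as follows so that all three streams are flushed and closed even if an exception is thrown. The class name GzipFileDecompressor and the method Decompress are hypothetical; the file modes are taken from the question.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipFileDecompressor
{
    // Decompresses inputPath into outputPath; every stream is disposed when the block exits.
    public static void Decompress(string inputPath, string outputPath)
    {
        using (FileStream fileInput = new FileStream(inputPath, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress))
        using (FileStream fileOutput = new FileStream(outputPath, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            byte[] buffer = new byte[4096];
            int count;

            // Read returns 0 once the end of the decompressed data is reached.
            while ((count = gzipStream.Read(buffer, 0, buffer.Length)) > 0)
            {
                fileOutput.Write(buffer, 0, count);
            }
        }
    }
}
```

This only guarantees that whatever was read reaches the disk; it does not change how much data GZipStream itself is able to read from the input.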

Ruben
  • All 3 streams are closed after the reading is done. The problem is not that the data isn't written correctly to the output file, but that Read() doesn't read the whole input file. Could something be interrupting it? It reads exactly the same number of bytes every time before it stops, which is curious. – Edgar Jun 18 '10 at 09:50
  • After the first read it's 4096, the second read is 532, and then it stops. – Edgar Jun 18 '10 at 10:14
  • Try not leaving the stream open in the ctor. – David Neale Jun 18 '10 at 10:47