I'm trying to compress a file, "sample.doc", into the .gz format. My code removes the original extension, so the output is named "sample.gz" rather than "sample.doc.gz". However, when the file is extracted, it has also lost its ".doc" extension, e.g. the extracted file is just named "sample". Any ideas?

using System;
using System.IO;
using System.IO.Compression;

namespace gzipexample
{
    class Program
    {
        public static void Main()
        {
            CompressFile(@"C:\sample.doc");
        }

        // Compresses the file into a .gz file, replacing the original
        // extension (C:\sample.doc becomes C:\sample.gz).
        public static void CompressFile(string path)
        {
            // ChangeExtension only touches the final extension of the file name,
            // so a ".doc" elsewhere in the path is left alone.
            string compressedPath = Path.ChangeExtension(path, ".gz");
            Console.WriteLine("Compressing {0} to {1}.", path, compressedPath);

            // The using blocks close all three streams, even if an exception is thrown.
            using (FileStream sourceFile = File.OpenRead(path))
            using (FileStream destinationFile = File.Create(compressedPath))
            using (GZipStream output = new GZipStream(destinationFile, CompressionMode.Compress))
            {
                // CopyTo (.NET 4 and later) streams the data in chunks instead of
                // relying on a single Read call to fill one large buffer.
                sourceFile.CopyTo(output);
            }
        }
    }
}

karlstackoverflow

1 Answer

If I'm understanding your question correctly, there is no solution as stated. A gzip'ed file (at least, a file gzip'ed the way you're doing it, since GZipStream does not write the optional original-file-name field of the gzip header) doesn't store its name, so if you compress a file named sample.doc and the output is named sample.gz, the ".doc" part is gone forever. That's why, if you compress a file with the gzip command-line utility, it names the compressed version sample.doc.gz.
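
Here is a minimal sketch of that naming convention, assuming .NET 4 or later for Stream.CopyTo. The class and method names (GzipNaming, Compress, Decompress) are made up for illustration and are not part of the question's code. Because ".gz" is appended rather than substituted, dropping it again gives back the original name, extension included.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, named for illustration only.
static class GzipNaming
{
    // Compresses "sample.doc" to "sample.doc.gz" and returns the new path.
    public static string Compress(string path)
    {
        string compressedPath = path + ".gz";
        using (FileStream source = File.OpenRead(path))
        using (FileStream destination = File.Create(compressedPath))
        using (GZipStream gzip = new GZipStream(destination, CompressionMode.Compress))
        {
            source.CopyTo(gzip);
        }
        return compressedPath;
    }

    // Decompresses "sample.doc.gz" back to "sample.doc" and returns that path.
    public static string Decompress(string compressedPath)
    {
        if (!compressedPath.EndsWith(".gz", StringComparison.OrdinalIgnoreCase))
            throw new ArgumentException("Expected a path ending in .gz", "compressedPath");

        // Dropping the trailing ".gz" restores the original name and extension.
        string originalPath = compressedPath.Substring(0, compressedPath.Length - 3);
        using (FileStream source = File.OpenRead(compressedPath))
        using (GZipStream gzip = new GZipStream(source, CompressionMode.Decompress))
        using (FileStream destination = File.Create(originalPath))
        {
            gzip.CopyTo(destination);
        }
        return originalPath;
    }
}
```

Note that Decompress overwrites any existing file at the restored path, and that this only helps if whoever extracts the file also strips ".gz" rather than replacing the whole name.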

In some constrained situations, you might be able to guess an appropriate extension by looking at the contents of the file, but that isn't very reliable. If you just need compression, and the file format isn't constrained, you could just build a .zip file instead, which does store filenames.
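
For the .zip route, here is a rough sketch using ZipArchive from System.IO.Compression. This assumes .NET 4.5 or later (and a reference to the System.IO.Compression assembly), which may not match the framework version you are targeting. ZipExample and CompressToZip are hypothetical names chosen for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, named for illustration only.
static class ZipExample
{
    // Writes "sample.doc" into "sample.zip" and returns the archive path.
    // The entry inside the archive keeps the name "sample.doc".
    public static string CompressToZip(string path)
    {
        string zipPath = Path.ChangeExtension(path, ".zip");
        using (FileStream zipFile = File.Create(zipPath))
        using (ZipArchive archive = new ZipArchive(zipFile, ZipArchiveMode.Create))
        {
            // The entry name is stored in the archive, extension included.
            ZipArchiveEntry entry = archive.CreateEntry(Path.GetFileName(path));
            using (Stream entryStream = entry.Open())
            using (FileStream source = File.OpenRead(path))
            {
                source.CopyTo(entryStream);
            }
        }
        return zipPath;
    }
}
```

Here the archive itself can be called sample.zip, because the name that matters on extraction is the one stored on the entry.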

servn
  • Is there an alternative method I could use that would store the file names? – karlstackoverflow May 03 '11 at 05:24
  • Assuming you control both the compression and decompression tools, there are a variety of solutions: for example, you could write the name of the file to the beginning of the file, and extract it out in your decompression tool. Or you could write out a .zip file. – servn May 03 '11 at 05:39
  • Any chance you could show me an example of the .zip method? I tried to change to .gz and it still doesn't work. I won't be decompressing the files for this either, so I won't be able to do it that way. – karlstackoverflow May 03 '11 at 05:41
  • What exactly are you using to decompress the files, if it's not a tool you control? – servn May 03 '11 at 05:50
  • Oh... well, http://dotnetzip.codeplex.com/ is one library that will write .zip files. – servn May 03 '11 at 05:57
  • Unfortunately I can only use gzip, as it's packaged with VS :( – karlstackoverflow May 03 '11 at 05:58
  • See http://stackoverflow.com/questions/940582/how-do-i-zip-a-file-in-c-using-no-3rd-party-apis. I don't have any other suggestions. – servn May 03 '11 at 06:19
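
The suggestion in the comments above, writing the file name to the beginning of the file and reading it back in your own decompression tool, could look roughly like this. It is a sketch under two assumptions: .NET 4.5 or later (for the leaveOpen constructors of BinaryWriter and BinaryReader), and that you control both ends, because the output is no longer a plain .gz file that standard tools can open. NamedGzip and its methods are hypothetical names.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, named for illustration only.
static class NamedGzip
{
    // Writes the original file name, then the gzip-compressed contents.
    public static void Compress(string path, string packedPath)
    {
        using (FileStream destination = File.Create(packedPath))
        {
            // Length-prefixed file name first, left uncompressed.
            using (BinaryWriter header = new BinaryWriter(destination, Encoding.UTF8, true))
            {
                header.Write(Path.GetFileName(path));
            }
            using (FileStream source = File.OpenRead(path))
            using (GZipStream gzip = new GZipStream(destination, CompressionMode.Compress, true))
            {
                source.CopyTo(gzip);
            }
        }
    }

    // Reads the stored name back, then decompresses into outputDirectory.
    public static string Decompress(string packedPath, string outputDirectory)
    {
        using (FileStream source = File.OpenRead(packedPath))
        {
            string name;
            using (BinaryReader header = new BinaryReader(source, Encoding.UTF8, true))
            {
                name = header.ReadString();
            }
            // GetFileName guards against a stored name that contains directory parts.
            string outputPath = Path.Combine(outputDirectory, Path.GetFileName(name));
            using (GZipStream gzip = new GZipStream(source, CompressionMode.Decompress, true))
            using (FileStream destination = File.Create(outputPath))
            {
                gzip.CopyTo(destination);
            }
            return outputPath;
        }
    }
}
```

With this layout the packed file can have any name at all, since the original name travels inside it.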