LZW Data Compression

Question

I'm looking for an LZW compression algorithm in C# that can compress and decompress Word documents. I've searched Google, but it didn't give me the answer I need. Can anyone help me with the code, and help me understand how to actually implement LZW in my project?

Possible [duplicate](http://stackoverflow.com/questions/6710014/lzw-compression-on-c-sharp-from-string) — Emmanuel N, Jan 03 '12 at 14:42
I'd ask the same - what specific reason is there to use LZW? Are you trying to compress/decompress while interacting with another system that uses LZW? If so then the implementations posted by users may not be compatible with your external system. Please clarify the question. Best regards, — Dr. Andrew Burnett-Thompson, Jan 03 '12 at 14:47
@Dr.AndrewBurnett-Thompson I created a file transfer project that compresses the file before transmission and decompresses it once the transmission is complete. Both computers have an LZW compressor and decompressor. — Eric, Jan 03 '12 at 14:59
OK, so if you're using the same implementation on both sides you could use anything really: GZip as Jon Skeet mentions, Sharp Zip Lib, or ultra-fast LZ-family compressors such as http://www.quicklz.com/ or http://www.fastlz.org/ — Dr. Andrew Burnett-Thompson, Jan 03 '12 at 15:02
@JonSkeet I just want to explore and gain a better understanding of LZW. — Eric, Jan 03 '12 at 15:05
@Eric: For purely educational purposes, where understanding LZW is the ultimate goal, I'd try to implement it myself. For real purposes, where compression with simple code is the ultimate goal, I'd use the built-in Gzip implementation. — Jon Skeet, Jan 03 '12 at 15:18
LZW is actually good for streaming and piece-wise extraction (i.e. given only an offset into the compressed data, assuming the dictionary up to that point is known) of short strings in a compressed array. .NET's GZip (until 4.5.something) had bad dictionary generation for short data, and even after that it does not allow more intricate work with common / "pre-shared" dictionaries efficiently. This, of course, assumes a 'transparent' LZW implementation in a specific case. — user2864740, Nov 06 '18 at 03:36
@user2864740 - can you explain exactly what you expect beyond the existing answers for the bounty? — Simon Mourier, Nov 06 '18 at 11:27
@SimonMourier Preferably answers related to "For purely educational purposes, where understanding LZW is the ultimate goal, I'd try to implement it myself." which contain *good* information and code - while this was *not* the original question asked (and it was *not* posed by the author). There is currently one answer herein (usable as a baseline), that appears to contain relevant information to the clarification added in the comment above. SO seems pretty sparse on this aspect of LZW .. exploration. Maybe I should just have asked another question? (And I'd like to award bounty anyway.) — user2864740, Nov 06 '18 at 17:55
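
For the "real purposes" route mentioned in the comments above, a minimal sketch using the built-in `System.IO.Compression.GZipStream` might look like the following. Note that GZip is based on DEFLATE, not LZW, so this is an alternative rather than an LZW implementation. The class name `GzipHelper` and its method names are made up for illustration.

```csharp
using System.IO;
using System.IO.Compression;

// Hypothetical helper (not from the thread): GZip round trip on byte arrays.
public static class GzipHelper
{
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Dispose the GZipStream before reading the buffer so that
            // all pending data and the GZip trailer are written out.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // ToArray still works after the MemoryStream has been closed.
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

For a file transfer scenario like the one described, the sender would call `GzipHelper.Compress` on the file bytes and the receiver would call `GzipHelper.Decompress` on what arrives.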

score 3 · Answer 1 · answered Jan 28 '16 at 08:21

Here is the implementation of LZW which I used in my project:

using System;
using System.Collections.Generic;
using System.Text;

namespace LZW
{
    public class Program
    {
        public static void Main(string[] args)
        {
            List<int> compressed = Compress("string to be compressed");
            Console.WriteLine(string.Join(", ", compressed));
            string decompressed = Decompress(compressed);
            Console.WriteLine(decompressed);
        }

        public static List<int> Compress(string uncompressed)
        {
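            // build the dictionary: one entry per character code 0-255
            // (characters above 255 are not in the initial dictionary and are not supported)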
            Dictionary<string, int> dictionary = new Dictionary<string, int>();
            for (int i = 0; i < 256; i++)
                dictionary.Add(((char)i).ToString(), i);

            string w = string.Empty;
            List<int> compressed = new List<int>();

            foreach (char c in uncompressed)
            {
                string wc = w + c;
                if (dictionary.ContainsKey(wc))
                {
                    w = wc;
                }
                else
                {
                    // write w to output
                    compressed.Add(dictionary[w]);
                    // wc is a new sequence; add it to the dictionary
                    dictionary.Add(wc, dictionary.Count);
                    w = c.ToString();
                }
            }

            // write remaining output if necessary
            if (!string.IsNullOrEmpty(w))
                compressed.Add(dictionary[w]);

            return compressed;
        }

        public static string Decompress(List<int> compressed)
        {
            // build the dictionary
            Dictionary<int, string> dictionary = new Dictionary<int, string>();
            for (int i = 0; i < 256; i++)
                dictionary.Add(i, ((char)i).ToString());

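            // nothing to decode
            if (compressed.Count == 0)
                return string.Empty;

            string w = dictionary[compressed[0]];
            StringBuilder decompressed = new StringBuilder(w);

            // start at index 1 so the caller's list is not modified
            for (int i = 1; i < compressed.Count; i++)
            {
                int k = compressed[i];
                string entry;
                if (dictionary.ContainsKey(k))
                    entry = dictionary[k];
                else if (k == dictionary.Count)
                    // special case: the code refers to the entry that is about to be added
                    entry = w + w[0];
                else
                    throw new ArgumentException("Invalid LZW code: " + k);

                decompressed.Append(entry);

                // new sequence; add it to the dictionary
                dictionary.Add(dictionary.Count, w + entry[0]);

                w = entry;
            }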

            return decompressed.ToString();
        }
    }
}

Why would you use a list instead of a byte array, though? Also, uh, it's not just for strings... — Nyerguds, Jul 11 '17 at 11:59
It's not just for strings? What exactly are you trying to achieve? You can modify it as per your requirements; you can also modify it to use a byte array instead of int. This is just a sample. Remember, Stack Overflow is a reference and not a coding service :) — coder3521, Jul 12 '17 at 03:44
LZW is also used for non-text data compression. But it's really hard to find examples of that :( As for what I'm trying to achieve, I'm working on modding tools for old DOS games that use LZW in the storage of binary files. — Nyerguds, Jul 13 '17 at 10:11
Without feeding byte->byte(s), eg. as a stream, it feels wasteful. — user2864740, Nov 06 '18 at 07:30
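
Following up on the comments above about binary data: a minimal sketch of the same algorithm operating on bytes instead of characters could look like this. It is an illustration, not the code from the answer; the class name `ByteLzw` and the choice to key the table on a (prefix code, next byte) pair are assumptions made here. The dictionary grows without limit and the output is still a list of integer codes, so packing the codes into a byte stream is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical byte-oriented LZW sketch (unbounded dictionary, int codes).
public static class ByteLzw
{
    public static List<int> Compress(byte[] input)
    {
        // Maps (prefix code << 8 | next byte) to the code of the longer sequence.
        // Codes 0-255 implicitly stand for the single bytes themselves.
        var table = new Dictionary<long, int>();
        var output = new List<int>();
        int nextCode = 256;
        int current = -1; // code of the sequence matched so far; -1 means none yet

        foreach (byte b in input)
        {
            if (current < 0) { current = b; continue; }

            long key = ((long)current << 8) | b;
            if (table.TryGetValue(key, out int code))
            {
                current = code; // known sequence: keep extending the match
            }
            else
            {
                output.Add(current);     // emit the longest known match
                table[key] = nextCode++; // remember the new, longer sequence
                current = b;
            }
        }

        if (current >= 0) output.Add(current);
        return output;
    }

    public static byte[] Decompress(IReadOnlyList<int> codes)
    {
        var result = new List<byte>();
        if (codes.Count == 0) return result.ToArray();
        if (codes[0] < 0 || codes[0] > 255)
            throw new ArgumentException("Invalid first LZW code: " + codes[0]);

        var entries = new List<byte[]>(); // entries[i] is the sequence for code 256 + i
        byte[] previous = { (byte)codes[0] };
        result.AddRange(previous);

        for (int i = 1; i < codes.Count; i++)
        {
            int code = codes[i];
            int nextCode = 256 + entries.Count;
            byte[] entry;
            if (code >= 0 && code < 256) entry = new[] { (byte)code };
            else if (code >= 256 && code < nextCode) entry = entries[code - 256];
            else if (code == nextCode) entry = Append(previous, previous[0]); // entry not yet in table
            else throw new ArgumentException("Invalid LZW code: " + code);

            result.AddRange(entry);
            entries.Add(Append(previous, entry[0]));
            previous = entry;
        }
        return result.ToArray();
    }

    private static byte[] Append(byte[] prefix, byte b)
    {
        var combined = new byte[prefix.Length + 1];
        Array.Copy(prefix, combined, prefix.Length);
        combined[prefix.Length] = b;
        return combined;
    }
}
```

Because the input is a plain `byte[]`, the result of `File.ReadAllBytes` can be passed in directly, whatever the file type.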

score 3 · Answer 2 · Nyerguds · 2018-11-06T15:33:29.663

For anyone stumbling on this... I found a C# implementation on GitHub that exactly follows the algorithm described in the article on Mark Nelson's website:

https://github.com/pevillarreal/LzwCompressor

Personally, I further adapted the code to use MemoryStream instead of FileStream because I needed to convert byte arrays, not saved files, but that change is pretty trivial.
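
As a rough illustration of that kind of adaptation: if a compressor is written against `Stream` rather than `FileStream`, a small wrapper can feed it byte arrays through `MemoryStream`. The helper below is hypothetical; `StreamAdapter.Run` and the delegate shape are assumptions for this sketch and are not the API of the linked project.

```csharp
using System;
using System.IO;

// Hypothetical helper: runs any stream-to-stream routine on in-memory data.
public static class StreamAdapter
{
    public static byte[] Run(byte[] input, Action<Stream, Stream> process)
    {
        using (var source = new MemoryStream(input))
        using (var target = new MemoryStream())
        {
            // The routine reads from source and writes its result to target.
            process(source, target);
            return target.ToArray();
        }
    }
}
```

A call would then look something like `StreamAdapter.Run(rawBytes, (src, dst) => compressor.Compress(src, dst))`, where `compressor.Compress` stands in for whatever stream-based method the chosen implementation actually exposes.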

edited Nov 06 '18 at 15:33

answered Jan 11 '18 at 11:38

Nyerguds

The "An efficient LZW implementation" link provided is a must-read, as implemented in the GitHub project. It provides a critical implementation detail that makes the modified LZW ideal for shared dictionary cases (when using a String -> Coded mapping would take up too much space relative to the compressed data). This can be adapted to make a 'seekable' LZW implementation that is indexable. – user2864740 Nov 06 '18 at 05:28
Hmm, I mean this link ["An efficient LZW implementation"](http://warp.povusers.org/EfficientLZW/index.html). I thought it was from this answer, but.. not? – user2864740 Nov 06 '18 at 07:18
I actually wanted to link to the "LZW Data Compression" article (from Oct 1, 1989!) on Mark Nelson's website, but it seems I linked to the index. Fixed now. – Nyerguds Nov 06 '18 at 15:34
Well, I found that a bunch of old games, like the Westwood Studios ones, use this algorithm exactly :) – Nyerguds Nov 07 '18 at 10:24
They should be stream-compatible, barring other tricks like dynamic symbol sizes. The "efficient" is mostly about implementation [i.e. constant size bounds], not overall compression rates. Anyway, Dune/Dune II? I find that the true successor of the modern RTS, but most people think SC or WC.. or even C&C :) – user2864740 Nov 07 '18 at 16:22
Oh, my bad.. Dune II only, not the 'prequel' :) – user2864740 Nov 07 '18 at 16:27
Pre-Dune II actually; their earlier BattleTech games, adventure games and RPGs. By the time of Dune II they stopped using LZW and switched to [their own algorithm](http://www.shikadi.net/moddingwiki/Westwood_LCW), of which the compressed data actually looks more like RLE. [Here's](http://www.shikadi.net/moddingwiki/Westwood_CPS_Format#Compression_types) the list of everything they used. – Nyerguds Nov 08 '18 at 13:16

score 3 · Answer 3 · answered Jan 03 '12 at 14:39

There is an implementation here.

LZW does not care what kind of file it is working with. Every file is treated as a blob of bytes.

Drew Dormann

That's a bizarre and pretty awful implementation... it outputs the values as binary strings. – Nyerguds Jul 29 '17 at 14:06

score 2 · Answer 4 · answered Jan 03 '12 at 14:42

A C# implementation of LZW: http://code.google.com/p/sharp-lzw/

voidengine

Go to "Source" and there either browse the code or follow the instructions to check out the project – voidengine Jan 03 '12 at 15:19
