I have been reading about dictionary-based compression algorithms, including LZW and LZSS, and decided to implement LZW in Java. I am not a developer, so I suspect my implementation may not be efficient. Can you look at the code and tell me what is wrong or inefficient in it? Here is the full code.

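import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.InputStreamReader;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LZW {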

    public HashMap compdic, decompdic;
    String fileName = "walaloo.txt";
    short lastcode = 0, dlastcode = 0;

    LZW() {
        compdic = new HashMap<String, Integer>();
        decompdic = new HashMap<Integer, String>();
        createDictionary();
    }

    public void createDictionary() {
        try {
            short code;
            char ch;
            FileInputStream fis = new FileInputStream(fileName);
            InputStreamReader rdr = new InputStreamReader(fis, "utf-8");
            while ((code = (short) rdr.read()) != -1) {
                ch = (char) code;

                // Keys are Strings, so look up the String form of the character
                if (!compdic.containsKey("" + ch)) {
                    compdic.put("" + ch, code);
                    decompdic.put(code, "" + ch);
                    if (code > lastcode) {
                        lastcode = code;
                        dlastcode = code;
                    }
                }
            }
            fis.close();
        } catch (Exception ex) {
            Logger.getLogger(LZW.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    public void compressFile() {
        try {
            short code, codeword;
            char c;
            String s;

            System.out.print("Compressing...");
            FileInputStream fis = new FileInputStream(fileName);
            InputStreamReader rdr = new InputStreamReader(fis, "utf-8");
            FileOutputStream fos = new FileOutputStream(fileName + "1.lzw");
            ObjectOutputStream fout = new ObjectOutputStream(fos);

            s = (char) rdr.read() + "";
            while ((code = (short) rdr.read()) != -1) {
                c = (char) code;

                if (!compdic.containsKey(s + c)) {
                    codeword = Short.parseShort(compdic.get(s).toString());

                    fout.writeShort(codeword);
                    compdic.put(s + c, ++lastcode);
                    s = "" + c;
                } else {
                    s = s + c;
                }
            }

            codeword = Short.parseShort(compdic.get(s).toString());
            fout.writeShort(codeword);
            fout.writeShort(00);

            fout.close();
            fis.close();

            System.out.print("done");

        } catch (Exception ex) {
            Logger.getLogger(LZW.class.getName()).log(Level.SEVERE, null, ex);
        }
    }

    public void decompressFile() {
        short priorcode = -1, codeword = -1;
        char c;

        String priorstr, str;
        FileInputStream fis; 
        FileWriter fos; 
        ObjectInputStream fin;

        try {
            fis = new FileInputStream(fileName + "1.lzw");
            fos = new FileWriter(fileName + "2.txt");
            fin = new ObjectInputStream(fis);

            System.out.print("\nDecompressing...");
            priorcode = fin.readShort();
            fos.write(decompdic.get(priorcode).toString());
            while ((codeword = fin.readShort()) != -1) {
                if(codeword == 00)
                    break;

                priorstr = decompdic.get(priorcode).toString();

                if (decompdic.containsKey(codeword)) {
                    str = decompdic.get(codeword).toString();
                    fos.write(str);
                    decompdic.put(++dlastcode, priorstr + str.charAt(0));
                } else {
                    decompdic.put(++dlastcode, priorstr + priorstr.charAt(0));
                    fos.write(priorstr + priorstr.charAt(0));
                }

                priorcode = codeword;
            }

            fos.close();
            fis.close();
            System.out.print("done\n");

        } catch (Exception ex) {
            //Logger.getLogger(LZW.class.getName()).log(Level.SEVERE, null, ex);
            System.out.println("\n\nError: " + ex.getMessage());
            System.out.print(codeword + " " + priorcode + " " + decompdic.get(133) + " " + dlastcode);
        }
    }

    public static void main(String args[]) {
        LZW lzw = new LZW();
        lzw.compressFile();
        lzw.decompressFile();
    }
}
birraa
  • Is this code already working, and are you only looking for improvements? Then http://codereview.stackexchange.com/ might be a better place to ask. If there are any (known) problems in the code, you should state them so that we have an idea what your problems are. – Matthias Wimmer Sep 26 '15 at 08:46
  • This looks like it would be a good fit for Code Review indeed, if the code works as intended, which it sounds like it does. – Phrancis Sep 26 '15 at 08:57
  • The code worked fine when I tested it; I didn't find any problems. My main concern is that I used `short` for the codes when creating the compression dictionary. That means text in non-Latin alphabets whose Unicode code points are greater than 32767 can't be handled by this. I haven't tested that yet. – birraa Sep 26 '15 at 10:07
  • `works fine when I tested` How did you test? Suggestion: 1) compress 2) rename the input file 3) try to decompress *using a new instance*, if not a new process. Expected outcome: havoc. Your decompression seems to need the plain text even to set up `decompdic`. – greybeard Mar 24 '19 at 22:50

1 Answer

Your codes are `short` values, so the dictionary can hold at most 32,767 entries. You neither limit its size nor check it. That works fine for small files, but for larger files `++lastcode` eventually wraps past `Short.MAX_VALUE` and data is lost.
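
One way to keep the dictionary within bounds is to stop adding entries once it reaches a fixed cap, and to apply the same rule in the decompressor so both sides stay in sync. The sketch below is only an illustration under a few assumptions that are not in the question: it works on bytes with a fixed initial dictionary of 256 single-byte entries instead of scanning the input file, it returns codes as a list rather than writing them to a stream, and the names `BoundedLzw`, `SHORT_SAFE_CAP`, and `maxDictSize` are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BoundedLzw {
    // Hypothetical cap: codes 0..32767 all fit in a non-negative short.
    static final int SHORT_SAFE_CAP = 1 << 15;

    static List<Integer> compress(byte[] input, int maxDictSize) {
        if (maxDictSize < 256) throw new IllegalArgumentException("cap must be >= 256");
        // Fixed initial dictionary: one entry per possible byte value.
        Map<String, Integer> dict = new HashMap<>();
        for (int i = 0; i < 256; i++) dict.put(String.valueOf((char) i), i);

        List<Integer> out = new ArrayList<>();
        String s = "";
        for (byte b : input) {
            char c = (char) (b & 0xFF);
            String sc = s + c;
            if (dict.containsKey(sc)) {
                s = sc;
            } else {
                out.add(dict.get(s));
                // Only grow the dictionary while it is below the cap.
                if (dict.size() < maxDictSize) dict.put(sc, dict.size());
                s = String.valueOf(c);
            }
        }
        if (!s.isEmpty()) out.add(dict.get(s));
        return out;
    }

    static byte[] decompress(List<Integer> codes, int maxDictSize) {
        if (codes.isEmpty()) return new byte[0];
        // Same fixed initial dictionary; the list index is the code.
        List<String> dict = new ArrayList<>();
        for (int i = 0; i < 256; i++) dict.add(String.valueOf((char) i));

        StringBuilder out = new StringBuilder();
        String prior = dict.get(codes.get(0));
        out.append(prior);
        for (int k = 1; k < codes.size(); k++) {
            int code = codes.get(k);
            String entry;
            if (code < dict.size()) {
                entry = dict.get(code);
            } else if (code == dict.size()) {
                // The code refers to the entry that is about to be created.
                entry = prior + prior.charAt(0);
            } else {
                throw new IllegalArgumentException("invalid code: " + code);
            }
            out.append(entry);
            // Mirror the compressor: stop adding entries at the cap.
            if (dict.size() < maxDictSize) dict.add(prior + entry.charAt(0));
            prior = entry;
        }
        // Each char holds one byte value (0..255), so ISO-8859-1 maps it back.
        return out.toString().getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

Calling `BoundedLzw.compress(data, BoundedLzw.SHORT_SAFE_CAP)` yields codes that are safe to write with `writeShort`, and `decompress` with the same cap restores the original bytes. Freezing the dictionary is the simplest policy; resetting it when it fills up is another common choice, at the price of a little more bookkeeping.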