I am trying to read the contents of a file and then write them to a different file using Huffman coding. I have built a Huffman tree in which each node contains the character, the frequency of the character, and a binary string representing its code. What I am having trouble understanding is how to write the Huffman-coded characters to a file.

I tried writing the binary string to the file but realized that it was just writing the string as text and not the actual coded data. I then converted the binary strings to bytes and wrote the bytes to the file, but that just gave me a blank file the same size as the original. I feel like I am missing something when it comes to actually writing the file.

Edit: After taking another look at my code I realized that my tree wasn't completely correct. I'm now able to (I think) combine the bit strings into a byte array that I can write to a file (code updated to reflect that). For my test case I am reading in the text AAA_BB_C, but when I look at the file the output is <0x1e>. I'm not sure what this means. I was expecting the same output as the original file, just at a smaller size.

public static void writeFile(HuffTree tree) {
    String bin = ""; // String of entire binary code
    int spot = 0; // Spot in array
    byte[] bytes = new byte[256]; // byte array
    try {
        FileWriter writer = new FileWriter("test(encoded).txt");

            // Gets Binary String of each Character in the file
            for(int i = 0; i < fileText.length(); i++) {
                bin += tree.findDataBinary(fileText.charAt(i));
            }

            // Takes each bit and adds to byte array 
            System.out.println(bin);
            while(bin.length() > 7) {
                String temp = bin.substring(0, 7);
                bin = bin.substring(7, bin.length());
                bytes[spot] = Byte.parseByte(temp, 2);
                spot++;
            }

            // Writes bytes to file
            for(int i = 0; i <= spot; i++) {
                writer.write(bytes[i]);
            }
            writer.close();
    } catch(IOException e) {
        System.out.println("IOException!");
    }
}
billybob643
  • What is `fileText`? --- What is the point of `map`? --- `byte` is a *primitive* and doesn't have any methods, so what did you expect `tempByte.array()` to do? Create a 1-byte array? --- Your code seems to just map each character to a 1-byte value, so there is not much data *compression* going on. It seems you missed the entire point of Huffman. Are you sure you understand how Huffman works? It works with bits, so you need to do bit manipulation. – Andreas Jun 10 '19 at 21:55
  • `fileText` is the text input from the file, which I saved when reading it. I am having trouble conceptually understanding where the encoding actually happens. Once I have the correct binary values associated with each character, I'm unsure what to do. The map was my attempt to pair each character with its binary value (in byte form) and then write the bytes into the file using the map. – billybob643 Jun 10 '19 at 22:01
  • What is the return value of `findDataBinary()`? Presumably it's a `String`, since that's what `parseByte()` requires, but is it a bit-string like `"10011"`? If so, then calling `parseByte()` to turn it into the *decimal* number 10011 seems meaningless. Wouldn't you at the very least want to parse it as binary, i.e. base 2? --- And let's say the first letter maps to `111` and the second letter maps to `10011`. Combined they map to `11110011`, which is 8 bits and hence a single byte, compressing 2 letters into 1 byte, but you do no merging of bits. Step back and try again. – Andreas Jun 10 '19 at 22:19
  • Yes `findDataBinary()` returns a bit-string of a character. My byte array should also work as well now. But I'm still stuck. – billybob643 Jun 11 '19 at 19:49
  • *"I was expecting the same output of the original file, just a smaller size"* Why on earth would you expect a text file compressed into a binary file to be the "same output"? – Andreas Jun 11 '19 at 20:16
  • Your updated code isn't writing the last byte. --- Also, how would you decompress the data? Isn't the tree dynamically built from the `fileText`? Without the `fileText` you don't have a tree, so you wouldn't understand the bytes in the file. – Andreas Jun 11 '19 at 20:16
  • I fixed the code so that it writes all the bytes. I had a misunderstanding of what the output of the file was supposed to look like (pretty sure I have the correct output now: `0f2f 00`). I only need to compress the data, but if I wanted to decompress the data, what would I need to include? `fileText` is the string of the text in the original file. – billybob643 Jun 12 '19 at 02:23
  • You'd of course need the tree data as part of the file. – Andreas Jun 12 '19 at 02:42
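
The bit merging described in the comments above (`111` followed by `10011` becoming the single byte `11110011`) can be sketched roughly as follows. This is only an illustration, not the asker's code: the class `BitPacker` and its methods `pack` and `writePacked` are hypothetical names, and the sketch assumes the whole encoded message is already available as one string of `'0'` and `'1'` characters. It writes through a `FileOutputStream` rather than a `FileWriter`, because a `Writer` emits characters in some text encoding rather than raw bytes.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical helper, not part of the original question's code.
final class BitPacker {

    // Packs a string of '0'/'1' characters into bytes, 8 bits per byte,
    // most significant bit first. If the length is not a multiple of 8,
    // the unused low bits of the final byte stay zero (padding).
    static byte[] pack(String bits) {
        byte[] out = new byte[(bits.length() + 7) / 8];
        for (int i = 0; i < bits.length(); i++) {
            if (bits.charAt(i) == '1') {
                out[i / 8] |= (byte) (0x80 >> (i % 8));
            }
        }
        return out;
    }

    // Writes the packed bytes as raw binary data.
    static void writePacked(String bits, String path) throws IOException {
        try (OutputStream os = new FileOutputStream(path)) {
            os.write(pack(bits));
        }
    }

    public static void main(String[] args) {
        // The two example codes from the comment, concatenated.
        byte[] packed = pack("111" + "10011");
        System.out.println(String.format("%02x", packed[0] & 0xFF)); // prints "f3"
    }
}
```

Because the final byte is padded with zero bits, a decoder cannot tell padding from real code bits unless the number of valid bits (or the number of encoded characters) is stored somewhere as well.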

1 Answer

Here is the code for Huffman.java:

https://algs4.cs.princeton.edu/55compression/Huffman.java.html

This question is similar to "How to write to a file in Java after Huffman Coding is done".
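
As noted in the comments on the question, a decoder needs the tree, or enough information to rebuild it, stored in the file ahead of the encoded bits. Below is a minimal sketch of one possible layout, not the format used by the linked Huffman.java. It assumes a frequency table is stored as the header and that the decoder rebuilds the tree from it with the same deterministic tie-breaking the encoder used. The class `EncodedFileLayout` and the method `build` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical layout: [symbol count][symbol, frequency]...[bit count][packed bits]
final class EncodedFileLayout {

    static byte[] build(Map<Character, Integer> frequencies, int bitCount, byte[] packedBits) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            // Header: number of distinct symbols, then each symbol with its frequency.
            // A TreeMap copy gives a stable (sorted) order on disk.
            out.writeInt(frequencies.size());
            for (Map.Entry<Character, Integer> entry : new TreeMap<>(frequencies).entrySet()) {
                out.writeChar(entry.getKey());
                out.writeInt(entry.getValue());
            }
            // Number of meaningful bits, so padding in the last byte can be ignored.
            out.writeInt(bitCount);
            // Body: the Huffman codes, already packed 8 bits per byte.
            out.write(packedBits);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked for simplicity.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

The resulting array could then be written with `Files.write(Paths.get(...), bytes)` or a `FileOutputStream`. For short inputs such as AAA_BB_C the header is larger than the encoded data, so the output file can end up bigger than the original. The savings only show up on longer texts.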

caot