0

I'm trying to decode a Huffman tree of the form:

001A1C01E01B1D

I'm using an implementation found here: Efficient way of storing Huffman tree to encode the tree in the form above and decode it as well.

This is my implementation of it:

HuffmanNode* HuffmanTree::decodeTree(string tree, int idx) {
    cout << idx << endl;
    if (tree[idx] == '1') {
        idx += 2;
        return new HuffmanNode(tree[idx - 1]);
    }
    else {
        if (idx != tree.length() - 1) {
            idx++;
            HuffmanNode* leftChild = decodeTree(tree, idx);
            idx++;
            HuffmanNode* rightChild = decodeTree(tree, idx);
            return new HuffmanNode(leftChild, rightChild);
        }
        else
            return new HuffmanNode(tree[idx]);
    }
}

I'm getting an access violation writing a location when the function unwinds (On "return new HuffmanNode(tree[idx - 1]);" ), and I'm hoping that the final return would be the root of the tree, but upon further inspection this doesn't seem to be the case. Can anyone give me some pointers? (No pun intended)

Community
  • 1
  • 1
Taylor Bishop
  • 479
  • 2
  • 5
  • 18
  • are you expecting idx to be shared between the recursive calls? if so you'll need to make it a reference in the parameter list. Also you could probably do with adding in some bounds safety checks (with appropriate errors); as it stands this would be tripped up by any string ending in "1", for example. – Dave Apr 02 '15 at 19:06

1 Answers1

1

The problem with your code is that idx is not modified inside recursive runs. Pass it into the function as int &: HuffmanNode* HuffmanTree::decodeTree(string tree, int &idx)

There is also one more bug in your code, which makes it segfault: instead of

if (tree[idx] == '1') {
    idx += 2;
    return new HuffmanNode(tree[idx - 1]);
}

you should have

if (tree[idx] == '1') {
    ++idx;
    return new HuffmanNode(tree[idx]);
}

Another 1 is being added to index within the second block:

idx++;
HuffmanNode* leftChild = decodeTree(tree, idx);
idx++;
HuffmanNode* rightChild = decodeTree(tree, idx);

Also, consider doing a thing, similar to the example you have linked to: pass reference to a string iterator, (or an istringstream, or some other stream) and don't pass an index: HuffmanNode* HuffmanTree::decodeTree(std::string::const_iterator &tree).

Also, you don't have to do checks like if (idx != tree.length() - 1) if the tree is well-formed. You may still do that at the start of the function to check for errors in your input, along with checks that current symbol is either '0' or '1'.

Kolmar
  • 14,086
  • 1
  • 22
  • 25
  • What do you mean if it is well-formed? It has to have some sort of base case doesn't it to stop calling the function and potentially reaching beyond the end of the string? I tried using an iterator but I had the same issue - I had no way of knowing when the end of the input stream was and whether or not I was finished reading in the tree itself. – Taylor Bishop Apr 05 '15 at 20:44
  • @TaylorBishop No, if there are no bugs in the implementation, and the encoded string is correct, this algorithm can't read beyond the end of the string. Look at the post you linked. It's posible to have actual data in the same string/array as the tree. So you can start decoding data from the index where you finish decoding the tree. All the internal nodes in the encoded tree are marked by 0, and the leaves by 1. After you read a node marked by 1, you don't make any more recursive calls. That's the base case. – Kolmar Apr 05 '15 at 22:04