14

The following block is nested by Huffman block markers

-HUFF---------------------------------------------------------------------0084-
  10    0    1    2    4    3    4    6    5    6    8    a    9    4    2    3
   0    1    2   11    0    3    4   21    5   12   31    6   41   51   61   13
  22   71   81   91   a1   14   32   b1   d1   f0   15   23   35   42   b2   c1
   7   16   24   33   52   72   73   e1   25   34   43   53   62   74   82   94
  a2   f1   26   44   54   63   64   92   93   c2   d2   55   56   84   b3   45
  83   46   a3   e2
-------------------------------------------------------------------------------

0084 is the length of the table as an integer and is not included in the block here

according to the JPEG standard, the first address aparently makes it an AC table at destination 0 (0x10)

and aparently from there onwards it's a huffman table.

So, how is it decoded?

caf
  • 233,326
  • 40
  • 323
  • 462
Supernovah
  • 1,946
  • 12
  • 34
  • 50
  • "0084 is the length of the table as an integer" - you mean as an integer in decimal (base 10). All these numbers are integers. The length was originally 0x0054 - an interger in hex - and you converted it to decimal for us. – Robino Aug 15 '21 at 16:47
  • What should have been obvious by my note was that this was printed using a format specifier in the `stdio.h`'s `printf` function containing `%04i` which is the specifier for signed integers with leading zeroes and at least four numerals. – Supernovah Aug 16 '21 at 03:20
  • I agree that it must be obvious if one is accustomed to equating the concept of integers with that of whole-number decimals. In the context of a sea of bytes, representing (whole) numbers written in hex, and in the very position where an actual DHT would have the same length, itself represented as two bytes in the same format as the rest of the data, it was less obvious. – Robino Aug 16 '21 at 11:44
  • (In a DHT, the length of the table is the two bytes immediately preceeding the table data.) – Robino Aug 16 '21 at 11:46

1 Answers1

22

The next 16 bytes after the 0x10 tell you how many codes of each length. In your example, there are 0 codes of length 1 bit, 1 code of length 2 bits, 2 codes of length 3 bits, 4 codes of length 4 bits, 3 codes of length 5 bits, and so on.

These are then followed by the values that are encoded by those codes, in order. Again from your example:

Code length | Number | Symbol(s)
------------+--------+----------
1 bit       | 0      |
2 bits      | 1      | 0x01
3 bits      | 2      | 0x02 0x11
4 bits      | 4      | 0x00 0x03 0x04 0x21
5 bits      | 3      | 0x05 0x12 0x31
... etc

You then build a binary tree from the top down, assigning the symbols in order. In this example, you get:

Symbol | Code 
-------+------
0x01   | 00
0x02   | 010
0x11   | 011
0x00   | 1000
0x03   | 1001
0x04   | 1010
0x21   | 1011
...etc
caf
  • 233,326
  • 40
  • 323
  • 462
  • thanks a lot man this has helped in my understanding a lot :) – Supernovah Oct 16 '09 at 11:38
  • Edit: Reread the answer, it is **first length** of codes, **then codewords**! (Is it always 16 codewords?), then I know the answer to my question! Thx -- Original Comment: How do you get from the Number of Symbols to the actual Symbols. How do I know that the one 2bits symbol is 0x01, and not 0x00, 0x10 or 0x11? ... – muuh Apr 12 '18 at 09:21