
An Example

Applying this process to the aforementioned 35-character text file, we can construct the table below.
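As an illustrative sketch (not part of the original walkthrough), the counts behind such a table can be gathered in a single pass over the file; "example.txt" is a hypothetical name standing in for the 35-character input:

    from collections import Counter

    # Read the input file and count how often each symbol occurs.
    # "example.txt" is a placeholder name, not a file from this article.
    with open("example.txt") as f:
        text = f.read()

    freqs = Counter(text)                      # symbol -> number of occurrences
    for symbol, count in freqs.most_common():  # most frequent symbols first
        print(symbol, count)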

We start by choosing the two smallest nodes, which in this case are ‘D’ and ‘E’. We combine these two nodes into a new tree whose root is the sum of the weights chosen – in this case, 9. Then we replace the two nodes with the combined tree. The field of candidates, i.e. the nodes that are evaluated at each stage, is shown in gray.

Next, we repeat this step, combining ‘B’ and ‘C’. We take these two nodes out, and, as in the first step, combine them into a tree of weight 12. Note that on each iteration, the number of nodes remaining in the selection shrinks by one, as we remove two nodes and replace them with a single root node.

Once again, we pull out the smallest nodes and build a tree of weight 21. Notice that the non-leaf nodes (the internal nodes of the tree) do not represent single characters; only the leaves do. This is essential to maintaining the prefix property discussed earlier.

And finally, we combine the last two nodes remaining in our queue to get our final tree. The root of the final tree will always have a weight equal to the number of characters in the input file, which in this case is 35.
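The repeated pick-the-two-smallest-and-merge procedure described above maps naturally onto a priority queue. The sketch below is a minimal Python illustration, not the article's own code; the frequencies are hypothetical values chosen only to be consistent with the subtree weights quoted in the walkthrough (D + E = 9, B + C = 12, A = 35 - 21 = 14):

    import heapq

    # Hypothetical frequencies consistent with the walkthrough's weights;
    # the article's exact counts for B, C, D and E are not reproduced here.
    freqs = {'A': 14, 'B': 6, 'C': 6, 'D': 5, 'E': 4}

    # Heap entries are (weight, tie_breaker, tree). A leaf is just its symbol;
    # an internal node is a (left, right) pair of subtrees.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    counter = len(heap)

    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # smallest remaining weight
        w2, _, right = heapq.heappop(heap)   # second smallest
        print("combine", w1, "+", w2, "->", w1 + w2)
        heapq.heappush(heap, (w1 + w2, counter, (left, right)))
        counter += 1

    root_weight, _, tree = heap[0]
    print("root weight:", root_weight)       # 35, the length of the input

Run against these frequencies, the merges print new weights of 9, 12, 21 and finally 35, matching the stages described above.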

Using the convention cited earlier to read the codes from this Huffman tree, we start at the root, add a '0' every time we go left to a child, and add a '1' every time we go right. The table below shows the resultant bit codes for each character:

Symbol    Bit Code
------    --------
A         0
B         100
C         101
D         110
E         111
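As a sketch of that convention (again not the article's own code), the codes in the table can be read off by a recursive walk over the tree. The nested-pair layout below is one hand-written encoding of the final tree's shape, with the children ordered so that the resulting bits match the table:

    # The final tree written as nested (left, right) pairs, with single
    # characters as leaves. The child ordering is chosen to match the table
    # above; swapping children gives equivalent codes of the same lengths.
    tree = ('A', (('B', 'C'), ('D', 'E')))

    def assign_codes(node, prefix=""):
        # Append '0' when descending to a left child, '1' for a right child.
        if isinstance(node, str):              # leaf: a single symbol
            return {node: prefix or "0"}
        left, right = node
        codes = assign_codes(left, prefix + "0")
        codes.update(assign_codes(right, prefix + "1"))
        return codes

    print(assign_codes(tree))
    # {'A': '0', 'B': '100', 'C': '101', 'D': '110', 'E': '111'}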

Next step: Comparison between Huffman and Shannon-Fano

 
