Statistical Compressors
The Algorithm

For a given list of symbols, develop a corresponding list of probabilities or
frequency counts so that each symbol's relative frequency of occurrence is known.
Sort the list of symbols by frequency, with the most frequently occurring symbols
at the top and the least common at the bottom. Then:
- Divide the list into two parts, with the total frequency count of the upper part being
as close to the total of the lower part as possible.
- Assign the binary digit 0 to the upper part and the digit 1 to the lower part. This
means that the codes for the symbols in the upper part will all start with 0, and the
codes in the lower part will all start with 1.
- Recursively apply the same procedure to each of the two parts, subdividing groups and
appending bits to the codes until each group contains a single symbol, which becomes a
leaf of the code tree.
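
The steps above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `shannon_fano`, the dictionary-based input, and the sample frequency counts are all assumptions made for this example. Ties between equally good split points are resolved by taking the earliest one, which is one reasonable choice among several.

```python
def shannon_fano(freqs):
    """Return a {symbol: code} dict for a {symbol: count} mapping.

    Hypothetical helper for illustration; the name and interface are
    assumptions, not part of the original description.
    """
    # Sort symbols by frequency, most frequent first.
    items = sorted(freqs.items(), key=lambda kv: kv[1], reverse=True)

    # Assumption: a lone symbol still gets a one-bit code so it can be decoded.
    if len(items) == 1:
        return {items[0][0]: "0"}

    codes = {sym: "" for sym, _ in items}

    def split(group):
        # A group with a single symbol is a leaf: nothing left to subdivide.
        if len(group) < 2:
            return

        total = sum(count for _, count in group)

        # Find the split point whose upper total is closest to the lower total.
        running = 0
        best_index, best_diff = 1, None
        for i in range(1, len(group)):
            running += group[i - 1][1]
            diff = abs(total - 2 * running)  # |upper - lower|
            if best_diff is None or diff < best_diff:
                best_index, best_diff = i, diff

        upper, lower = group[:best_index], group[best_index:]

        # Upper part gets a 0 appended, lower part gets a 1.
        for sym, _ in upper:
            codes[sym] += "0"
        for sym, _ in lower:
            codes[sym] += "1"

        # Recurse into each part.
        split(upper)
        split(lower)

    split(items)
    return codes


if __name__ == "__main__":
    # Illustrative counts only; they are not taken from the text above.
    sample = {"A": 15, "B": 7, "C": 6, "D": 6, "E": 5}
    for symbol, code in shannon_fano(sample).items():
        print(symbol, code)
```

With the sample counts shown, the first split separates {A, B} from {C, D, E}, and the resulting codes are A = 00, B = 01, C = 10, D = 110, E = 111. No code is a prefix of another, so the encoded bit stream can be decoded unambiguously.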