Neural Networks and Image Compression

Because neural networks can accept a vast array of inputs at once and process them quickly, they are useful in image compression.
This is the same image compression scheme, but implemented over a network. The transmitter encodes the image and transmits the output of the hidden layer, and the receiver receives the 16 hidden outputs and decodes them to regenerate the 64 outputs. Each input node is fed one 8-bit pixel value, and the hope is that the same pixels will be output after the compression has occurred. That is, we want the network to perform the identity function.

Actually, even though the bottleneck takes us from 64 nodes down to 16 nodes, no real compression has occurred yet. The 64 original inputs are 8-bit pixel values, but the outputs of the hidden layer are real values between -1 and 1, and representing a real value exactly can require an arbitrarily large number of bits. How do we get around this? The answer is quantization.

Quantization for Image Compression

Each hidden output is quantized to one of 2^3 = 8 discrete levels, so it can be transmitted in only 3 bits. Hence, the image is compressed from 64 pixels * 8 bits each = 512 bits down to 16 hidden values * 3 bits each = 48 bits: the compressed image is about 1/10th the size of the original! However, this encoding scheme is not lossless; the original image cannot be retrieved exactly, because information is lost in the process of quantizing. Nonetheless, it can produce quite good results, as shown (though, of course, this image has been turned into a JPEG as well, so the actual results of the original compression cannot be seen; however, the contrast between the two images (or the lack of contrast, as is the goal of compression!) should be sufficient for this brief discussion).
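As a concrete illustration, here is a minimal sketch of the encode, quantize, and decode steps in NumPy. The weights, function names, and the use of tanh activations are assumptions made for illustration, not the applet's actual implementation; a real codec would use weights learned as described in the next section.

```python
import numpy as np

HIDDEN = 16   # bottleneck size: 16 hidden nodes
LEVELS = 8    # 3 bits per hidden value -> 2**3 = 8 quantization levels

# Hypothetical weights for a 64-16-64 network (random here; in practice
# these would be learned, as described below).
rng = np.random.default_rng(0)
W_enc = rng.normal(scale=0.1, size=(64, HIDDEN))
W_dec = rng.normal(scale=0.1, size=(HIDDEN, 64))

def encode(patch):
    """Map 64 pixel values (scaled to [-1, 1]) to 16 hidden values in (-1, 1)."""
    return np.tanh(patch @ W_enc)

def quantize(h):
    """Round each hidden value to one of 8 levels: an integer code 0..7 (3 bits)."""
    return np.round((h + 1.0) / 2.0 * (LEVELS - 1)).astype(int)

def dequantize(codes):
    """Map the 3-bit codes back to approximate hidden values in [-1, 1]."""
    return codes / (LEVELS - 1) * 2.0 - 1.0

def decode(h):
    """Reconstruct 64 pixel values from the (dequantized) hidden values."""
    return np.tanh(h @ W_dec)

patch = rng.uniform(-1, 1, size=64)         # one 8x8 chunk: 64 pixels * 8 bits = 512 bits
codes = quantize(encode(patch))             # what is transmitted: 16 values * 3 bits = 48 bits
reconstruction = decode(dequantize(codes))  # the receiver's estimate of the chunk
print(f"original: {64 * 8} bits, compressed: {codes.size * 3} bits")
```

The quantize/dequantize round trip is where the loss occurs: each hidden value is replaced by the nearest of 8 representable values, so the receiver can never recover the exact activations the transmitter computed.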
How does a network learn to do this?
The goal of these data compression networks is to re-create the input itself, so the input is used as its own output for training purposes. The input (for the applet) is produced from the 256x256 training image by extracting small 8x8 chunks at uniformly random locations in the image. This data is presented over and over, and the weights are adjusted, until the network reproduces the image relatively faithfully. You can see this randomization of the 8x8 chunks in the top row of the applet. The randomization helps the network generalize so that it will work on other images as well.

Once training is complete, image reconstruction is demonstrated in the recall phase. In this case, we still present the neural net with 8x8 chunks of the image, but now, instead of randomly selecting the location of each chunk, we select the chunks in sequence, from left to right and from top to bottom. For each such 8x8 chunk, the output of the network can be computed and displayed on the screen to visually observe the performance of neural-net image compression. We can test how well the network has generalized by compressing other pictures, such as the one on the bottom, and we can continue to train the network if the output is not of high enough quality. A sketch of this training and recall procedure appears below.
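The following is a minimal sketch, under the same assumptions as the previous snippet, of how such a network could be trained and then used for recall. The step count, learning rate, and plain gradient descent on squared error are illustrative choices rather than the applet's actual settings, and quantization is omitted from the recall pass for simplicity.

```python
import numpy as np

def train(image, W_enc, W_dec, steps=50_000, lr=0.01):
    """Train the network to reproduce randomly chosen 8x8 chunks of `image`."""
    rng = np.random.default_rng(1)
    for _ in range(steps):
        # Extract an 8x8 chunk at a uniformly random location,
        # as seen in the top row of the applet.
        r, c = rng.integers(0, 256 - 8 + 1, size=2)
        x = image[r:r + 8, c:c + 8].reshape(64)

        h = np.tanh(x @ W_enc)  # forward pass: 64 -> 16
        y = np.tanh(h @ W_dec)  # forward pass: 16 -> 64

        # Backpropagate the squared reconstruction error;
        # the input x is its own training target.
        d_out = (y - x) * (1 - y ** 2)            # tanh derivative at the output layer
        d_hid = (d_out @ W_dec.T) * (1 - h ** 2)  # tanh derivative at the bottleneck
        W_dec -= lr * np.outer(h, d_out)
        W_enc -= lr * np.outer(x, d_hid)

def recall(image, W_enc, W_dec):
    """Reconstruct the image chunk by chunk, left to right and top to bottom."""
    out = np.zeros_like(image)
    for r in range(0, 256, 8):
        for c in range(0, 256, 8):
            x = image[r:r + 8, c:c + 8].reshape(64)
            y = np.tanh(np.tanh(x @ W_enc) @ W_dec)
            out[r:r + 8, c:c + 8] = y.reshape(8, 8)
    return out
```

Note how `train` draws chunks at random while `recall` scans them in order, mirroring the two phases described above: random presentation encourages generalization, and the sequential scan reassembles a full reconstructed image for visual comparison.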