Neural Networks and Image Compression

Because neural networks can accept a vast array of inputs at once and process them quickly, they are useful in image compression.
This is the same image compression scheme, but implemented over a network. The transmitter encodes the image and transmits the output of the hidden layer, and the receiver receives the 16 hidden outputs and decodes them to regenerate the 64 outputs. Each input node is fed one 8-bit pixel value, and the hope is that the same pixels will be output after the compression has occurred. That is, we want the network to perform the identity function.

Actually, even though the bottleneck takes us from 64 nodes down to 16 nodes, no real compression has occurred yet. The 64 original inputs are 8-bit pixel values, but the outputs of the hidden layer are real values between -1 and 1, and representing a real value exactly can require an arbitrarily large number of bits. How do we get around this? The answer is quantization.

Quantization for Image Compression

Each hidden output is quantized to one of 2^3 = 8 discrete levels, so it can be transmitted in only 3 bits. Hence, the image is compressed from 64 pixels * 8 bits each = 512 bits down to 16 hidden values * 3 bits each = 48 bits: the compressed image is about 1/10th the size of the original! However, this encoding scheme is not lossless; the original image cannot be retrieved exactly, because information is lost in the process of quantizing. Nonetheless, it can produce quite good results, as shown (though, of course, this image has been turned into a JPEG as well, so the actual results of the original compression cannot be seen; however, the contrast between the two images (or the lack of contrast, as is the goal of compression!) should be sufficient for this brief discussion).
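As a concrete illustration, here is a minimal sketch of the encode, quantize, and decode steps in NumPy. The weights, function names, and the use of tanh activations are assumptions made for illustration, not the applet's actual implementation; a real codec would use weights learned as described in the next section.

```python
import numpy as np

HIDDEN = 16   # bottleneck size: 16 hidden nodes
LEVELS = 8    # 3 bits per hidden value -> 2**3 = 8 quantization levels

# Hypothetical weights for a 64-16-64 network (random here; in practice
# these would be learned, as described below).
rng = np.random.default_rng(0)
W_enc = rng.normal(scale=0.1, size=(64, HIDDEN))
W_dec = rng.normal(scale=0.1, size=(HIDDEN, 64))

def encode(patch):
    """Map 64 pixel values (scaled to [-1, 1]) to 16 hidden values in (-1, 1)."""
    return np.tanh(patch @ W_enc)

def quantize(h):
    """Round each hidden value to one of 8 levels: an integer code 0..7 (3 bits)."""
    return np.round((h + 1.0) / 2.0 * (LEVELS - 1)).astype(int)

def dequantize(codes):
    """Map the 3-bit codes back to approximate hidden values in [-1, 1]."""
    return codes / (LEVELS - 1) * 2.0 - 1.0

def decode(h):
    """Reconstruct 64 pixel values from the (dequantized) hidden values."""
    return np.tanh(h @ W_dec)

patch = rng.uniform(-1, 1, size=64)         # one 8x8 chunk: 64 pixels * 8 bits = 512 bits
codes = quantize(encode(patch))             # what is transmitted: 16 values * 3 bits = 48 bits
reconstruction = decode(dequantize(codes))  # the receiver's estimate of the chunk
print(f"original: {64 * 8} bits, compressed: {codes.size * 3} bits")
```

The quantize/dequantize round trip is where the loss occurs: each hidden value is replaced by the nearest of 8 representable values, so the receiver can never recover the exact activations the transmitter computed.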
How does a network learn to do this?
The goal of these data compression networks is to re-create the input itself, so the input is used as its own output for training purposes. The input (for the applet) is produced from the 256x256 training image by extracting small 8x8 chunks at uniformly random locations in the image. This data is presented over and over, and the weights are adjusted, until the network reproduces the image relatively faithfully. You can see this randomization of the 8x8 chunks in the top row of the applet. The randomization helps the network generalize so that it will work on other images as well.

Once training is complete, image reconstruction is demonstrated in the recall phase. In this case, we still present the neural net with 8x8 chunks of the image, but now, instead of randomly selecting the location of each chunk, we select the chunks in sequence, from left to right and from top to bottom. For each such 8x8 chunk, the output of the network can be computed and displayed on the screen to visually observe the performance of neural-net image compression. We can test how well the network has generalized by compressing other pictures, such as the one on the bottom, and we can continue to train the network if the output is not of high enough quality. A sketch of this training and recall procedure appears below.
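The following is a minimal sketch, under the same assumptions as the previous snippet, of how such a network could be trained and then used for recall. The step count, learning rate, and plain gradient descent on squared error are illustrative choices rather than the applet's actual settings, and quantization is omitted from the recall pass for simplicity.

```python
import numpy as np

def train(image, W_enc, W_dec, steps=50_000, lr=0.01):
    """Train the network to reproduce randomly chosen 8x8 chunks of `image`."""
    rng = np.random.default_rng(1)
    for _ in range(steps):
        # Extract an 8x8 chunk at a uniformly random location,
        # as seen in the top row of the applet.
        r, c = rng.integers(0, 256 - 8 + 1, size=2)
        x = image[r:r + 8, c:c + 8].reshape(64)

        h = np.tanh(x @ W_enc)  # forward pass: 64 -> 16
        y = np.tanh(h @ W_dec)  # forward pass: 16 -> 64

        # Backpropagate the squared reconstruction error;
        # the input x is its own training target.
        d_out = (y - x) * (1 - y ** 2)            # tanh derivative at the output layer
        d_hid = (d_out @ W_dec.T) * (1 - h ** 2)  # tanh derivative at the bottleneck
        W_dec -= lr * np.outer(h, d_out)
        W_enc -= lr * np.outer(x, d_hid)

def recall(image, W_enc, W_dec):
    """Reconstruct the image chunk by chunk, left to right and top to bottom."""
    out = np.zeros_like(image)
    for r in range(0, 256, 8):
        for c in range(0, 256, 8):
            x = image[r:r + 8, c:c + 8].reshape(64)
            y = np.tanh(np.tanh(x @ W_enc) @ W_dec)
            out[r:r + 8, c:c + 8] = y.reshape(8, 8)
    return out
```

Note how `train` draws chunks at random while `recall` scans them in order, mirroring the two phases described above: random presentation encourages generalization, and the sequential scan reassembles a full reconstructed image for visual comparison.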