hello thereaaaa Chronological outline
1970s - present
Useful links
Back to table of contents


1948: Claude Shannon Creates the Modern Information Theory

While working for Bell Laboratories, mathematician Claude Shannon, in association with Warren Weaver, wrote his paper titled, A Mathematical Theory of Communication. This paper showed that information is not just an abstract concept, but an actual entity that can be measured and mathematically manipulated. Shannon formed the basic model of communication that has now become the information theorist's coat of arms (Fig. 1).

Figure 1 - Claude Shannon's information model

By making information something physical, he was able to create rules and theorems that would shape the way information would be transmitted. These rules still apply as much today as they did a half century ago. Shannon's model showed that any given message is taken from a finite set of messages of given probabilities. In order to send that message, it can be encoded by a transmitter into binary digits, or bits, as he called them. Using bits, any message can be broken down into a series of zero's and one's and sent across a channel. In analyzing the way messages are formed and sent, Shannon made it possible to encode messages using an efficient, minimal-redundancy code.

Other ideas given in Shannon's paper include how fast information is sent through a given channel and what the maximum capacity of a channel is. One of his greatest theorems in his paper included a proof showing how data can be sent over a noisy channel with almost zero error. Given this new mathematical groundwork, Shannon paved the way for other scholars to pick up his work in this new field of information theory.

1949: Claude Shannon and Communication in the Presence of Noise

After Claude Shannon wrote A Mathematical Theory of Communication in 1948, he followed up with another paper titled Communication in the Presence of Noise. This paper provided a geometrical representation of how information is transmitted over function spaces. In this paper, Shannon was able to deduce new theories about the bandwidth of a given channel and the threshold effect. With new mathematical models, Shannon was able to come up with new ideas of how information can be sent in non-ideal conditions. His new theories took into account new factors such as the power of the transmission and bandwidth of the channel.


1950: Richard Hamming and Error Correction

Richard Hamming worked with finding ways for correcting errors in binary transmissions. His idea was that messages could be sent in 7-bit form with four information bits combined with three check bits. By checking the oddness or evenness of different combinations of information bits, the receiver can be certain whether a message has been sent error-free. If the received bits do not agree with the original calculations, it can be calculated where the error will most likely exist. This is a fairly basic error-correction technique but is not the most efficient. For every packet of date transmitted, only 4/7 (about 57%) of it is the actual message and the rest are check bits. Furthermore, Hamming's procedure only guards against one error in a 7-bit block and cannot guarantee that a message with more than one error will be corrected (or even discovered).

1951: Claude Shannon and Entropy of Printed English

In his original paper, A Mathematical Theory of Communication, Shannon described entropy as the average number of binary digits, or bits, per letter needed to translate a given message. In 1951, he wrote a new paper titled Prediction and Entropy of Printed English. He wanted to make further statistical analysis as to how to encode the language as efficiently as possible. He figured that one could predict a certain letter given a string of preceding letters (based on letter probability of the English language).

1952: David A. Huffman and Minimum-Redundancy Codes


Shannon defined "minimal-redundancy" as the lowest possible message length given D coding digits. While Shannon wrote proofs on this topic, David A. Huffman gave an actual coding technique to achieve minimal redundancy in his paper, "A Method of the Construction of Minimum-Redundancy Codes" written in 1952. Given certain restrictions of how a set of N messages can be arranged, it is possible to construct a code that yields the lowest possible message length. Huffman placed these restrictions to message codes in order to achieve the lowest possible message ensemble code:

  1. No two messages will consist of identical arrangements of coding digits (if there are identical arrangements, then one message can be eliminated from the set without any lose of information).
  2. The message codes will be constructed in such a way that no additional indication is necessary to specify where a message code begins and ends once the starting point of a sequence is know. In other words, the code cannot appear, digit for digit, as the first part of any message code of greater length.
  3. The messages must be arranged in a way that the length, L(I), of a given message code is never less than the length of a more probable code.
  4. If P(x) is the probability of a message being chosen then:

    P(1) > P(2) > . . . > P(N – 1) >P(N)


    L(1) < L(2) < L(N – 1) = L(N)

    L(N – 1) equals L(N) because the first L(N – 1) digits of the Nth message cannot be code used in any other message. If more digits were added to the Nth message, then the L(N –1) message would be encapsulated as the first part of the Nth message. This would be a violation of restriction (b).

  5. At least two and not more than D of the messages with code length L(N) have codes which
  6. are alike except for their final digits.

  7. Each possible sequence of L(N) – 1 digits must be used either as a message code or must

have one of its prefixes used as a message code. This ensures that the Nth message has as

few digits as possible hence reducing the total number of digits in the entire message.

Huffman provided ways of manipulating messages so that these restrictions could be met. Once this is in place, he gave the following instructions:

  1. Since the last two messages are exactly the same except for their last digit, the code (which is not yet determined) will be the common prefixes (the first k digits of a code) of order L(N)-1 of the two messages.
  2. A new set of messages can be constructed with one less member because two of the messages are now combined.
  3. Manipulate the messages until the restrictions, (a) through (e), are again met.
  4. Repeat this process until all the messages are encoded.

This new encoded scheme is like signs on a river. By following the restrictions, the encoded scheme provides plans for a "path" to recreate the original messages.

Figure 2: This table shows messages being combined into one composite message.

Because these messages have certain constraints at each step of the encoding, Huffman’s technique ensures that the message is encoded with the lowest average length. This technique (known as Huffman coding) not only has its effects in information theory, but also in fields such as data compression. Every time someone uses an "mp3" file, they are reconstructing messages that have been compressed into a small ensemble code.

1952: E.N. Gilbert and Error Correcting Alphabets

Shannon's theorems prove that there exist coding schemes where information can be transmitted with arbitrarily small amounts of errors. In 1952, E.N. Gilbert constructed a method for finding how to send messages with small amounts of error in his paper, A Comparison of Signaling Alphabets. Gilbert's idea was that a binary alphabet could be constructed where an error in any letter could not be interpreted as any other letter. So, even if a signal is received with errors, the receiver will still know what message was sent from the transmitter. Gilbert's code offers an advantage over Shannon's code in that Shannon's is reliable on average, but there are certain probabilities of coding errors that are above average. Gilbert's, while maybe not as fast and efficient as Shannon's, provides greater error correcting for any given message.

1954: Amiel Feinstein and Noisy Channels

In his paper, A New Basic Theorem of Information Theory, Amiel Feinstein provides new proofs and theorems for Shannon's original ideas. In particular, Feinstein concentrated his work on noisy channels and continuous wave channels. Like other mathematicians, Feinstein investigated the question of channel capacity and minimizing error. This highly mathematical paper rethinks Shannon's theories and proves them using different techniques never used by information theory's inventor. Feinstein discovered new bounds and limits for a given noisy channel and a new mathematical representation that still agrees with Shannon's original theorem but has a more rigorous base. His large contribution to the field of information theory is that he found that the probability of error decays exponentially with the block length of a coded message.

1955 - 1957: Coding for Noisy Channels

After Feinstein's paper on noisy channels and minimizing errors, several new papers were written trying to see if it is possible to code such a channel where errors are almost non-existent. In 1955, Peter Elias produced his paper, Coding for Noisy Channels in which he tries to outline practical, realistic ways of sending codes with minimal error. His idea required a channel with memory. This means that the receiver can check his received signals with the sent signals (hence the signals are stored in memory). If the received signal has errors, the receiver can check the transmitter for the closest possible match. However, such an idea cannot be put into practice because of the large amounts of memory and time needed to check the results against the original information. Furthermore, such a technique would require a special feedback line that would be as error-prone as any other channel. Elias continued work with Hamming's check-symbol code and produced a model for error-free transmitting. However, Elias admits that such a method would require large amounts of time.

A year later, Shannon wrote The Zero Error Capacity of a Noisy Channel. With Elias new idea of a feedback channel, Shannon was able to write proofs of how a zero-error transmission could exist, but offers no practical way of creating such a channel.

Figure 3 - Now do you see why noise and interference are such a large issues?


1960 : Reed-Solomon Code and Error-Correction

Even after twenty years of theories and methods, finding an effective way of error-correction was still elusive. However, in 1960, Irving Reed and Gustave Solomon came up with a new coding technique that could correct up to 8,232 consecutive errors. This was way above any other number that could be posted by Hamming or Gilbert. The paper titled, Polynomial Codes over Certain Finite Fields showed just how this feat was possible. While mathematically this technique is rather involved and complicated, it is really quite easy conceptually. The Reed-Solomon code can detect errors in code the same way the human eye can find bad points in an otherwise smooth parabolic curve. This code sees bad points in the code, but can recover because it knows the general shape of the transmission. This coding technique has had a profound affect on communications because it is implemented in everything from CD's and hard drives to deep space probe communication.

1960 - Present: David K. Berlo and Communication Theory

Shannon's communication model was not only used for information theory, which is inherently mathematical, but also in communication theory that is mostly sociological. An example can be seen in David K. Berlo's book, The Process of Communication where he uses the idea of a source, encoder, channel, decoder, and receiver to describe social interaction between humans. He showed how each component determines the meaning of a message. It is the idea of meaning that separates information theory from communication theory. However, while these two fields have different foci and goals, they both stem from early ideas of how information is transmitted and received.

1961 - 1965: Fano, Gallager and the Decay of Errors

In 1961, R.M. Fano was able to determine explicitly a representation of an exponential decay coefficient for block codes that describes the relationship between code rate, block length, and probability of error. This decay coefficient is nonzero only for rates below the channel capacity. In 1965, R.G. Gallager provided a much simpler proof of Fano's results and provided his best estimate for the lower bound on the decay coefficient for low-rate codes (a proof of this coefficient is yet to be given).


1970's - Present: The New Challenge: The Internet!

With the creation of the ARPAnet in 1969 and the growth of networked computers and servers, point-to-point communication gave way to multitermnal problems. Now called the Internet, this massive, worldwide network presents many questions and problems for information theorists. In 1973, David Slepian and Jack Wolf published their paper that marked the beginning of the field of multiterminal source coding theory. These authors were able to specify completely the admissible rate region for noiseless coding of a memoryless multiterminal source; in 1975, Thomas Cover extended their results to the case of a stationary ergodic multiterminal source by means of an elegant proof.

Fig. 4 - A Nokia cell phone

Networks are important to a wide variety of fields and not just the scientific community. Businesses of all sorts depend on their internal and external networks working properly and efficiently. The largest network, the Internet, must handle millions of transmissions every day.

Because of this large growth in networks, a large growth in research has also appeared. Many universities and companies devote a lot of time and money into making sure networks run properly and efficiently. Researchers create new technologies every day such as personal digital assistants and cell phones all which are wired into huge networks and must be running at peak performance. These new technologies bring many questions and problems with them; all of which are the targets of extensive research.

Fig. 5 - A 3Com Palm III

Figures 4 and 5 are examples of just a few technologies that are the subjects of many studies and experiments as to how they can send data most efficiently. As advanced as they may seem, today's researchers are still exploring the same problems that faced Shannon and others in the 1950's.

Back to table of contents Top of page

Useful links

IEEE Information Theory Society Home Page

Lucent Technologies - 50 Years of Information Theory

Back to table of contents Top of page