# Information Theory and Coding

**Topics:** Data compression, Huffman coding, Information theory

**Pages:** 7 (2015 words)

**Published:** January 17, 2014

In communication systems, information theory, pioneered by C. E. Shannon, deals with the mathematical formulation of information transfer from one place to another. It is concerned with source coding and channel coding. Source coding attempts to minimize the number of bits required to represent the source output at a given level of fidelity. Channel coding, on the other hand, is used so that information can be transmitted through the channel with a specified reliability. Information theory answers two fundamental questions in communication theory: What is the limit of data compression? (Answer: the entropy of the data, $H(X)$, is the compression limit.) What is the ultimate transmission rate of communication? (Answer: the channel capacity, $C$, is the rate limit.) Information theory also suggests means by which systems can approach these ultimate limits of communication.

## Amount of Information

In general, the function of a communication system is to convey information from the transmitter to the receiver; the job of the receiver is therefore to identify which one of a number of allowable messages was transmitted. The measure of information is related to the uncertainty of events: commonly occurring messages convey little information, whereas rare messages carry more. This idea is captured by a logarithmic measure of information, first proposed by R. V. L. Hartley. Consider a discrete source whose outputs $x_i$, $i = 1, 2, \ldots, M$, form a sequence of symbols chosen from a finite set $\{x_i\}_{i=1}^{M}$, the alphabet of the source. The message symbols are emitted from the source with probability distribution $p_x(x_i)$, so the discrete message source can be modeled mathematically as a discrete random process: a sequence of random variables taking values from the set with probability distribution $p_x(x_i)$. Now let the source select and transmit a message $x_i$, and assume further that the receiver has correctly identified the message. Then the system has conveyed an amount of information $I_i$ given by

$$I_i = I(x_i) = \log_2 \frac{1}{p_x(x_i)} = -\log_2 p_x(x_i) \qquad (5.1)$$

The amount of information, also called self-information, is a dimensionless number, but by convention it is measured in bits. As stated above, a message $x_i$ with a small probability of occurrence carries larger self-information.
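As a quick illustration (not from the original text), Eq. (5.1) can be evaluated in a few lines of Python; the example probabilities are arbitrary:

```python
import math

def self_information(p: float) -> float:
    """Self-information I(x) = -log2 p(x), in bits (Eq. 5.1)."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(p)

# A rare message (p = 1/8) carries more information than a common one (p = 1/2).
print(self_information(0.5))    # 1.0 bit
print(self_information(0.125))  # 3.0 bits
```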

## Average Information (Entropy)

A source is said to be a discrete memoryless source if its symbols form a sequence of independent and identically distributed random variables. The average information, denoted $H(X)$, of a discrete memoryless source is the expected value of the self-information $I(X)$:

$$H(X) = E\{I(X)\} = -\sum_{i=1}^{M} p_x(x_i) \log_2 p_x(x_i) \qquad (5.2)$$

The average information is formally defined as the entropy of $X$.
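Eq. (5.2) translates directly into code. A minimal sketch with an arbitrary four-symbol distribution (terms with zero probability contribute nothing to the sum, by the usual convention $0 \log 0 = 0$):

```python
import math

def entropy(pmf) -> float:
    """Entropy H(X) = -sum p_i log2 p_i, in bits per symbol (Eq. 5.2)."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# Four symbols with probabilities 1/2, 1/4, 1/8, 1/8.
print(entropy([0.5, 0.25, 0.125, 0.125]))  # 1.75 bits/symbol
```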

**Example 5.1**

Consider a binary source that emits symbols 1 and 0 with probabilities $p_x(1) = p$ and $p_x(0) = 1 - p$, respectively. The entropy of this binary source is given by

$$H(X) = -p \log_2 p - (1 - p) \log_2 (1 - p)$$

When $H(X)$ is plotted for different values of $p$, it is maximum at $p = 0.5$, that is, when the symbols are equally likely.

Figure 5.1: Entropy of a binary memoryless source

This example illustrates a general property: for a source with $M$ symbols,

$$H(X) \le \log_2 M$$

The equality is achieved when the distribution is uniform (equally likely symbols), that is, $p_x(x_i) = 1/M$ for all $i$.
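Both claims can be checked numerically. The sketch below (illustrative values only) evaluates the binary entropy at a few points and verifies the $\log_2 M$ bound for a uniform 8-symbol source:

```python
import math

def entropy(pmf) -> float:
    """Entropy in bits (Eq. 5.2); zero-probability terms are skipped."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def binary_entropy(p: float) -> float:
    """Entropy of a binary source with p_x(1) = p, as in Example 5.1."""
    return entropy([p, 1 - p])

# H(X) increases toward its peak at p = 0.5 ...
assert binary_entropy(0.1) < binary_entropy(0.3) < binary_entropy(0.5)
print(binary_entropy(0.5))  # 1.0

# ... and for an M-symbol source H(X) <= log2 M, with equality when uniform.
M = 8
print(entropy([1 / M] * M), math.log2(M))  # 3.0 3.0
```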

## Information Rate

If the source generates messages at a rate of $r$ messages per second, then the information rate is defined as

$$R = rH(X) \qquad (5.3)$$

the average number of bits of information per second.
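A short worked example of Eq. (5.3), using hypothetical figures (a binary source with $p = 0.2$ emitting 1000 symbols per second, neither value taken from the text):

```python
import math

p = 0.2                                             # hypothetical symbol probability
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)  # entropy per symbol (bits)
r = 1000                                            # hypothetical rate, symbols/second
R = r * H                                           # information rate, Eq. (5.3)
print(round(R))  # 722 bits/s (approximately)
```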

## Source Coding

Source coding is the process of representing the source output efficiently, with the design aim of reducing redundancy. It can also reduce fluctuations in the information rate from the source...
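Huffman coding, listed in the topics above, is the classic source-coding technique: it assigns shorter codewords to more probable symbols. A minimal sketch (not the text's own construction) using Python's `heapq`, with the dyadic distribution from the entropy example so the average codeword length exactly equals the entropy of 1.75 bits:

```python
import heapq

def huffman_code(pmf: dict) -> dict:
    """Build a binary Huffman code for {symbol: probability}.

    Repeatedly merges the two least probable subtrees, prefixing their
    codewords with 0 and 1. Returns {symbol: codeword}. Illustrative sketch.
    """
    # Heap entries: (probability, tie-breaker, {symbol: partial codeword}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(pmf.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

pmf = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(pmf)
avg_len = sum(pmf[s] * len(w) for s, w in code.items())
print(avg_len)  # 1.75 bits/symbol, matching the entropy of this source
```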
