Information_theory Information_theory

Information theory - Definition and Overview

Related Words: Cobol, Esp, Fortran, Acquaintance, Advice, Answer, Arraignment, Assembler, Bail, Bit, Blame

Information theory is a branch of the mathematical theory of probability and mathematical statistics, that quantifies the concept of information. It is concerned with information entropy, communication systems, data transmission and rate distortion theory, cryptography, data compression, error correction, and related topics. It is not to be confused with library and information science or information technology.

Contents

History

Claude E. Shannon (19162001) has been called "the father of information theory". His theory for the first time considered communication as a rigorously stated mathematical problem in statistics and gave communications engineers a way to determine the capacity of a communication channel in terms of the common currency of bits. The transmission part of the theory is not concerned with the meaning (semantics) of the message conveyed, though the complementary wing of information theory concerns itself with content through lossy compression of messages subject to a fidelity criterion.

These two wings of information theory are joined together and mutually justified by the information transmission theorems, or source-channel separation theorems that justify the use of bits as the universal currency for information in many contexts.

It is generally believed that the modern discipline of information theory began with the publication of Shannon's article "The Mathematical Theory of Communication" in the Bell System Technical Journal in July and October of 1948. This work drew on earlier publications by Harry Nyquist and Ralph Hartley. In the process of working out a theory of communications that could be applied by electrical engineers to design better telecommunications systems, Shannon defined a measure of entropy:

<math>H = - \sum_i p_i \log p_i \,<math>

(where pi is the probability of i) that, when applied to an information source, could determine the capacity of the channel required to transmit the source as encoded binary digits. If the logarithm in the formula is taken to base 2, then it gives a measure of entropy in bits. Shannon's measure of entropy came to be taken as a measure of the information contained in a message, as opposed to the portion of the message that is strictly determined (hence predictable) by inherent structures, like for instance redundancy in the structure of languages or the statistical properties of a language relating to the frequencies of occurrence of different letter or word pairs, triplets etc. See Markov chains.

Recently however, it has emerged that entropy was defined and used during the Second World War by Alan Turing at Bletchley Park. Turing named it 'weight of evidence' and measured it in units called bans and decibans. This is not to be confused with the weight of evidence defined by I.J. Good and described in the article statistical inference, that Turing also tackled and named "log-odds" or "lods"). Turing and Shannon collaborated during the war but it appears that they independently created the concept. (References are given in by Andrew Hodges.)

Relation with thermodynamic entropy

Entropy as defined by Shannon is closely related to entropy as defined by physicists. Boltzmann and Gibbs did considerable work on statistical thermodynamics. This work was the inspiration for adopting the term entropy in information theory. There are deep relationships between entropy in the thermodynamic and informational senses. For instance, Maxwell's demon needs information to reverse thermodynamic entropy and getting that information exactly balances out the thermodynamic gain that the demon would otherwise achieve.

Among other useful measures of information is mutual information, a measure of the dependence (alternatively measured by the correlation) between two random variables. Mutual information is defined for two events <math>X<math> and <math>Y<math> as

<math>I(X; Y) = H(X) + H(Y) - H(X, Y) = H(X) - H(X|Y) = H(Y) - H(Y|X)\,<math>

where <math>H(X, Y)<math> is the joint entropy, defined as

<math>H(X, Y) = - \sum_{x, y} p(x, y) \log p(x, y) \,<math>

and <math>H(X|Y)<math> is the conditional entropy of <math>X<math> conditioned on observing <math>Y<math>. As such, the mutual information can be intuitively considered the amount of uncertainty in <math>X<math> that is eliminated by observations of <math>Y<math> and vice versa. When the random variables in question are continuous rather than discrete, the sums can be replaced with integrals and densities used in place of probability mass functions.

Mutual information is closely related to the log-likelihood ratio test for multinomials and to Pearson's χ2 test. Also, mutual information can be expressed using Kullback-Leibler divergence:

<math>I(X; Y) = D(P(X,Y) \| P(X)P(Y)) \,<math>

Shannon information is appropriate for measuring uncertainty over an unordered space. An alternative measure of information was created by Fisher for measuring uncertainty over an ordered space. For example, Shannon information is used over the space of alphabetic letters, as letters do not have 'distances' between them. For information about the value of a continuous parameter such as a person's height, Fisher information is used, as estimated heights do have a well-defined distance.

A. N. Kolmogorov introduced an information measure that is based on the shortest algorithm that can recreate it; see Kolmogorov complexity.

Extensions in progress

In 1995, Tim Palmer signalled two unwritten assumptions about Shannon's definition of information that made it inapplicable as such to quantum mechanics:

  • The supposition that there is such a thing as an observable state (for instance the upper face a die or a coin) before the observation begins
  • The fact that knowing this state does not depend of the order in which observations are made (commutativity)

The article Conceptual inadequacy of the Shannon information in quantum measurement [1] (http://scitation.aip.org/getabs/servlet/GetabsServlet?prog=normal&id=PLRAAN000063000002022113000001&idtype=cvips&gifs=yes), published in 2001 by Anton Zellinger [2] (http://www.quantum.univie.ac.at/zeilinger/) and Caslav Brukner, synthetized and developed these remarks. The so-called Zeilinger's principle suggests that the quantization observed in QM could be bound to information quantification (one cannot observe less than one bit, and what is not observed is by definition "random").

For a tutorial on quantum information see [3] (http://members.aol.com/jmtsgibbs/infothry.htm).

Applications

Composer James Tenney, among others such as his teacher Lejaren Hiller, has used information theory in the composition of musical works such as Ergodos.

See also

References

  • Claude E. Shannon, Warren Weaver. The Mathematical Theory of Communication. Univ of Illinois Press, 1963. ISBN 0252725484
  • Thomas M. Cover, Joy A. Thomas. Elements of information theory New York: Wiley, 1991. ISBN 0471062596

External links


Example Usage of Information

EddieMill: Want some Information for the #sleepout (http://bit.ly/yhAPT) later? Fed: http://bit.ly/1Z8csm Follow @Sleepout2009 for real-time updates.
stevebeasant: Lib Dems release Information that shows Labour’s immigration legislation isn’t taken seriously by rogue employers http://tinyurl.com/yky2qnb
roggieXoxo: LAWD my essayy is not even started Information on this topic is hard to find.
Copyright 2009 WordIQ.com - Privacy Policy  :: Terms of Use  :: Contact Us  :: About Us
This article is licensed under the GNU Free Documentation License. It uses material from the this Wikipedia article.