The AstroStat Slog » Shannon
http://hea-www.harvard.edu/AstroStat/slog
Weaving together Astronomy+Statistics+Computer Science+Engineering+Instrumentation, far beyond the growing borders

[Book] Elements of Information Theory
http://hea-www.harvard.edu/AstroStat/slog/2009/book-elements-of-information-theory/
Wed, 11 Mar 2009, by hlee
by T. Cover and J. Thomas; website: http://www.elementsofinformationtheory.com/

I have mentioned this book at least once before, in my post on Shannon's most celebrated paper (see that posting), and I have since recommended it further in reply to offline inquiries. It has always been on the list of books I like to use for teaching, so I am not shy about recommending it to astronomers for its modern, objective perspective and its practicality. Before offering more praise, I must say that those admiring words do not imply that I understand every line and problem in the book. Like many fields, information theory has grown fast since Shannon's monumental debut paper (1948), at a pace comparable to that of astronomers' observing techniques. Without the contents of this book, most of which came after Shannon (1948), the internet, wireless communication, data compression, and so on could not have been conceived. Since the notion of "entropy," the core of information theory, is familiar to astronomers (physicists), the book may be received better among them than among statisticians, and it should read more easily for astronomers as well.

My reason for recommending this book is that, in my personal view, some knowledge of information theory (data compression and channel capacity in particular) would help in coping with limited bandwidth in this era of unprecedented, massive astronomical survey projects, whether from satellites or ground-based telescopes.
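As a rough illustration of the source-coding side of that claim (a minimal sketch of my own, not from the book; the Poisson "photon counts" and all numbers below are hypothetical), the empirical Shannon entropy of a quantized data stream gives a lower bound, in bits per symbol, on what any lossless code can achieve, which can be compared against a naive fixed-length encoding:

import numpy as np

def shannon_entropy(symbols):
    """Empirical Shannon entropy of a discrete symbol stream, in bits per symbol."""
    _, counts = np.unique(symbols, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

# Hypothetical quantized detector output: low counts dominate, so the
# stream is highly compressible relative to a fixed-length code.
rng = np.random.default_rng(0)
data = rng.poisson(lam=1.5, size=10_000)

H = shannon_entropy(data)                       # lossless lower bound
fixed = np.ceil(np.log2(len(np.unique(data))))  # naive fixed-length code

print(f"entropy (lossless lower bound): {H:.2f} bits/symbol")
print(f"fixed-length code:              {fixed:.0f} bits/symbol")

An entropy coder (Huffman or arithmetic coding) approaches the first number; the gap to the second is roughly the bandwidth a pipeline leaves on the table.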

The content can be viewed from the perspective of applied probability; the basics of probability theory, including distributions and uncertainties, therefore become familiar to astronomers more readily than by wading through dense probability textbooks.

Many of my [MADS] series are motivated by the content of this book, from which I learned many practical data-processing ideas and objectives (data compression, data transmission, network information theory, ergodic theory, hypothesis testing, statistical mechanics, quantum mechanics, inference, probability theory, lossless coding/decoding, convex optimization, etc.), although those [MADS] postings are not yet visible on the slog (I hope to get through most of them within several months; otherwise, someone else should continue [MADS] and keep introducing modern statistics to astronomers). Ideas commonly practiced in engineering could accelerate data processing in astronomy and make astronomical inference more efficient and consistent, something that has been neglected amid many other demands. Here I would rather defer discussing particular topics from the book and how astronomers have applied them (ADS holds quite a few hidden statistical jewels that have not been well explored). Through [MADS], I will discuss further how information theory can help in processing astronomical data, from collecting, pipelining, storing, extracting, and exploring to summarizing, modeling, estimating, inference, and prediction. Instead of going through the book's topics, I would like to quote some interesting statements from its introductory chapter, to offer a taste of its flavor and to tempt you into reading it.

… it [information theory] has fundamental contributions to make in statistical physics (thermodynamics), computer science (Kolmogorov complexity or algorithmic complexity), statistical inference (Occam’s Razor: The simplest explanation is best), and to probability and statistics (error exponents for optimal hypothesis testing and estimation).

… information theory intersects physics (statistical mechanics), mathematics (probability theory), electrical engineering (communication theory), and computer science (algorithmic complexity).

There is a pleasing complementary relationship between algorithmic complexity and computational complexity. One can think about computational complexity (time complexity) and Kolmogorov complexity (program length or descriptive complexity) as two axes corresponding to program running time and program length. Kolmogorov complexity focuses on minimizing along the second axis, and computational complexity focuses on minimizing along the first axis. Little work has been done on the simultaneous minimization of the two.

The concept of entropy in information theory is related to the concept of entropy in statistical mechanics.
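To make that quoted relationship concrete (standard textbook material, stated here for convenience rather than quoted from the book): for a discrete probability distribution \{p_i\},

H = -\sum_i p_i \log_2 p_i   (Shannon entropy, in bits)
S = -k_B \sum_i p_i \ln p_i   (Gibbs entropy in statistical mechanics)

so the two differ only by the base of the logarithm and Boltzmann's constant, S = (k_B \ln 2)\, H.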

In addition to the book's website, googling the title turns up tons of links, ranging from gambling and portfolio theory to computational complexity, with statistics, probability, statistical mechanics, communication theory, data compression, and more in between (the order does not imply the relevance or importance of the subjects). This breadth is discussed in the introductory chapter. If you have the book in hand, regardless of edition, you might first want to check Fig. 1.1, "Relationship of information theory to other fields," a diagram explaining the connections and similarities among these subjects.
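As an aside on the "error exponents for optimal hypothesis testing" quoted above, here is a minimal sketch of my own (not from the book; the binned "model spectra" are made-up numbers): the Kullback-Leibler divergence between two discrete models sets the exponential rate at which the harder error probability can be driven down as more data accumulate.

import numpy as np

def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p||q) in bits for discrete distributions."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    mask = p > 0
    return np.sum(p[mask] * np.log2(p[mask] / q[mask]))

# Hypothetical binned model spectra, each normalized to unit sum.
model_a = np.array([0.50, 0.25, 0.15, 0.10])
model_b = np.array([0.40, 0.30, 0.20, 0.10])

D = kl_divergence(model_a, model_b)
# Chernoff-Stein lemma: with the type-I error held fixed, the smallest
# achievable probability of accepting model_a when n iid samples actually
# follow model_b decays roughly like 2**(-n * D(a||b)).
print(f"D(a||b) = {D:.4f} bits")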

Data-analysis tools, methods, algorithms, and theories, statistics included (both exploratory data analysis and inference), should be directed toward retrieving meaningful information from observations. Sometimes I feel that this priority is lost, like a ship without a captain, when statistics or information science is treated as a black box with no interest in knowing what is inside.

I don't know how many astronomy departments offer classes in data analysis, data mining, information theory, machine learning, or statistics for graduate students. I saw none at my alma mater, although it has recently offered the famous summer school. The closest course I had was computational physics, which focused on solving differential equations (stochastic differential equations were not included) and optimization (I learned some game theory there, unexpectedly; overall, I am still fond of what I learned in that class). I have not seen astronomy graduate students in statistics classes, nor in EE/CS classes on signal processing, information theory, or data mining (some departments offer statistics classes for their own students, like the course on experimental design for students of agricultural science). What I feel in astronomy is that the educational effort is not yet enough for the new information era and the big survey projects. Still, I am very happy to see some apprenticeships coping with these new patterns in astronomical science. I only hope this grows beyond a few small guilds, and I wish they had more resources to make their work efficient as time goes on.

An anecdote on entropy
http://hea-www.harvard.edu/AstroStat/slog/2008/anecdote-entrophy/
Sat, 06 Sep 2008, by hlee

My greatest concern was what to call it. I thought of calling it “information”, but the word was overly used, so I decided to call it “uncertainty”. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, “You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage.”

When I first learned the laws of thermodynamics[1] and encountered entropy, I felt exactly what von Neumann said about it: "nobody knows what entropy really is." Having had only basic physics courses many years back, I do not know how the epistemology of this subject has evolved, but it was a very surprising moment when I confronted the book by Cover and Thomas and Shannon's paper (see my second comment on the slog post Model vs. Model for links related to these references). I have been wondering about the philosophy behind the story of sharing the same lexicon across different disciplines (physics, information theory, and statistics, in chronological order), and this quote somewhat alleviates my burden of curiosity.

Wikiquote attributes Shannon's words to "Energy and Information" by Tribus and McIrvine, Scientific American, Vol. 224, pp. 178-184 (1971); I did not know until now that Wikiquote existed.

  1. Wikipedia link
A lecture note of great utility
http://hea-www.harvard.edu/AstroStat/slog/2008/a-lecture-note-of-great-utility/
Wed, 27 Aug 2008, by hlee

I didn't realize this post had been sitting for a month, during which I almost neglected the slog. Just as there are great books on probability and information theory for statisticians and engineers, I believe there are great statistical physics books for physicists. On the other hand, relatively few works introduce one subject to the audience of the other. In this regard, I thought this lecture note could be useful.

[arxiv:physics.data-an:0808.0012]
Lectures on Probability, Entropy, and Statistical Physics by Ariel Caticha
Abstract: These lectures deal with the problem of inductive inference, that is, the problem of reasoning under conditions of incomplete information. Is there a general method for handling uncertainty? Or, at least, are there rules that could in principle be followed by an ideally rational mind when discussing scientific matters? What makes one statement more plausible than another? How much more plausible? And then, when new information is acquired how do we change our minds? Or, to put it differently, are there rules for learning? Are there rules for processing information that are objective and consistent? Are they unique? And, come to think of it, what, after all, is information? It is clear that data contains or conveys information, but what does this precisely mean? Can information be conveyed in other ways? Is information physical? Can we measure amounts of information? Do we need to? Our goal is to develop the main tools for inductive inference–probability and entropy–from a thoroughly Bayesian point of view and to illustrate their use in physics with examples borrowed from the foundations of classical statistical physics.
