I’ve gone through a phase of reading about information theory – none of the technical detail, but some of the pop science books that are out there. It’s been interesting. The one I’d recommend is James Gleick’s The Information: A History, a Theory, a Flood.
James Gleick’s The Information: A History, a Theory, a Flood
I enjoyed Gleick’s book a lot. It’s a fascinating overview of information theory, wandering its way through the history, the mathematics, and connections to a range of fields. In his tour, Gleick manages to give a reasonably user-friendly introduction to Gödel’s theorem, entropy, and Shannon’s information theory, as well as an explanation of how talking drums work. I’ve pulled out some quotes on the talking drums below, because I think they’re fascinating.
Later he explores why Maxwell’s demon is impossible: the cost of emptying the demon’s memory generates heat, rebalancing the books on what otherwise had seemed like free energy in the system.
This is a difficult field to explore in detail without getting into the mathematics; within that constraint, however, this is definitely a book I’d recommend on information theory.
In solving the enigma of the drums, Carrington found the key in a central fact about the relevant African languages. They are tonal languages, in which meaning is determined as much by rising or falling pitch contours as by distinctions between consonants or vowels …
As the spoken languages of Africa elevated tonality to a crucial role, the drum language went a difficult step further. It employed tone and only tone. It was a language of a single pair of phonemes, a language composed entirely of pitch contours …
All that mattered was for the drums to sound two distinct notes, at an interval of about a major third. So in mapping the spoken language to the drum language, information was lost. The drum talk was a speech with a deficit …
A double stroke on the high-tone lip of the drum … matched the tonal pattern of the Kele word for father, sango, but naturally it could just as well be songe, the moon; koko, fowl; fele, a species of fish; or any other word of two high tones … Thus, Carrington discovered, a drummer would invariably add “a little phrase” to each short word. Songe, the moon, is rendered as songe li tange la manga–“the moon looks down on the earth.” The extra drumbeats, far from being extraneous, provide context. Every ambiguous word begins in a cloud of possible alternative interpretations; then the unwanted possibilities evaporate.
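The mechanism here is redundancy doing the work of error correction: the tone-only channel collapses many words onto one pattern, and the appended phrase stretches each codeword until it’s unique again. A toy sketch in Python – the vocabulary is from the quote, but the tone patterns for the phrases are invented for illustration:

```python
# H = high tone, L = low tone.
# The tone-only channel collapses distinct words onto one pattern.
WORDS = {
    "sango (father)": "HH",
    "songe (moon)":   "HH",
    "koko (fowl)":    "HH",
}

# The drummer's "little phrases" (tone patterns invented here) restore uniqueness.
PHRASED = {
    "sango (father)": "HH" + "HHLLHL",
    "songe (moon)":   "HH" + "HLHLLH",  # "the moon looks down on the earth"
    "koko (fowl)":    "HH" + "LLHHLL",
}

def decode(pattern, codebook):
    """All words whose drummed tone pattern matches."""
    return [word for word, tones in codebook.items() if tones == pattern]

print(decode("HH", WORDS))          # ambiguous: all three words match
print(decode("HHHLHLLH", PHRASED))  # unique: ['songe (moon)']
```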
In the long run, history is the story of information becoming aware of itself …
The written word – the persistent word – was a prerequisite for conscious thought as we understand it.
Within PM [Principia Mathematica], and within any consistent logical system capable of elementary arithmetic, there must always be such accursed statements, true but unprovable. Thus Gödel showed that a consistent formal system must be incomplete; no complete and consistent system can exist. The paradoxes were back, nor were they mere quirks. Now they struck at the core of the enterprise. It was, as Gödel said afterward, an “amazing fact” – “that our logical intuitions (i.e., intuitions concerning such notions as: truth, concept, being, class, etc.) are self-contradictory.”
It was, as Douglas Hofstadter says, “a sudden thunderbolt from the bluest of skies,” its power arising not from the edifice it struck down but from the lesson it contained about numbers, about symbolism, about encoding …
Although the fluid may be at rest and the system in thermodynamic equilibrium, the irregular motion perseveres, as long as the temperature is above absolute zero. By the same token, he showed that random thermal agitation would also affect free electrons in any electrical conductor – making noise.
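The conductor noise mentioned at the end is what’s now called Johnson–Nyquist noise. The quote doesn’t give its form, but the standard result is a mean-square noise voltage across a resistor R at temperature T of

V_n^2 = 4 k_B T R \, \Delta f

over a measurement bandwidth Δf – vanishing, like the Brownian jiggling, only at absolute zero.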
… the second law is merely probabilistic. Statistically, everything tends toward maximum entropy. Yet probability is enough: enough for the second law to stand as a pillar of science. To the physicist, entropy is a measure of uncertainty about the state of a physical system: one state among all the possible states it can be in. These microstates may not be equally likely, so the physicist writes [the equation for entropy]. To the information theorist, entropy is a measure of uncertainty about a message: one message among all the possible messages that a communications source can produce. The possible messages may not be equally likely …
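The bracketed equation is presumably the Gibbs form. Writing it out next to Shannon’s (neither appears in the quote itself) makes the parallel explicit:

S = -k_B \sum_i p_i \ln p_i (the physicist: microstates i with probabilities p_i)

H = -\sum_i p_i \log_2 p_i (the information theorist: possible messages i, in bits)

Up to units – Boltzmann’s constant and natural logarithms versus bits and base-2 logarithms – they’re the same expression.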
… the organism sucks orderliness from its surroundings. Herbivores and carnivores dine on a smorgasbord of structure; they feed on organic compounds, matter in a well-ordered state, and return it “in a very much degraded form – not entirely degraded, however, for plants can make use of it.” Plants meanwhile draw not just energy but negative entropy from sunlight. In terms of energy, the accounting can be more or less rigorously performed. In terms of order, calculations are not so simple.
The gene is not an information-carrying macromolecule. The gene is the information.
… the laws of science represent data compression in action. A theoretical physicist acts like a very clever coding algorithm.
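The compression analogy can be made concrete: data with a regularity (a “law”) compresses well; patternless data doesn’t. A quick sketch using Python’s standard-library zlib:

```python
import os
import zlib

n = 10_000
structured = bytes(i % 7 for i in range(n))  # regular, "law-governed" data
random_data = os.urandom(n)                  # no regularity to exploit

# The compressor finds the short description of the structured data;
# for random data there is none, so it stays roughly the same size.
print(len(zlib.compress(structured)))   # small
print(len(zlib.compress(random_data)))  # ~n
```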
In Szilard’s thought experiment, the demon does not incur an entropy cost when it observes or chooses a molecule. The payback comes at the moment of clearing the record, when the demon erases one observation to make room for the next. Forgetting takes work.
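What Gleick is describing is now usually called Landauer’s principle: erasing one bit of information in surroundings at temperature T dissipates at least

E_min = k_B T \ln 2

of heat – roughly 3 × 10⁻²¹ J per bit at room temperature. The demon’s observations come free; it’s paying this cost at erasure that rescues the second law.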
When information is cheap, attention becomes expensive.
Nate Silver’s The Signal and the Noise
Nate Silver is something of a celebrity in particular circles – in part, perhaps, because he was one of the first to apply the rigour that was revolutionising baseball to another very data-rich field: election politics. Since then he’s been excoriated by a range of critics, but one thing I think he’s done reasonably well is both make predictions (some accurate, some less so) and then iterate, revisiting those predictions to identify how he can improve. I’ve enjoyed his reflections on particular predictions, and the 538 aggregate estimates – they’re fascinating exercises that have inspired imitations at other publications.
I was interested to read The Signal and the Noise to see what he had to say, and which topics he’d chosen to cover. Unfortunately, the book felt for the most part like a set of blog posts on different forecasting areas (earthquakes! the stock market! Bayesian probability!). There is a thread running throughout – Silver uses it to argue that many experts overestimate their abilities, make systematic mistakes, and could improve their forecasts – but for the most part it’s a high-level overview rather than something deeper.
If you already have an opinion about Bayesian vs. frequentist probability, this isn’t the book for you. But if you’re looking for anecdotes about particular applications of probability, or want a very gentle (non-mathematical) introduction to the field, this may be it.
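For the curious, the Bayesian updating Silver keeps returning to is essentially a one-liner. A minimal sketch in Python – the numbers are illustrative ones of mine, not an example from the book:

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior P(H | E) via Bayes' theorem."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1 - prior)
    return numerator / evidence

# Illustrative numbers (mine, not Silver's): a rare event, a decent test.
prior = 0.01           # P(H): 1% base rate
sensitivity = 0.90     # P(E | H)
false_positive = 0.05  # P(E | not H)

posterior = bayes_update(prior, sensitivity, false_positive)
print(f"P(H | E) = {posterior:.3f}")  # ~0.154, far lower than intuition suggests
```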
In a broader sense, the ratings agencies’ problem was in being unable or uninterested in appreciating the distinction between risk and uncertainty.
Tetlock’s conclusion was damning. The experts in his survey – regardless of their occupation, experience or subfield – had done barely any better than random chance, and they had done worse than even rudimentary statistical methods at predicting future political events.
Political news, and especially the important news that really affects the campaign, proceeds at an irregular pace. But news coverage is produced every day. Most of it is filler, packaged in the form of stories that are designed to obscure its unimportance.
… experts either aren’t very good at providing an honest description of the uncertainty in their forecasts, or they aren’t very interested in doing so. The more fundamental problem is that we have a demand for experts in our society but we don’t actually have that much of a demand for accurate forecasts.
Vlatko Vedral’s Decoding Reality
Vlatko Vedral is, according to internet sources, a professor of physics at the University of Oxford. Online, however, his book comes across as a less-than-reputable piece – the sort of thing you’d expect to have a subtitle about Kabbalah and the secret to health and happiness.
I found it slightly slow going. He doesn’t do brilliantly at explaining concepts in a way that can reach the non-mathematician, or at including enough material that you can work through the equations slowly if you want to understand them. So while I think the question of how information theory applies to different fields (biology, physics, etc.) is a fascinating one, this book unfortunately doesn’t live up to its premise.