Thursday, July 31, 2014

Quining Natural Kinds

One of the greatest philosophers of the second half of the twentieth century was Willard Van Orman Quine, the final true logical positivist. Quine was a student of Carnap, and his criticism helped move Carnap into a new phase in his career. Quine was, first and foremost, a technical philosopher who made strides in logic, set theory and related fields. He believed that philosophy had no mystique: it was one of the sciences. A good philosopher of biology, he might say, is just a good biologist with an interest in methodological questions. He was also a very capable philosopher of science, questioning simplistic assumptions of his predecessors and proposing more realistic views of scientific language and its problems.

I have a great deal of respect for his technical contributions, a great deal of sympathy for his more philosophical proposals, and agree that most of the problems he poses are genuine. But sometimes he's just plain wrong. For instance, Quine never really dealt with the problem of induction. He noted we make proposals that get around it, but never had anything positive to say about how we do or how we should. The standard story since Bacon is to start with a confused concept (such as heat), then apply logical analysis and observation (pepper heat isn't really the same thing as flame), and then find experiments that will help exclude more pseudo-content ("We must make a more diligent inquiry into this instance; for herbs and green and moist vegetables appear to possess a latent heat so small, however, as not to be perceived by the touch in single specimens, but when they are united and confined, so that their spirit cannot exhale into the air, and they rather warm each other, their heat is at once manifested, and even flame occasionally in suitable substances."). This story is older than Bacon; Aristotle and Plato would find much in it familiar. Plato would have avoided observation, and both slighted experiment. This process will supposedly, by an evolutionary process, bring clarity to your language, at least as far as possible. Plato would have said clarity to one's soul!

This story was further complicated by Hume, who realized that a state of certainty and completeness was not attainable for certain important concepts. For instance, to know whether a truly was a cause of b, we'd have to wait for all time to see if their constant conjunction was truly constant - and even then we'd have to worry about confounding! Hume realized that what we had - and all we could have - was an imperfect and probabilistic knowledge. Hume's realization leads to a great deal of interesting and practical philosophy, as mentioned before.

A monkey wrench was thrown in by a modern philosopher, Nelson Goodman. Goodman introduced the concepts of "grue" and "bleen". In our usual language, grue might mean "green before judgement day, blue thereafter" and bleen might mean "blue before judgement day, green thereafter". Of course, in a grue/bleen language, green would mean "grue before judgement day, bleen thereafter". This is no artificial philosophical problem. When YouTube, Netflix or Amazon offers you something you'd never want in a million years, you are running into a Goodman problem. They divide items into categories, then try to learn from your history to divide the categories you like from those you don't like. This problem is endemic in the sciences. Take economics. Imagine a monetary theorist who wants to learn about money. Money is a complicated concept. What monetary aggregate (or aggregates?) might Anna Schwartz want to look at? The problem could easily be a grue problem ("M1 is important before the legal situation changed; after, you want to look at M2", etc.).
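To see the problem in miniature, here is a toy sketch of my own (not Goodman's, and the switch date is entirely hypothetical): before the switch, "all emeralds are green" and "all emeralds are grue" fit the evidence equally well, yet they project to opposite predictions afterwards.

```python
# Toy illustration of Goodman's problem (my own construction). The switch time
# and predicates are hypothetical; the point is that two hypotheses can agree
# on all past data and disagree about the future.
JUDGEMENT_DAY = 100  # hypothetical switch time

def observed_color(t):
    """Every emerald we have examined so far has looked green."""
    return "green"

def is_grue(t, color):
    """Grue: green if examined before judgement day, blue if examined after."""
    return color == "green" if t < JUDGEMENT_DAY else color == "blue"

past = range(JUDGEMENT_DAY)
print(all(observed_color(t) == "green" for t in past))   # "all emeralds are green" -> True
print(all(is_grue(t, observed_color(t)) for t in past))  # "all emeralds are grue"  -> True

# Both hypotheses fit the past perfectly, but for an emerald examined after the
# switch the green-speaker predicts "green" and the grue-speaker predicts "blue".
```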

What did Quine give us? Advice? Strictures? Criticism?

No. Instead Quine chose to attack the reasonable ideas of centuries of philosophers and methodologists. He links several problems together, which is a nice thing for a philosopher to do. But does he give anything to the practicing scientist? No; instead he proposes a mystical quality called "natural kinds" - which is less helpful even than the original analysis of Goodman (Goodman said that grue was not "projectable", that you couldn't learn from it, which is overstated - as the comparison to monetary theory makes clear - but at least sits the problem where it is). Quine says that people make inductions on natural concepts that are usually projectable, which is only barely true, largely false, and extremely misleading. People don't necessarily "induct on natural concepts", as made clear in - for instance - Probably Approximately Correct by Leslie Valiant. They don't take those concepts as static and given; they'd be fine if they learned that some of the things they dealt with are grue-like - not that they could do anything about it if they were! They should worry if they find out that some of the things they take for granted are more sensitive and state-dependent than they seem. There is a cliche in linkbait journalism: "Why Things Are Worse Than They Seem". The question isn't "What's wrong with grue?"; it's "Given that grues are so common, why do we induct so well?". This is a difficulty for Bayesian theory, but I think that Leslie Valiant goes over why it isn't for other theories of representing (partial, probabilistic) knowledge and (Hume-friendly) induction.

Quine's difficulty is with his tools. He is a logician, used to a static, perfect language. When he did metaphysics, he noted that ordinary reality is likely isomorphic to some four-dimensional manifold. He didn't worry about such Hegelian mysteries as languages and truth functions that are themselves dynamic and evolving. This blinded him and led him to fail to innovate in, or even understand, learning theory and related areas. In addition, he did not appreciate the possibility of theory-light inductive techniques of the kind seen in learning theory.

Natural Kinds are not a natural kind, but neither is anything else. The real problem is that they are misleading and useless, and get in the way of how these problems are really solved.

Monday, July 28, 2014

What is Wrong With Wittgenstein: Or, What Is Wrong With Us?

Wittgenstein

Ludwig Wittgenstein was one of the 20th century's greatest philosophers, perhaps the greatest. His influence is so pervasive that those who consider themselves uninterested in philosophy frequently espouse his views. Born into an immensely wealthy and troubled Austrian family, the Wittgenstein children were raised by a father who expected them to be geniuses, and several of them were. Wittgenstein originally trained to become an engineer, but moved into philosophy out of an interest in Bertrand Russell (he was in England at the time). He made himself the student of Gottlob Frege, the most innovative logician in the world at the time and a founder of a new, logic-oriented philosophy. At heart, Wittgenstein was a romantic pessimist, but he could not even begin to defend that world view against the precision of Frege's vision. As Wittgenstein matured philosophically, he would first develop an austere but complete vision of Frege-Russellian logical atomism. He espoused this view in the Tractatus Logico-Philosophicus. This was easily the most innovative work in metaphysics for at least 100 years (50 forward and 50 back), making other apparently innovative works (such as those of David Lewis or Heidegger) seem like mere commentaries. Like the work of Hume, there are both positive (this is how large the world is) and negative (these are the edges of the world) readings. At the time the negative reading was popular, but I think these days most people prefer the positive.
Ludwig


Wittgenstein later "abandoned" this philosophy, but that isn't to say that he stopped thinking that its propositions validated its conclusions. Instead, he began to feel as though he, Russell and Frege had avoided the big questions. They took a logical language as being metaphysically given. "The world is the totality of facts, not of things.", he says. But facts are defined as true propositions, and the truth of a proposition is something given by a language. Based on his analysis of the ideal language, Wittgenstein comes to very severe limits on what can be expressed and known ("There is no possible way of making an inference from the existence of one situation to the existence of another, entirely different situation.", "What we cannot speak about we must pass over in silence.", etc.), without regard to non-ideal and imperfect situations. He left himself without any method of figuring out how this austere vision got into the dirtiness of the real world - at times, it seems, purposely.

Probability was no help. He deliberately based probability on the ideal language already given: "The truth-functions can be ordered in series. That is the foundation of the theory of probability." In fact, Wittgenstein was always a subjectivist about truth and probability. "Agreement with the truth-possibilities can be expressed by coordinating with them in the schema the mark 'T' (true).", therefore there is no need for "such things as 'logical objects' or 'logical constants'...". Taking language as a given and taking priors as a given are subject to very similar criticism. It doesn't seem that Wittgenstein was ever worried about the evolution of beliefs of a speaker of an ideal language, but this idea of the foundation of probability would later be explored by Carnap and Ramsey and would help give birth to Bayesian philosophy.

"Darwin's theory has no more to do with philosophy than any other hypothesis in natural science." This left him without resort to David Lewis/American Pragmatist arguments about the evolution of language. A Fregean ideal language would not evolve naturally - at least not one close enough to perfection to found metaphysics on, which is what Wittgenstein needed. He began another approach, one grounded in actual speech-acts and language games.

But before he did this, Wittgenstein began examining the foundations of mathematics. He had much to criticize in the standard methods, some of which prefigured his later philosophy. For instance, he pointed out that despite Frege and Russell's claims that they were founding ordinary math, if their systems were found to contradict ordinary math, it is their systems that would be tossed out. Therefore, they can't really be said to be "foundations". But just as interesting was Wittgenstein's "finitism". Naive finitism is a very rare position in the philosophy of mathematics, for the same reason the denial of gravity is rare in physics and the denial of evolution is rare in biology. It denies far too much of what mathematics is about! How could Wittgenstein - an engineer, a logician and a mathematician - come to such a crazy position? The answer must be that Wittgenstein was a non-naive finitist. Let's figure out what that means.
G Cantor

The explicit study of infinite sets was introduced by Georg Cantor. Cantor was not working on an austere philosophical problem, but on Fourier Series - the mathematics of waves. The mathematician and physicist Fourier had made complex wavefronts amenable to simple mathematical analysis by considering them as being made up of a bunch of interacting simple waves. This theory is remarkably beautiful; it is perhaps the crown jewel of 19th century mathematics. The theory is extremely broad, but also presents immediately measurable quantities to test (when applied to physics). But many questions about Fourier (and related) Series were quite difficult to answer with the primitive state of mathematics. One of the most basic questions, "Is there one and only one Fourier Series for a given complex wavefront?", leads inevitably to the consideration of infinite sets. With modern mathematical machinery (Hilbert Spaces), the proof can be contained in a few lines. For Cantor, it required a staggering number of innovations.
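As a concrete (and anachronistic) illustration of Fourier's idea, here is a minimal numerical sketch of my own: a square wave built out of simple sine waves. Whether such a sum always exists, and whether it is unique, is exactly the kind of question that pushed Cantor toward infinite sets.

```python
import numpy as np

# Minimal sketch (mine, not from the post): approximate a square wave by a
# finite sum of sines, in the spirit of Fourier's decomposition of wavefronts.
t = np.linspace(0, 2 * np.pi, 1000)
square = np.sign(np.sin(t))          # the "complex wavefront"

approx = np.zeros_like(t)
for k in range(1, 50, 2):            # odd harmonics only
    approx += (4 / np.pi) * np.sin(k * t) / k   # standard square-wave coefficients

i = len(t) // 4                      # a point away from the jumps, t ~ pi/2
print(f"square wave: {square[i]:.3f}, 25-harmonic approximation: {approx[i]:.3f}")
```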
K Weierstrass


To control the mathematics of infinite sets, mathematicians try to make them work as much like finite sets as possible without contradiction. There is a very clear history of this idea, which either began with Cauchy or was believed by his contemporaries to begin with Cauchy, and simultaneously (and explicitly) with Bolzano. The methods then passed through Weierstrass into the wide mathematical world. This method was and is called "analysis". By the analysis (in the ordinary sense) of seemingly infinite statements, you convert them to limits of finite statements; then there is a simple, official way of turning limits into finite statements. The official way is by considering "Cauchy Sequences". Cantor was a student of Weierstrass. He thought - and probably correctly! - that his methods were elucidations of this technique. But there were difficulties, obscurities and even some contradictions in Cantor's methods, or at least in his explanations of them. Cantor fought off the contradictions with a technical trick: he said that while there may be infinite sequences of infinite sets, each infinitely larger than the infinite set before it, there were still some things (now called "Proper Classes" in English) too big to be considered sets. But why? This was an arbitrary metaphysical statement made to correct an obvious contradiction, not something that was intuitively clear before the contradiction was realized. Frege rose to the challenge: he declared that he could and would find a logically clear system that would contain all the good and useful discoveries of Cantor.
G Frege

Incidentally, Frege's system contained a technical flaw that made it vulnerable to the above-mentioned paradox. This flaw is so minor, it is somewhat surprising that he made it. Had he not, mathematics could have been seriously damaged - formal systems could have been taken as a closed field instead of an open one!

Anyway, Frege's most powerful innovation was the notion of a Quantifier. Kant taught us that existence is not a predicate, but then what the hell is it? Frege figured that out. Doing so made almost all previous logics technically obsolete, and for this reason alone Frege would deserve a place in the philosophical pantheon. Wittgenstein realized that there was a difficulty in interpreting a quantifier, if we want to hold on to the extremely fruitful analysis methods of Weierstrass and Bolzano. Wittgenstein was not a naive finitist. He could understand and believed in Cauchy Sequences. I'm going to teach you what Wittgenstein thought was wrong with quantifiers. Let's suppose there is one quantifier, read "There exists". Then the sentence below says "There exists a p such that p has property D.":

$$\exists p. D(p)$$

This has obvious applications to mathematics and interacts beautifully with sets. For instance, we can say that "There exists p in a set S such that p has property D.":

$$\exists p \in S. D(p)$$

Now to get specific. Let's choose a set: \( S = N_n\) is the set of all natural numbers less than or equal to n (so that, for instance, \(N_3 = \{1,2,3\} \)), and let \(D = D_n \) be the property of being the largest number in the domain (specified in the quantifier). Then for every n:

$$\exists p \in N_n. D_n(p)$$

is read "For every n, there exists a largest number in the set of natural numbers less than or equal to n." Since n itself has that property, this is always true. The method of analysis then suggests that we take a limit:

$$\lim_{n \to \infty}{\exists p \in N_n. D_n(p)}$$

And since this is true for every finite n, this statement is ... false! What madness is this? Wittgenstein was right to object. The problem is that quantifiers don't necessarily obey the rules of analysis that made the work of Cauchy, Bolzano and Weierstrass so powerful. There may be a largest number in every finite set, but not necessarily in an infinite set - indeed, there never is a largest number in an infinite set of integers! What rules were there, then? Many insisted on not giving rules; others (most importantly, Hilbert) used any possible rules they desired to make proofs!
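To spell the step out (this is my own formalization of the \(D_n\) defined above, not Wittgenstein's notation), write "p is the largest number in the domain" as "nothing in the domain exceeds p". The two statements being contrasted are then

$$\forall n:\ \exists p \in N_n .\ \forall q \in N_n .\ q \le p \qquad \text{(true: take } p = n\text{)}$$

$$\exists p \in \mathbb{N} .\ \forall q \in \mathbb{N} .\ q \le p \qquad \text{(false: } p + 1 > p\text{)}$$

Passing from the first to the second is what "taking the limit" of the quantified statement amounts to, and it does not preserve truth.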
A and K Goedel

Is there something wrong here? Not mathematically, as Goedel taught us in his completeness and compactness theorems: we can take these limits to mean what we choose and never come to contradiction - unless there was already one in the finite cases. Hilbert was right: if you could prove something like the (relative) consistency of one interpretation - one set of rules - you got the rest for free. This is now called "model theory". Wittgenstein objected to this: Frege did not offer us the foundations of an infinite number of differing models of arithmetic, he offered us the groundwork of ordinary arithmetic! Perhaps there is a deep philosophical problem here. If there is, we mathematicians, we Cantorians, we infinity-lovers do not and never will lose sleep over it. Wittgenstein was perhaps right to point out that standard arithmetic has this unintuitive property, and equally right to point out that quantifiers do not have the properties we might like - not if we also want standard models. But this is a philosophical (and perhaps pedagogical) problem. We have wandered our way out of the fly bottle. We don't ask to go back in for more.

Sunday, July 27, 2014

Idle Chatter

I don't feel like working on Samuelson today, having had a very difficult week of doing similar things. I'll need to do it tomorrow though, or I'll never get back to it.

What is the name of the derivative of the arctan function? That is, the pdf of the Cauchy distribution. In a problem of static superconductors, the magnetic field is distributed (in one direction) as a Cauchy distribution. This is cool and all, but is there a name for that bell-curve-esque shape? You might wonder why this would be the case. Electro- and magnetostatics can be thought of as non-varying fields, true, but they can also be thought of as asking what the stable distributions are for randomly walking electrons in a given geometry. In any other configuration the electrons would move and they'd drag the field around with them. In the discrete electrostatic case, the unique solution is given by Kirchhoff's circuit laws, in the continuous case by the Laplace or Poisson equations. These equations can get very interesting; I've examined a fairly fundamental case that has given rise to Cauchy distributions and chaos! The study of harmonic functions truly is a noble cause.
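A quick numerical check of the opening question (a sketch of my own, using numpy and scipy): the standard Cauchy pdf is exactly (1/π) times the derivative of arctan, which is another way of saying arctan is, up to scaling and shift, the Cauchy cdf.

```python
import numpy as np
from scipy.stats import cauchy

# Sketch (mine): verify that d/dx arctan(x) / pi equals the standard Cauchy pdf
# 1 / (pi * (1 + x^2)), and that arctan(x)/pi + 1/2 is the Cauchy cdf.
x = np.linspace(-5, 5, 11)
h = 1e-6
numeric_derivative = (np.arctan(x + h) - np.arctan(x - h)) / (2 * h)

print(np.allclose(numeric_derivative / np.pi, cauchy.pdf(x)))   # True
print(np.allclose(np.arctan(x) / np.pi + 0.5, cauchy.cdf(x)))   # True
```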

My least favorite philosopher of science is Paul Feyerabend, who I think was not only wrong on every level, but deeply and unsalvageably wrong. Popper is drenched in technical errors and shortcomings, but might be saved on some level (in fact, I think he has been, and that level is called learning theory). Lakatos is good, but not interested in the questions I happen to be. But then there's Feyerabend! For instance, I can't understand how he could even have opinions on the materiality of the mind - why should he reject the evidence of psychic phenomena, not to mention angels and other spirits? There are standards when he wants them (in Against Method, he defends voodoo), but not when he doesn't. Feyerabend on renormalization is a hoot. Even if he understood it, he doesn't seem to ever have expected that it would be used throughout material science (or, at least, in its condensed matter underpinnings) - that most practical of beasts - and doesn't even seem to have understood what was difficult about it in the first place. Perhaps I should write something on that; I like multi-scale methods better than perturbation ones and they are not as widely reprinted. Feyerabend wrote an accidentally hilarious letter asking why physicists don't seem to be engaging with philosophers (of course, that he was personally insufferable explained why they didn't interact with him...), never seeming to realize that physicists just got over the puzzles that bothered philosophers like him. His personal judgements were completely off-base: we all remember how stuffy and overly professional Feynman was, right? His lack of charity towards physicists was typical of him. He would likely claim that he was not being critical, but I regard his self-claims as nearly as worthless as those of Popper (apparently nobody ever understood Popper, and apparent criticisms were all shallow misreadings, of course). Perhaps Feyerabend is better at criticism ("All these philosophies make these mistakes") than at positive construction. If so, is there any reason for a non-philosopher to take Feyerabend seriously?

A brief thought: Everyone knows the "broken window fallacy": just because people are put to work to replace windows doesn't mean that society is richer. But stealing a window doesn't break it. People are put to work, and we still have a window! What is the welfare basis for property? Why do animals have property?

With My Mind On My Money And My Money On My Mind

Why is money valuable? Usual theories of value involve scarcity or labor (I want to be inclusive), but the dominant form of money for decades now has required no labor to make and has no real scarcity. Instead it has existed by government decree - "fiat money". So why is money valuable?
A Smith
The first theory as to the origin of money comes from Adam Smith. In his chapter "On The Origin and Use of Money", Adam Smith sketches his theory as to the value of money:

"In the rude ages of society, cattle are said to have been the common instrument of commerce; and, though they must have been a most inconvenient one, yet in old times we find things were frequently valued according to the number of cattle which had been given in exchange for them. The armour of Diomede, says Homer, cost only nine oxen; but that of Glaucus cost an hundred oxen. Salt is said to be the common instrument of commerce and exchanges in Abyssinia; a species of shells in some parts of the coast of India; dried cod at Newfoundland; tobacco in Virginia; sugar in some of our West India colonies; hides or dressed leather in some other countries; and there is at this day a village in Scotland where it is not uncommon, I am told, for a workman to carry nails instead of money to the baker's shop or the alehouse.
...
In all countries, however, men seem at last to have been determined by irresistible reasons to give the preference, for this employment, to metals above every other commodity. Metals can not only be kept with as little loss as any other commodity, scarce anything being less perishable than they are, but they can likewise, without any loss, be divided into any number of parts, as by fusion those parts can easily be reunited again; a quality which no other equally durable commodities possess, and which more than any other quality renders them fit to be the instruments of commerce and circulation. The man who wanted to buy salt, for example, and had nothing but cattle to give in exchange for it, must have been obliged to buy salt to the value of a whole ox, or a whole sheep at a time. He could seldom buy less than this, because what he was to give for it could seldom be divided without loss; and if he had a mind to buy more, he must, for the same reasons, have been obliged to buy double or triple the quantity, the value, to wit, of two or three oxen, or of two or three sheep. If, on the contrary, instead of sheep or oxen, he had metals to give in exchange for it, he could easily proportion the quantity of the metal to the precise quantity of the commodity which he had immediate occasion for."

This theory gives a special place to metals, an empirical observation that is no longer of any relevance. People have traded by their own nature since time immemorial and looked around for something to act well as a medium of exchange. In Smith's time, metals did the job (for many tasks), imperfectly, because they were heavy and still not infinitely divisible. Since numbers are lighter, more durable and more divisible than metals, they have since replaced them. However, this doesn't explain the value of money, just its convenience. Why can't ten thousand flowers bloom? Why is there a ruling currency in the land?
W S Jevons

After the marginal revolution, the problem became more acute. Money was, except in its value as money, fairly worthless. Even though metal was scarce, it was not very useful, even on the margin, except as money (though the other uses did sometimes matter - for instance, Newton's accidental placing of England on a gold standard). Jevons and Menger, two innovators of neo-classical methods, gave a historical story of the origin of money. From Menger:

"[W]e have to explain why it is that the economic man is ready to accept a certain kind of commodity, even if he does not need it, or if his need of it is already supplied, in exchange for all the goods he has brought to market, while it is none the less what he needs that he consults in the first instance, with respect to the goods he intends to acquire in the course of his transactions. ... Consider how seldom it is the case, that a commodity owned by somebody is of less value in use than another commodity owned by somebody else! And for the latter just the opposite relation is the case. But how much more seldom does it happen that these two bodies meet! Think, indeed, of the peculiar difficulties obstructing the immediate barter of goods in those cases, where supply and demand do not quantitatively coincide; where, e.g., an indivisible commodity is to be exchanged for a variety of goods in the possession of different persons, or indeed for such commodities as are only in demand at different times and can be supplied only by different persons! Even in the relatively simple and so often recurring case, where an economic unit, A, requires a commodity possessed by B, and B requires one possessed by C, while C wants one that is owned by A -- even here, under a rule of mere barter, the exchange of the goods in question would as a rule be of necessity left undone."

This was termed by Jevons the "Coincidence of Wants" theory of money's origin. It is obviously false. In society before money, people would happily trade for favors and future recompense. Trade was simplified through implicit conventions, rather than enforced institutional ones. There might be inconvenience in divisibility, but this was still not solved by Menger's time - a coin was still an individual coin and a piece of eight was still a unit.
F Hahn

The origin of the value of money was put into modern terms by the economist Frank Hahn. He had in mind a Walras-Arrow-Debreu-McKenzie model of an equilibrium economy. Whence then did money come? The auctioneer has no need for "coincidence of wants"; he moves everybody's goods at once. The usual solution is that the auctioneer can demand a small amount of a particular good (tax some money) from the agents; this induces a small positive demand for money, then Gresham's law takes over and drives all other goods out of the means-of-exchange game. This solution was anticipated by the monetary theorist Ludwig von Mises, but his verbiage is out of date and his conclusions included presumptions about institutions that are not well supported by data (namely, he assumed that a bureaucracy would always and only have pressure to inflate. This was arguably well borne out by experience up to, but not after, his time).

All of these theories are well presented here, by the brilliant Katsuhito Iwai. Iwai's exposition could use some Englishing up, and that was the original point of this post. But then I let life get in the way. Some other time!

Thursday, July 24, 2014

Two Camp Songs



I am slowly going crazy
1, 2, 3, 4, 5, 6, Switch!
Slowly going crazy, am I?
6, 5, 4, 3, 2, 1, Switch!




Nobody likes us
Everybody hates us
Think we'll go eat worms!
Big fat juicy ones,
Eensie weensy squeensy ones,
See how they wiggle and squirm!

Wednesday, July 23, 2014

Are Non-Parametric Methods Simpler To Teach Than Parametric Ones?

There is an adage that any title that ends in a question mark can be answered "No.". However, this proposal might have some weight. I have seen students take an entire course in statistics and not be able to identify what a statistic even is, which is almost a crime. Some of the blame comes from uninterested students cramming to fill a requirement, but part of the problem is how statistics is taught. Students are not taught to see what a statistic is in its natural environment. In fact, with automated statistics programs, it would be much easier to teach non-parametric and data driven statistics first, then teach parametric statistics with regression analysis in a second course. Instead, it's instantly on to z-scores and t-scores, as if just because Fisher found them first they must be the easiest!

But simplest in theory is not simplest to learn. A better method would be to emphasize probability distributions, data driven methods, non-parametric tests and the use of statistical software (in a stats-for-life, non-calculus based statistics course especially!). These involve more complex theorems, but I'm talking about classes in which the central limit theorem isn't proven in the current system anyway. If you aren't proving the theorems anyway, it doesn't matter how difficult it is to prove them. Anyway, the most important fact in statistics - what makes statistics work at all - is that gathered data can be used to generate a probability distribution. Everything that we learn from statistics consists of facts about this distribution. Again, this is not emphasized in stats-for-life classes, and I don't know why. In my opinion, an entire section of the course, perhaps a month, should be spent on taking data and looking at a distribution, a pdf and a cdf curve. The student should learn what a statistic is by relating statistics to the geometric pictures they are getting from gathered or generated data. For instance, the measures of center (means, medians, modes) should be related to the actually observed centers in the data. This is supplemented with the use of statistical software - perhaps R for advanced students, Excel for less advanced students - to show how statistics are found in practice. Once the students understand that a statistic is a function of a distribution, then we can move on to tests. Several non-parametric tests, such as the Kolmogorov-Smirnov test and cdf-based nonparametric confidence intervals, are easily related to the geometry of the distributions and easily coded. This experience will teach the students the point and practice of statistical tests better than z-scores as they are currently taught, because the way z-scores are currently taught relates them to a distribution that doesn't obviously come out of data. The students are unused to thinking about data as a distribution because they are taught only a few distributions and only given the CLT as a heavenly cheat that means we don't have to think about how large sets of data will be distributed in order to estimate the mean (which even by itself is not language they are used to!). Obviously the normal distribution frequently comes out of data asymptotically, but I've found that the students find this too many hurdles to leap at once. If we give up that useful fact and concentrate on teaching the basics - distributions, statistics and statistical tests - it will seem less magical to the students.
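A minimal sketch of the kind of first exercise I have in mind (my own construction, in Python rather than R or Excel): take raw data, build the empirical distribution, and read the familiar statistics off it as geometric features.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.exponential(scale=2.0, size=500)     # stand-in for any gathered data

# The empirical cdf: F(t) = fraction of observations <= t.
xs = np.sort(data)
ecdf = np.arange(1, len(xs) + 1) / len(xs)

# Statistics as functions of that distribution:
mean = np.mean(data)                            # "center of gravity" of the pdf
median = xs[np.searchsorted(ecdf, 0.5)]         # where the cdf crosses one half
counts, edges = np.histogram(data, bins=30)
modal_bin = edges[np.argmax(counts)]            # "find the max" of the binned pdf

print(f"mean = {mean:.2f}, median = {median:.2f}, modal bin starts at {modal_bin:.2f}")
```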

A personal note: I remember the first time I taught statistics for the same reason that I remember the first time I drove a car that caught fire - the nightmares. However, the students did react to some things well. I had them run a roulette simulation in Excel, to show that asymptotically they would lose money. Seeing the data and the trends helped them immensely; they learned a lot and enjoyed it. In retrospect, I realize I could have taught much more like this. In fact, everyone can. I could have had them make a kernel density estimate of the roulette outcomes, so that they would realize it is a pdf. I could have had them find the statistics of that distribution, or run a regression and relate the regression to the statistics. All opportunities wasted.
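For what it's worth, here is a sketch of that roulette exercise (the betting rule and the wheel are my assumptions, and Python stands in for Excel): bet a dollar on red every spin and watch the running average settle near the house edge.

```python
import numpy as np

rng = np.random.default_rng(1)
spins = 100_000
red = rng.random(spins) < 18 / 38            # American wheel: 18 red of 38 slots
winnings = np.where(red, 1.0, -1.0)          # +$1 when red hits, -$1 otherwise

running_average = np.cumsum(winnings) / np.arange(1, spins + 1)
print(f"average winnings per spin: {running_average[-1]:.4f}")
print(f"expected value:            {-2 / 38:.4f}")
```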

To summarize:
1. Statistics stands on three pillars.
2. The first pillar is that data induces a probability distribution.
2a. But in current statistical teaching practice, students are not drilled into instantly putting data into a probability distribution. Since this can be done easily with technology, they should be.
2b. This means that even though non-parametric statistics is in general harder for statisticians, this doesn't mean it is harder for students to learn. After all, we can control the data they use so that they see mostly well behaved data in homework.
3. The second pillar is that certain functions of those empirical distributions capture the facts about the distribution that we care about. These functions are called statistics.
3a. But in current statistical teaching practice, statistics are introduced piecemeal (even then, only the measures of center and the variance) and poorly connected to probability distributions. Once a student is used to creating probability distributions from data in the previous step, computing functionals of the distributions is easy. For instance "find the max" is equivalent to computing a mode, "find the center of gravity" is equivalent to finding the mean and "find the middle" is equivalent to finding the median.
3b. This assignment is an extension of the above - the previous one was "given data, draw the empirical distribution"; this one is "given data, draw the empirical distribution, print it out and make marks at certain spots". This can be supplemented with using technology to find these automatically.
4. The final pillar of statistics is statistical testing: do the statistics of the data say what we want?
4a. But in current statistical teaching practice, statistical testing is introduced mainly in a specific case, z and t scores. This means that statistical testing is presented piecemeal, with only a few sentences said for justification. To understand this choice, one must understand the central limit theorem. Answering test questions requires the teacher to hand a distribution to the student, which detaches the student from the process of going from data to geometry to conclusion.
4b. This assignment is an extension of the above - the previous one was "given data, draw the empirical distribution, print it out and make marks at certain spots"; this one is "given data, draw the empirical distribution, print it out, make marks at certain spots, then draw a confidence interval around those spots". This can be supplemented with using technology to find the intervals automatically (see the sketch after this list).
5. Whereas the teaching of z-scores and t-scores encourages students to use those tests without justification, the teaching of non-parametric statistics in this manner will encourage students to think first about the empirical distribution and which statistics are important, which will give them access to a larger toolkit. In a second course on statistics, it can further be pointed out that for the statistics of greatest interest certain distributional forms can be expected, meaning z-scores and t-scores can come back into the curriculum, this time in the proper place.
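Here is the sketch promised in 4b (my own construction): the empirical cdf of a sample with a DKW confidence band around it, plus a two-sample Kolmogorov-Smirnov test comparing two empirical cdfs directly.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(2)
a = rng.normal(0.0, 1.0, 200)
b = rng.normal(0.3, 1.0, 200)

# Empirical cdf of the first sample and a 95% DKW confidence band around it.
xs = np.sort(a)
ecdf = np.arange(1, len(xs) + 1) / len(xs)
eps = np.sqrt(np.log(2 / 0.05) / (2 * len(xs)))   # DKW band half-width
band_low = np.clip(ecdf - eps, 0.0, 1.0)
band_high = np.clip(ecdf + eps, 0.0, 1.0)
print(f"cdf at the sample median lies in [{band_low[99]:.2f}, {band_high[99]:.2f}]")

# Kolmogorov-Smirnov: the largest vertical gap between the two empirical cdfs.
result = ks_2samp(a, b)
print(f"KS statistic = {result.statistic:.3f}, p-value = {result.pvalue:.3f}")
```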

Alternatively, we can tell people that good science is whatever weird randomness you can get in a lab, testing be damned. After all, some prefer theft to honest toil...

Tuesday, July 22, 2014

Maxwell's Demon


In Theory of Heat, J C Maxwell - one of the greatest physicists of all time - attempted to illustrate the new theories of molecular flux and thermodynamics in a form as complete as the science allowed. He finished this section with a thought experiment that purported to show a "Limitation of the Second Law of Thermodynamics": "Before I conclude, I wish to direct attention to an aspect of the molecular theory which deserves consideration". He introduced the idea we now call Maxwell's Demon, meant to illustrate the nature of Maxwell's thoughts on the Second Law. I'll let Maxwell illustrate:

"One of the best established facts in thermodynamics is that it is impossible in a system enclosed in an envelope which permits neither change of volume nor passage of heat, and in which both the temperature and the pressure are everywhere the same, to produce any inequality of temperature or of pressure without the expenditure of work. This is the second law of thermodynamics, and it is undoubtedly true as long as we can deal with bodies only in mass, and have no power of perceiving or handling the separate molecules of which they are made up. But if we conceive a being whose faculties are so sharpened that he can follow every molecule in its course, such a being, whose attributes are still as essentially finite as our own, would be able to do what is at present impossible to us. For we have seen that the molecules in a vessel full of air at uniform temperature are moving with velocities by no means uniform, though the mean velocity of any great number of them, arbitrarily selected, is almost exactly uniform. Now let us suppose that such a vessel is divided into two portions, A and B, by a division in which there is a small hole, and that a being, who can see the individual molecules, opens and closes this hole, so as to allow only the swifter molecules to pass from A to B, and only the slower ones to pass from B to A. He will thus, without expenditure of work, raise the temperature of B and lower that of A, in contradiction to the second law of thermodynamics."

The Wikipedia image is much better than the one I tried to make.

This brief thought experiment has given rise to a minor, but interesting, literature on whether Maxwell's reasoning is correct. Surprisingly, smart money says "No."! The primary difficulty in Maxwell's thought experiment is his assumption that, just because you can capture one fast particle, you can continue to capture more. In fact, this is a direct violation of the principle of detailed balance - if the door is open for any length of time, it is as likely to let a particle out as in (there are as many fast moving particles on one side of the door as the other, after all!). Maxwell's reasoning is therefore circular: it assumes that if one could violate the second law, then one could. Another way of putting this is that he did not include the work done by the demon as a part of the system. If the demon is considered a rectifier or computing device, then the entropy of this device must be such that equilibrium will still be reached. This approach is demonstrated in a state of great excellence in this paper. Since the invariance of phase volume is a principle of mechanics, the circularity of the reasoning described is revealed. This second approach - really the first approach in new clothing - was pioneered by Szilard and brought to a state of modernity by Landauer. In this paper, the Szilard-Landauer approach is given a simple model which is solved explicitly. They don't go into detail about the equivalence of these lines of thinking; in fact, I don't know if anyone has bothered to do so. Incidentally, this literature has been tough for me to track down, even some of the most famous papers by Smoluchowski (as far as I know, Experimentally Verifiable Molecular Phenomena that Contradict Ordinary Thermodynamics has never been translated!). Still, I can give examples of pieces of the literature. This literature is not pure theory; it also includes plentiful experimental and numerical examinations of these thought experiments. This excellent paper includes both a good summary of the issues and a formal model of the Smoluchowski trapdoor, showing precisely how it fails. This paper shows how well numerical experiments can clarify and elucidate, something near to my own heart.
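A toy numerical illustration of the point (entirely my own construction, not a model from the papers linked above): particles in two boxes exchange through a door. A door that does not look at the particles leaves the two boxes at the same average energy, while a "demon" door that only passes fast particles one way and slow particles the other manufactures a temperature difference - precisely the discriminating work the second law makes you pay for.

```python
import random

random.seed(0)

# Each "particle" is just a kinetic energy drawn from the same distribution,
# so both boxes start at the same temperature.
box_A = [random.expovariate(1.0) for _ in range(5000)]
box_B = [random.expovariate(1.0) for _ in range(5000)]

def step(a, b, demon):
    """A particle arrives at the door from a random side; decide if it passes."""
    if random.random() < 0.5:
        src, dst, a_to_b = a, b, True
    else:
        src, dst, a_to_b = b, a, False
    i = random.randrange(len(src))
    energy = src[i]
    if demon:
        # Only fast particles may go A -> B, only slow ones B -> A.
        allowed = energy > 1.0 if a_to_b else energy < 1.0
    else:
        allowed = True  # an undiscriminating door: detailed balance, no sorting
    if allowed and len(src) > 1:
        dst.append(src.pop(i))

for demon in (False, True):
    a, b = list(box_A), list(box_B)
    for _ in range(200_000):
        step(a, b, demon)
    print(f"demon={demon}: mean energy A = {sum(a)/len(a):.2f}, B = {sum(b)/len(b):.2f}")
```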

Before I go, I should mention that I first became aware of this literature through the Feynman Lectures on Computation, which includes chapters on quantum computing, reversible computing and the physics of computation. His discussion of these issues probably influenced me a lot, but I don't want to dig it out of its current location. Feynman also made a sizable contribution to this literature in his Lectures on Physics, where he introduced the Brownian ratchet to illustrate the concepts above. This section is a very good example of how theory can be used to elucidate.
Much, much better than my attempts  

The two sides of the ratchet sit in two boxes of gas, and a mass is tied to the axle between them. Randomly, the gas will push the blades in tank 1 (at temperature 1) left, while the pawl in tank 2 (at temperature 2) stops the ratchet from moving right. This means that, just like the Smoluchowski trap, the ratchet works as a rectifier. Feynman analyzes in detail why this fails - and it fails for the same reason that Maxwell's failed. There is a presumption that the pawl is not subject to the same random fluctuations, in other words that one need not worry about the gas pushing the device the other way. This whole chapter is worth reading (of course, the entire book is worth reading...) but I will only reprint his final words:

"If T2 were less than T1, [then] the ratchet would go forward, as anybody will believe. But what is hard to believe, at first sight, is the opposite. If T2 is greater than T1, the ratchet goes the opposite way! A dynamic ratchet with lots of heat in it runs itself backwards, because the ratchet pawl is always bouncing. If the pawl, for a moment, is on the incline somewhere, it pushes the inclined plane sideways. But it is [almost] always on an incline plane, because if it happens to lift up high enough to get past the point of a tooth, then the inclined plane slides by, and it comes down again on an inclined plane. So a hot ratchet and pawl is ideally built to go around in a direction exactly opposite to that for which it was originally designed!"

Exciting stuff!

Thursday, July 17, 2014

Long Run Relative Frequency of Classification Error

Not everything that is called "Bayes" is Bayesian! Bayesian theory is an approach to probability, one with interesting philosophical and technical consequences. There is not a unique system that is itself "Bayesianism", since different folks emphasize different things. That said, Savage and Jaynes are notable references.
But there is Bayes and there is Bayes. Some things, like the Bayes error, are not Bayes (indeed, Bayes' theorem will not even be used in this post). Instead, the Bayes error is best thought of as the long run relative frequency with which a particular classifier will misclassify observations.


Consider two overlapping uniform probability distributions:

A simple classifier is a vertical line. This separates the plane into two parts. We will label one part green and another part blue:
The classifier mistakes some of the blue for green and some of the green for blue. This means that the error is the area:

If we let \(C_i \) be the event of being labelled with color \(i\), then the formula for the error is

\[ p(error) = \int_{-1}^0 p_g(x)\,dx+\int_0^1 p_b(x)\,dx \]

\[ p(error) = \int_{-\infty}^\infty p_g(x \cap C_b)\,dx+\int_{-\infty}^\infty p_b(x \cap C_g)\,dx \]
Since the definition of conditional probability is \(p(A|B) = p(A \cap B) / p(B)\), the above can be written:

\[ p(error) = \int_{-\infty}^\infty p_g(x|C_b)p_g(C_b)dx+\int_{-\infty}^\infty p_b(x|C_g) p_b(C_g) dx \]

This last one is the general definition of the Bayes error. One can easily see that this integral gives a line for the error in the above situation: as the classifier line moves, the error changes linearly. Therefore, moving in one direction is always optimal, until one reaches the edge of the higher uniform pdf. This is only true in this trivial example, but it makes the geometric interpretation obvious. One can compute this for several pdfs and get semi-parametric bounds via Tchebysheff's inequality.
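For concreteness, here is a numerical version of the integral above (the two uniform densities and the equal class priors are my own assumptions, since the post's figures are not reproduced here):

```python
import numpy as np

# Green is uniform on [-1, 1] (height 0.5), blue is uniform on [0, 1] (height 1),
# and the classifier is a vertical line at c: "green" to its left, "blue" to its right.
def p_green(x):
    return np.where((x >= -1) & (x <= 1), 0.5, 0.0)

def p_blue(x):
    return np.where((x >= 0) & (x <= 1), 1.0, 0.0)

xs = np.linspace(-3, 3, 60001)
dx = xs[1] - xs[0]

def bayes_error(c, prior_green=0.5, prior_blue=0.5):
    green_labelled_blue = np.sum(p_green(xs)[xs > c]) * dx
    blue_labelled_green = np.sum(p_blue(xs)[xs <= c]) * dx
    return prior_green * green_labelled_blue + prior_blue * blue_labelled_green

for c in (-0.5, 0.0, 0.5, 1.0):
    print(f"classifier at {c:+.1f}: error = {bayes_error(c):.3f}")
# The error is piecewise linear in c and smallest at the edge of the taller
# pdf (c = 0 here), which is the geometric point made above.
```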

No posts for the next three days as I will be out of town.

Wednesday, July 16, 2014

Hume and Edgeworth

Or: The Consistency of The English Philosophy!
Hume has been called "one of the most important philosophers in the English language", his skepticism and empiricism inspiring - in positive and negative ways - whole swaths of philosophy starting in the 19th century and continuing to this day. Hume is best known for his attack on the connection between empirics and metaphysics. He argued that we have no way of demonstrating empirically that an event necessarily causes another event. This is of great importance, and both the conclusion and the argument are pregnant with possibilities. He would later clarify that we do have reason to act on the belief that an event causes another event: their (near) constant conjunction instills in us a belief in (nearly) necessary cause. This positive solution - often ignored by philosophers - flowered into the associationist school of psychology, game theory, and the Bayesian school of statistics.
Francis Edgeworth was an English economist, the author of Mathematical Psychics. In this book, he attempted to give the Jevonian gloss on Utilitarianism a firmer mathematical foundation. Modern economics, with its utility functions etc., is a descendant of this line of thought (which sprang up in many places at several times) but is not intrinsically tied to it. Honestly, I haven't read Edgeworth's work in detail, but I have read a little bit of Mathematical Psychics and some of his stats papers for something I wrote on the history of statistics. In Mathematical Psychics, Edgeworth develops an ingenious device for thinking through bilateral trade, now called the Edgeworth Box.

I will use this graphical method to illustrate a famous passage of Hume on co-operation. A modern thinker might think it obvious that a non-linguistic, behavioral definition of convention and co-operation is possible, and indeed since Lewis it has been standard to found the concept of linguistic meaning on pre-linguistic ideas of co-operation. In Hume's time, it was not so obvious. Hume had to explain "... [C]onvention is not of the nature of a promise: For even promises themselves, as we shall see afterwards, arise from human conventions.". Therefore, Hume gave this as illustration: "Two men, who pull the oars of a boat, do it by an agreement or convention, tho’ they have never given promises to each other.". This is a very important point! Not only do Hume and Lewis, and biologists following them, tell us that this is how meaning got into languages ("In like manner are languages gradually establish’d by human conventions..."), it was immediately used by Hume to explain how property got into society:

"Nor is the rule concerning the stability of possession the less deriv’d from human conventions, that it arises gradually, and acquires force by a slow progression, and by our repeated experience of the inconveniences of transgressing it."

These passages of Hume are pregnant with theory, and the modernity of the theory is sometimes surprising. Hume's theoretical stances - which are those of evolutionary game theory, if I may be anachronistic - run deep. Property is, he says, some sort of evolutionary strategy, one with advantages and disadvantages. Hume, obviously, believes the advantages outweigh the disadvantages. This is a story about property that can extend beyond humanity (Hume was wrong to deny this), and it has been used - by Maynard Smith and others - to examine the phenomenon of nesting in animals.

Let's analyze one of these pieces with the more modern equipment of an Edgeworth Box. Two men, Mr Blue and Mr Green, pull the oars of a boat. They must paddle at the same speed in order to avoid moving in a circle. Even if these men do not speak the same language, they can and will co-ordinate. We will ask more than Hume does explicitly here (he makes more assumptions implicitly elsewhere): we will ask that the men understand that you cannot go faster than the slower paddler (we don't assume that they know the other's strength). The possible speeds they can go are a set of real numbers, setting up an axis:

That gives the following box as the range of possibilities:

Our assumption about their preferences gives them Leontief indifference curves. For Mr Blue, his indifference curves are:

And for Mr Green:

Putting these together, we get:

Let's say they just start rowing at some speed. That means they get something like the following:

Anywhere inside the square of which that dot is the corner is better for both! A simple way to think about it is that the faster rower knows to slow down, but not slower than the slower rower, and the slower rower knows to speed up, but not faster than the faster rower. Eventually, the rowers will come to a corner on both curves. Here, neither rower can improve by himself. This is a stable situation! Here is one possible solution:

Edgeworth pointed out that this is not the only solution. In fact, there is a continuum of solutions called the "contract curve" or "core":

This is the Edgeworth analysis of Hume. Hume's correctness is not in doubt in this manner of thinking. Hopefully this shows both the depth of Hume's thinking and how it relates to modern ideas. I wouldn't mind if it helped one understand the modern ideas a little better too. There are more extensions that can be made (what happens as the number of rowers climbs? What if they can only imperfectly measure the other's speed? What kind of equilibrium did we obtain?). Notice that Hume didn't make any explicit assumptions about the nature of their preferences, yet the Edgeworth explanation explicitly assumes convex preferences. Can non-convexity be made sense of here? What other interpretations of Hume are possible - do any of them attack the substance of this translation?
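The adjustment story above is simple enough to simulate. Here is a minimal sketch (the adjustment rule is my own, not Hume's or Edgeworth's): each rower moves his speed toward the speed he observes in the other, and they meet at a point from which neither can improve alone.

```python
def row(blue_speed, green_speed, steps=20, adjustment=0.5):
    """The faster rower slows toward the slower one, the slower speeds up toward
    the faster one; no shared language is needed for either move."""
    for _ in range(steps):
        target_blue, target_green = green_speed, blue_speed
        blue_speed += adjustment * (target_blue - blue_speed)
        green_speed += adjustment * (target_green - green_speed)
    return blue_speed, green_speed

print(row(blue_speed=6.0, green_speed=2.0))   # both converge to 4.0
# With this symmetric rule they meet at the average; other rules would pick out
# other points on Edgeworth's contract curve.
```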

Incidentally, I had a devil of a problem making the images for this post. Matlab decided some of the lines I drew just weren't good enough for her. Awful thing it is, when I turned off the axis, the invisible axis was over the lines I actually drew. Goddamn thing. Some of these images are corrected, some not. The ones which were not may change if I come back later.

Tuesday, July 15, 2014

I've been working on a post explaining the Uncertainty Principle using induced matrix norms. It might surprise you that such a simple thing to do (it's a very standard method of proof) might take time, but I want to define and illustrate every single theoretical term verbally, algebraically and geometrically, so it's a little time consuming. Uncertainty principles are a consequence of linearity. It's got me wondering if there are many more applications than are usually considered - "usually considered" meaning quantum mechanics and linear filters. I've seen real applications to game theory, which make me wonder about applications to wider economic models (in fact, just thinking about applications to things which are two player zero-sum games in disguise would make for a fun post). The invoked theorem is for a non-linear operator, which might make one hopeful. What about applications to, say, Input-Output Models (since these are just LTI systems, all the theory should come over...)? GE models? How about other game theory models, such as signalling games? So much to think about!

Via Leisure of the Theory Class, a neat problem from an old novel - Typhoon by Joseph Conrad. A group of mostly Chinese and Indian laborers board a ship to return to the motherland. They store their life savings in chests aboard this ship. The chests are destroyed in a typhoon. How can you distribute the money in a way fair to each person? You know the amount of silver and can ask each person how much they stored. Here is one solution.

If I post and link much more about economics, people will think I'm an economist. Here's something closer to my heart: a somewhat trivial, but instructive!, application of Morse Theory. With a pencil and paper, even one who is not intimate with the mysteries can come to a really nice understanding of how Morse Theory implies a conserved quantity for figures in two dimensions, and getting Euler's Formula as a result is a nice plus! The important bit is understanding what he calls the "Morse Lemma", which describes what a function looks like in a neighborhood around a (non-degenerate) critical point.

Monday, July 14, 2014

A Grab Bag of Thoughts About Democracy

I think people are too hard on Rousseau. The "will of all" and the "general will" are difficult doctrines that lead to confusion. And Arrow's Theorem must be contended with - saying democracy is "no more than a sum of particular wills" is evading the deep questions. How the sum is carried out matters. But this being said, there is little that makes Rousseau stranger than innumerable other 18th century thinkers - Locke looking for lessons of statecraft in the Garden of Eden, for instance. Indeed, when one looks into the guts of how Rousseau's system was to work, he was not as married to simple averaging as it might have seemed:

"It is therefore essential, if the general will is to be able to express itself, that there should be no partial society within the State, and that each citizen should think only his own thoughts: which was indeed the sublime and unique system established by the great Lycurgus. But if there are partial societies, it is best to have as many as possible and to prevent them from being unequal, as was done by Solon, Numa and Servius. These precautions are the only ones that can guarantee that the general will shall be always enlightened, and that the people shall in no way deceive itself."

If we interpret Rousseau charitably as a precursor to modern ideas about collective cognition, he becomes much more comprehensible. There are many even today who can't wrap their minds around any collective action more complex than simple averaging. And there are many others who cannot understand, but prostrate themselves before collective actions rising out of certain market structures. These men and women don't even have the luxury of being from the 18th century to excuse their confusion.

The received Viennese view of democracy is that it is not the details of voting that matter, that democratic control works because of "Competition for Political Leadership", in Schumpeter's words (incidentally, Schumpeter's wacky ideas about the importance of intellectuals are false in most societies, for instance the modern US, and are rather strange reading for the historically minded):

"It will be remembered that our chief troubles about the classical theory centered in the proposition that 'the people' hold a definite and rational opinion about every individual question and that they give effect to this opinion — in a democracy — by choosing 'representatives' who will see to it that that opinion is carried out. Thus the selection of the representatives is made secondary to the primary purpose of the democratic arrangement which is to vest the power of deciding political issues in the electorate. Suppose we reverse the roles of these two elements and make the deciding of issues by the electorate secondary to the election of the men who are to do the deciding. To put it differently, we now take the view that the role of the people is to produce a government, or else an intermediate body which in turn will produce a national executive or government. And we define: the democratic method is that institutional arrangement for arriving at political decisions in which individuals acquire the power to decide by means of a competitive struggle for the people’s vote."

This is not so far from an ordinary theory of democracy as it seems, and indeed it economizes on our assumptions. It explains why "one party state" is synonymous with tyranny. Rousseau held that politics was an "art". From this, it follows that expert artisans (statesmen) may be - and "may be" is doing a lot of work here! - necessary so that the artifice of government is beautiful. But this is not really much closer to a theory of democracy than Rousseau. It doesn't explain how government institutions loved by none - such as the NSA - remain in power, or how extremely unpopular doctrines, such as net non-neutrality, can be foisted on us. It doesn't explain how minority cabals - such as the pro-slavery Southern United States before the Civil War - can steal power from majorities (mostly by brinkmanship). Schumpeter would be forced to say that these situations are insufficiently democratic, and in so doing would reveal the secret orthodoxy of his doctrine. This view is also propounded by Karl Popper in a less idiosyncratic form. None of them seem to have been aware of, or cared about, the obvious connection between their ideas and the American Pragmatist philosophical movement.

A modern, Samuel Bowles, propounds the idea that liberal ideals (classical liberal, that is. Is democracy not a part of that? Even Schumpeter thought it was, but maybe he means something different than most of history...) comes from older traditions that it subverts. This is hard to understand, since liberal ideology have been part of at least English speaking since the 17th century (if we let them start with Hobbes) - and were dominant for huge stretches of that time. If they are subverting real morals, why would Bowles have first hand knowledge of them? True "recent studies show moral sentiments ... indicated that incentives that appeal to material self-interest often undermine interpersonal trust, reciprocity, fairness, and public generosity." but recent studies also show all sorts of wacky things about diets. More importantly and less sarcastically, Bowles (and Herbert Gintis, though it is well known that they are the same person) have presented studies showing that people who live in market societies actually perform well on on fairness games when not coached not to do well. It seems that liberal society - though not perhaps certain liberal ideologies - works fine - with "trust, reciprocity, [and] fairness" at least.

EDIT: The above paragraph is very sloppily written; it makes it seem that Bowles argues both for and against this thesis. I should make it clear that Bowles is presenting evidence that classical liberal society actually encourages social values. If a society puts its members in contact with a lot of strangers, it will be more stable if people are nice to strangers. I was going to focus this post entirely on non-moderns, but I remembered that Bowles paper and wanted to include something empirical. Unfortunately, I did so hastily, without checking whether what I said had any resemblance to what I meant. This is sloppy and silly, so I will re-write the paragraph so as to be slightly more descriptive of what the paper is about:

A modern, Samuel Bowles, examines [note: propounds sounds too much like "defends" to me] the idea that liberal ideals (classical liberal, that is. Is democracy not a part of that? Even Schumpeter thought it was, but maybe he [that is, Schumpeter] means something different than most of history...) come from older traditions that they subvert. This doctrine is hard [for me] to [even] understand, since liberal ideology has been part of at least the English-speaking world since the 17th century (if we let it start with Hobbes) - and was dominant for huge stretches of that time. If it is subverting real morals, why would [its proponents, people who live at the same time as Bowles] have first-hand knowledge of them? True, "recent studies show moral sentiments ... indicated that incentives that appeal to material self-interest often undermine interpersonal trust, reciprocity, fairness, and public generosity," but recent studies also show all sorts of wacky things about diets [a point that Bowles does not make, because he's going to need "recent studies" if he wants to do empirically grounded moral psychology]. More importantly and less sarcastically, Bowles (and Herbert Gintis, though it is well known that they are the same person) present studies in this paper showing that people who live in market societies actually perform well on fairness games when not coached not to do well. It seems that liberal society - though not perhaps certain liberal ideologies [such as?] - works fine - with "trust, reciprocity, [and] fairness" at least. [Are these studies good? Good compared to what? Perhaps I should have said something.]

Finally, the ever-brilliant Machiavelli propounded a fact of common observation that seems to be missed by intellectual opponents of democracy: that the people are wiser and more constant than their leaders. No ordinary person has ever called for regulation or deregulation of electricity (that is, of electricity directly, ignoring environmental considerations), since they care nothing about the industry save that electricity is cheap. Democracy is preferred, for Machiavelli, because it is harder for an entire country to fall for a fad than for a particular administration, and a king's court is nothing more than a long-lasting administration.

Sunday, July 13, 2014

Classical Thermodynamics From "Intuitive" Symmetries? Part 1

In my last post, I promised to talk about Paul Samuelson's paper "Conserved Energy Without Work Or Heat". Now I will do so. Even earlier, I had promised a post about Disney Princesses. While I have a variety of observations, I haven't yet put them together into a theme.

First, a word about our author.

Paul Samuelson

Paul Samuelson is one of the fathers of modern economics. Even more than with more famous economists like Keynes or Friedman, one can divide economics into pre- and post-Samuelson quite easily. Paul Samuelson's most important work was his dissertation, Foundations of Economic Analysis. In that book, he provided perhaps the first completely mathematically clear explanation of what it is economists were doing. When an economist says that, for instance, a given tax is good or bad, what he or she means is that if you change some given, the rest of the economy will eventually adjust, and where it settles will be better or worse than where it is now. A good example (with applications to finance) can be found here - and in innumerable other places! Samuelson made many other advances in almost every area of economics. Much can be said about his "scientific personality". He considered himself a child genius well into his 80s. He was immensely concerned with every aspect of scientific work: empirical, theoretical, philosophical, historical and pedagogical. Unlike many economists of his stature and influence, Samuelson almost never outright dismissed an economist or an economic theory, always describing them as containing nuggets of truth - of course, truths that he himself had formally obtained. His scientific ideal was that of classical thermodynamics, where the foundations are clear, the applications enormous and the empirical validity impeccable. Some have objected to his appreciation for classical thermodynamics, but most of these complaints are ill-founded. To the extent he was inspired by classical thermodynamics, classical thermodynamics is inspiring. (Aside: it is not at all correct to believe that just because pre-Samuelsonian economists mostly used geometry and post-Samuelsonian economists mostly used algebraic notation, one was less mathematical than the other. J.C. Maxwell's Theory of Heat was written almost entirely with geometry, and this example can be multiplied.)

Now on to the paper itself. This paper has multiple goals. One is a slick derivation of deep aspects of physics (especially thermodynamics) from a few qualitative empirical regularities. Another is to present the argument that one could - in an alternate universe - have completely missed the deep connections of thermodynamics to Newton's laws. I am going to play loose with language, sometimes saying "amount of heat" and other imperfectly defined things, in order to bring these ideas closer to everyday experience. If this confuses anyone, then I will do another version in which I am more careful (or they can read Samuelson's original).

(Aside 2: This idea, in a loose philosophical way, might be connected to the "microfoundations" debate economists sometimes have. If microfoundations can be seen to be like statistical mechanics, then macroeconomics is like thermodynamics. This paper would then be an example where a simple empirical regularity is all that is needed to establish deep economic laws, rather than an investigation of the deepest parts of the consumer's psyche. This argument is unimpressive, and Samuelson would never have dreamed of making it, but you can think about it if you like.)

Now, on to the deep parts of the paper. The main empirical principle of this paper is that if you put two bodies at different temperatures in contact, then they will equilibrate. A cold drink will turn warm on a hot summer afternoon. This principle is purely qualitative, but quantitative measures will fall out of it. Notation will simplify things. Our Principle is that "The Temperature of System 1 and the Temperature of System 2 go to the Equilibrium Temperature in System 1 and the Equilibrium Temperature in System 2." This is a mite cumbersome. Instead we say \( (t_1 ; t_2) \rightarrow (t_{eq} ; t_{eq}) \). We assume that \(t_{eq}\) is a function of the initial conditions.

Brief considerations as to the measurability of heat are given, but these are minor enough that the only important conclusion is that the final temperature is a function of the initial temperatures. This paper is written as an alternate history branching off after Carnot, so we aren't interested in chemical, gravitational, electromagnetic, etc. forces (more on aside 2: one of the many arguments against the above aside is that one couldn't do this without "microfounding" heat. Is this true? Discuss.). We consider the effects of heat by itself.

A thought experiment. Consider a bowl of hot soup kept in contact with a container of cool apple sauce. The bowl and the container are strong; they do not melt or flex because of the heat. They are kept in an insulating lunch bag, so that they don't lose heat to the environment. A sort of drawing of this situation:
What will happen? By our principle founded on common observation, the substances will come to the same temperature. Of course, there is more to consider than just the temperature. I drew the above as if the soup and apple sauce were in equal volumes, but if I had enormously more soup or enormously more apple sauce, then the one with enormously more volume would barely notice the change due to the other. There might be other dependencies, but Samuelson follows Carnot's wisdom that the main action can be captured considering only the interaction of volume and heat - and sometimes heat alone! Symbolically: $$(t_1,v_1 ; t_2,v_2) \rightarrow (t_{eq},v_1; t_{eq},v_2)$$ $$t_{eq} = f(t_1,v_1 ; t_2,v_2)$$ The first part is read "The Temperature and Volume of System 1 and the Temperature and Volume of System 2 go to the Equilibrium Temperature and Original Volume of System 1 and the Equilibrium Temperature and Original Volume of System 2." Obviously, this doesn't depend on how the apple sauce and the soup are oriented, since the bag is being thrown around all day anyway. This implies that \(f(t_1,v_1 ; t_2,v_2)= f(t_2,v_2; t_1,v_1 )\). Today, I will concentrate solely on what can be concluded from experiments of this type alone, but in Part 2 I will introduce a more complex experiment involving pistons.

These experiments can be compounded arbitrarily. For instance, we can put four substances together:
And the above reasoning still applies. The arrangement can be manipulated so that red and yellow equilibrate while orange and burgundy equilibrate, then those two are placed next to each other and the whole system is allowed to equilibrate. Alternately, the arrangement can be manipulated so that red and orange equilibrate while yellow and burgundy equilibrate, then those two are placed next to each other and the whole system is allowed to equilibrate. Either way, the system comes to the same temperature. It is easy to arrange the volumes (actually, specific volumes) to be the same. In this case we have the simple symbolic expression for the above: \(f(f(t_1,t_2);f(t_3,t_4))=f(f(t_1,t_3);f(t_2,t_4))\). It is read "The equilibrium temperature of the equilibrium temperature of the first two substances touching the equilibrium temperature of the second two substances is equal to the equilibrium temperature of the equilibrium temperature of the first and third substances touching the equilibrium temperature of the second and fourth substances." You can start to see why we invented this notation!
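To see the functional equation in action, here is a minimal sketch in Python (a language this post does not otherwise use) with one hypothetical equilibration rule, a volume-weighted average of temperatures. The rule is an illustrative assumption of mine, not something derived in Samuelson's paper; the point is only that a rule like it satisfies the symmetry and compounding identities above.

```python
# A toy equilibration rule: a volume-weighted average of temperatures.
# This particular functional form is my own illustrative assumption,
# not something taken from Samuelson's paper.

def f(t1, v1, t2, v2):
    """Hypothetical equilibrium temperature of two substances put in contact."""
    return (v1 * t1 + v2 * t2) / (v1 + v2)

# Symmetry: swapping the two systems changes nothing.
assert abs(f(50, 1, 10, 3) - f(10, 3, 50, 1)) < 1e-12

# Compounding with equal (specific) volumes, as in the four-substance setup:
# equilibrate (1,2) and (3,4) then combine, or (1,3) and (2,4) then combine.
t1, t2, t3, t4 = 90.0, 70.0, 30.0, 10.0
lhs = f(f(t1, 1, t2, 1), 2, f(t3, 1, t4, 1), 2)
rhs = f(f(t1, 1, t3, 1), 2, f(t2, 1, t4, 1), 2)
assert abs(lhs - rhs) < 1e-12
print(lhs, rhs)  # both 50.0: either route gives the same final temperature
```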

Now we add a couple of new assumptions. These assumptions are unobtrusive and intuitive, but they might be wrong and so must be stated. We have already extensively discussed the existence and symmetry of the function \(f \), which takes the system set-up and gives the equilibrium temperature. We also assume that the function \(f \) has the property that if two substances have the same temperature, then the equilibrium temperature is that temperature. (more on aside 2: if we were interested in "microfoundations" right now, then this would be a statement about the nature of an equilibrium - namely, that it is an equilibrium!) Otherwise, purely mental divisions could make physical changes. Finally, we assume that if you perform the same experiment but make one of the substances hotter beforehand, then you always get a hotter equilibrium. It is these properties of heat, perhaps, that led to the idea that heat was an independent substance!

These assumptions give a remarkable conclusion: the existence of a sort of energy function! Perhaps a better name would be a caloric function (this does not mean, of course, that there is any such physical substance!). A brief verbal argument can be made. I said above that the latter assumptions make heat seem like a substance, since if you add more of it, more comes out. The amount of that substance is the caloric. The equilibrium temperature is derived by averaging the amount of caloric in both substances (notice averaging, not summing; this becomes clearer when the mathematical argument is fully expounded). What is remarkable is the role of symmetry in the proof. If the function were not symmetric, we'd have no reason to think that the equilibrium temperature could be found by averaging the equilibrium temperatures of partitions. This fact is the foundation of the theory! By choosing reference points, one can convert this equation into a standard internal energy function. We have begun the battle of recovering classical thermodynamics from simple symmetry arguments.
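One compact way to write the conclusion (this is my reconstruction, with my own notation \(C\) for the caloric function; consult Samuelson for the precise conditions) is that there exists a monotone function \(C\) such that $$ C(t_{eq}) = \frac{v_1\,C(t_1) + v_2\,C(t_2)}{v_1 + v_2}. $$ In words: the total caloric \(v_1 C(t_1) + v_2 C(t_2)\) is unchanged by equilibration, while temperatures get averaged in caloric units. Choosing a reference temperature and a scale for \(C\) is what turns this into an ordinary internal energy function.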

Phew! I have covered much ground and only barely scratched the surface of this paper! I need to go through and show how this caloric function is found, give more examples, and there's an entire second experiment! These posts will come every Sunday. If there are particular aspects of this argument that interest you, drop me a comment. Before I go, however, there are a couple of things I would like to highlight. First of all, notice that we derived an energy function, but I didn't check anything about its form. For instance, I didn't make any assurances that it was always positive. It is for this reason that I prefer to call it a caloric function, even if I risk the misinterpretation that this somehow vindicates the physical concept of caloric. More importantly, I want to highlight that this argument makes no reference to Newton's or Schrodinger's laws. Physicists have a deep-rooted appreciation for the laws of thermodynamics, and it is arguments like these that give substance to those feelings. No matter what the universe is like, the laws of thermodynamics will apply as long as those fundamental symmetries are observed. This does not establish that the laws of thermodynamics are universal, but a non-physicist might wonder why they are believed to be, and arguments like this will help the intuition in that regard.

Finally, a social observation. The laws of thermodynamics are wildly misused in bad science, bad philosophy and even bad politics. I have seen irresponsible writers on the internet imply and argue that they make action on global warming impossible, often with naive importations into economics or politics. What I hope is that when reading this, you absorb some of the actual intuition of this science, rather than the slogans such people use. When someone claims an application of thermodynamics, check first to see if even first principles - such as these - apply. If it is not obvious how, then they are not obviously right. See you next week!

Friday, July 11, 2014

Thursday, July 10, 2014

Can I get LaTeX to Work In Blogger?

Linky.

Einstein's most famous equation is \( E=mc^2 \).

Edit: Rats! Round 2:
Linky.

Is $$\LaTeX$$ up yet? It is!
For our purposes, a random variable is just some list \(p_i\) that is always non-negative and sums to 1. Each index i represents a possible state, and \( p_i \) is the probability of being in that state. Statistical entropy is defined by the following equation:
 \[ S = -k_B \sum_i p_i \log(p_i) \]
where we assume \( 0 \cdot \log(0) = 0 \). This is a measure of how random, how uniform, a random variable - which is nothing but an indexed list of the probability of every possible outcome - is. It is a special measure: it is the only one that has certain desirable traits. A bachelor pad consists of a variety of locations for dishes. At first all the dishes are clean and put away. In fact, if all the dishes are in one location, then since \( p_{\text{that location}} = 1 \) and all other locations have \(p_i = 0 \), by our assumption and the fact that \(1 \cdot \log(1) = 0\), S = 0. All the dishes can be found trivially; they are all right there. As time passes and our bachelor fails to clean, dishes start traveling around the apartment. The probability of finding a dish in a strange location starts to rise. The entropy increases. I stress that this is not a metaphor: in this model of cleanliness there really is an entropy and it really does increase. We might expect as time goes on for the dishes to be moved around the whole of the apartment (and entropy to be therefore maximized), but this part of the example is metaphor. Unless the man is, even by bachelor standards, lazy and dirty, entropy could halt before being maximized.
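Here is a minimal sketch of that calculation in Python (the probabilities for the dish locations are numbers I made up for illustration):

```python
import math

def entropy(p, k_B=1.0):
    """Statistical entropy -k_B * sum_i p_i log(p_i), with 0*log(0) taken to be 0."""
    return -k_B * sum(p_i * math.log(p_i) for p_i in p if p_i > 0)

# All dishes put away in one location: entropy is exactly zero.
print(entropy([1.0, 0.0, 0.0, 0.0]))       # 0.0

# The bachelor lets dishes wander; the distribution spreads and entropy rises.
print(entropy([0.7, 0.2, 0.08, 0.02]))     # about 0.85
print(entropy([0.25, 0.25, 0.25, 0.25]))   # log(4), about 1.39: the maximum for four locations
```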
This is the plain meaning of statistical entropy. Unless special care is taken, a random process (in the above example, the man himself) will tend to spread things out. That spreading is the amount of entropy in the variable. This would be important enough even if it were a purely statistical object. One could measure, for instance, inequality in a way that has desirable properties: it can be decomposed into contributions from each region, so that one could search for where inequality really is in an unequal country (in the farms? in the cities? etc.), as in the sketch below. There is no connection between this index and thermodynamics. Statistical entropy is its own beast.
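For instance, here is a sketch using the Theil T index, an entropy-based inequality measure that I believe is the kind of decomposable index meant here (the regional incomes are invented for illustration):

```python
import math

def theil_T(incomes):
    """Theil T index: an entropy-based inequality measure (0 means perfect equality)."""
    n = len(incomes)
    mu = sum(incomes) / n
    return sum((y / mu) * math.log(y / mu) for y in incomes) / n

# Two hypothetical regions of an unequal country (incomes invented).
regions = {"farms": [10, 12, 11, 9], "cities": [20, 80, 35, 65]}
everyone = [y for region in regions.values() for y in region]
mu = sum(everyone) / len(everyone)

# Each region's income share weights its contribution.
within = sum((sum(r) / sum(everyone)) * theil_T(r) for r in regions.values())
between = sum((sum(r) / sum(everyone)) * math.log((sum(r) / len(r)) / mu)
              for r in regions.values())

print(theil_T(everyone))   # total inequality
print(within + between)    # the same number: the index decomposes exactly by region
print(within, between)     # so you can see where the inequality "is"
```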

Statistical entropy is named after a physical property, the entropy of a system.

In the splashes of a wave, one can see a lot of movement that doesn't really go anywhere. Physical entropy is a measure of how much energy goes into this. The concept of physical entropy was discovered in the context of studying steam engines. It was realized by an engineer/physicist named Sadi Carnot that in real machines there is always lost work. He developed a model of a machine that did not have such losses, and this is the most efficient possible machine (simple "proof": if it were not, use the supposedly more efficient machine to drive a Carnot machine in reverse; together they would move heat from cold to hot in a cycle with no net work spent, a form of perpetual motion, which is impossible).

The remarkable thing about statistical and physical entropy is that they turn out to be the same thing, or more correctly, physical entropy can be well modeled as an application of statistical entropy. There is confusion here: because this is by far the most important application of statistical entropy, many believe that the two concepts are the same, but again, statistical entropy can be and is used in contexts without any thermodynamic meaning whatsoever. In physical thermodynamics, we have every reason to expect the entropy of a system to increase over time - but in the non-physical applications of statistical entropy this must be proven all over again, and it holds only where it is in fact true. If the above income inequality index were subject to maximum entropy, then total income equality would never be more than a breath away. If the man's apartment were subject to it, he would be sleeping on knives and forks. These concepts are different and must be differentiated. That they are related makes this all the more important.

Relatedly, Paul Samuelson wrote a remarkable article on thermodynamics that I currently do not have time to post on. Maybe next time!
"I had a little friend, but he don't move no more."
Well, I had a little blog, but I hadn't posted in four years. I'm more successful now, but not much older, wiser or richer. I am now what I wanted to be then, or close to it. It wasn't hard to keep up a blog. There's plenty to say, plenty to link. Nobody cares if you bore 'em a little, with modern attention spans we don't know the difference between bored and excited. The truth is, I just stopped caring and moved on.

I'll be posting every day, at least for a little while, and for a generous definition of "posting". If you want the future news, know that I am planning posts about Disney princesses, posts about various books I've read (a lot of them!), TV shows I've watched, things I've experienced, and of course numerous posts about things I want to pretend to read, watch and experience (spot the difference and win a prize!).

It's bad taste to write a post that is nothing but promises (and threats) of more posts. I'll try and write something of interest ...

...

Something...

I've recently purchased a wonderful book by Leonard Savage and Lester Dubins called How To Gamble If You Must. In a very short summary, it's about optimal strategies for playing various lotteries, called "casinos". If a casino is unfair, it's often best to play as boldly as they let you, to try to get out from under the law of averages. This kind of thinking has led to much more work in optimal planning under uncertainty that I'm pretty ignorant of. However, seeing as Savage is a famous Bayesian, I knew something would bother me, and it did. That thing is "finite additivity". What this means is that Savage allows for a wider range of probability assignments than we do. A typical argument allegedly in its favor is this: if you pick a random number from 1 through infinity, and I try to guess what you picked, then it supposedly makes sense to believe in some sort of flat distribution over the integers. This is impossible with the kinds of probability I would grant you, but not with the looser rules that Savage and Dubins play by. Needless to say, I find this wild. For those strong in math, here is a good explanation why. If you can't follow it, but for some reason find this interesting (and the debate is actually much more technical than even that post), ask and I will write a full post on my thoughts about the foundations of probability theory.
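To make "play as boldly as they let you" concrete, here is a minimal Monte Carlo sketch in Python of a subfair even-money game. All the numbers are my own toy choices, and the Dubins-Savage theorem is far more general than this simulation; the sketch only illustrates the flavor of the result.

```python
import random

def play(fortune, goal, p_win, bet_size):
    """Repeated even-money bets with win probability p_win; True if the goal is reached."""
    while 0 < fortune < goal:
        stake = bet_size(fortune, goal)
        fortune += stake if random.random() < p_win else -stake
    return fortune >= goal

bold  = lambda f, goal: min(f, goal - f)   # stake everything usable toward the goal
timid = lambda f, goal: 1                  # always bet one unit

random.seed(0)
trials, p = 20_000, 0.45                   # subfair: the house has the edge
for name, strategy in [("bold", bold), ("timid", timid)]:
    wins = sum(play(10, 100, p, strategy) for _ in range(trials))
    print(name, wins / trials)
# Bold play reaches 100 from 10 a few percent of the time; one-unit bets essentially never do.
```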

Space Dandy is one of the few Japanese cartoons I've seen that looks like it was just fun to work on. The general director obviously let his artists, animators, etc. express themselves, but in a way that makes sense as a whole work (more controlled than, say, Ralph Bakshi) - and it's a comedy! Definitely check it out if you like brilliant, tightly controlled childishness and chaos. Episode 7, the race episode, is probably the best introduction to the series so far.

I've had a few stray thoughts and observations on economics recently. Instead of keeping them to myself, I'll bother you with them.

A good deal of modern capital is intellectual property. If we taxed patents, trademarks, etc., wouldn't that get rid of patent trolls? Wouldn't it encourage businesses to keep the laws sane, to keep their taxes down? These laws exist to give out rents, so why would the taxes devolve on the consumer? Wouldn't it be much easier and more textbook to control the level of the rent reward through a tax than by limiting the length of the IP? Couldn't we get rid of this? Is there any downside to this idea?

Sun Tzu clearly understood the concept of iceberg transport costs. Section 2, part 15: "Hence a wise general makes a point of foraging on the enemy. One cartload of the enemy's provisions is equivalent to twenty of one's own, and likewise a single picul of his provender is equivalent to twenty from one's own store." This makes a rate of 1/20 = 0.05. Sun Tzu's other thinking about transport costs in war is also in line with this model. Is economic geography an important part of military economics? Particularly, the concept of "power projection" seems like it could easily succumb to an iceberg model.
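For the record, the arithmetic behind that rate, in the iceberg formulation (my gloss, not Sun Tzu's): if only a fraction \( \delta \) of what is shipped actually arrives, then delivering one unit requires shipping \( 1/\delta \) units, and Sun Tzu's twenty-to-one equivalence gives $$ \frac{1}{\delta} = 20 \quad \Rightarrow \quad \delta = \frac{1}{20} = 0.05. $$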

To Be Continued! See You Soon!