Feb 21, 2004

Five years ago, after the 1999 season, a fellow fantasy league baseball owner and I fell into an argument about Roger Clemens. Clemens was 37 years old. In 1998 he had a brilliant season with Toronto, winning the pitching triple crown — ERA, wins, and strikeouts — and his fifth Cy Young Award. In 1999, his first year with the Yankees, he slipped considerably, finishing 14-10 with an ERA higher than league average for the only time since his rookie season. His walks and hits were up, his strikeouts were down, and my friend was sure he was washed. He argued that Clemens had thrown a tremendous number of innings, that old pitchers rarely rebound from a bad season, and that loss of control, in particular, is a sign of decline. I argued that Clemens is a classic power pitcher, a type that tends to hold up very well, that his strikeout ratio was still very high, that his walks weren’t up all that much, and that his diminished effectiveness was largely traceable to giving up more hits, which is mostly luck.

Of course Clemens rebounded vigorously in 2000 and won yet another Cy Young in 2001. He turned out not to be finished by a long shot, and still isn’t. Does this mean I won the argument? It does not. Had Clemens hurt his arm in 2000 and retired, would my friend have won the argument? He would not.

Chamberlain wasn’t wrong about “peace in our time” in 1938 because the history books tell us Hitler overran Europe anyway. He was wrong because his judgment of Hitler’s character, based on the available information in 1938, was foolish; because, to put it in probabilistic terms, he assigned a high probability to an event — Hitler settling for Czechoslovakia — that was in reality close to an engineering zero. He would still have been wrong if Hitler had decided to postpone the war for several years or not to fight it at all.

“Time will tell who’s right” is a staple of the barroom pedant. Of course it will do no such thing: time is deaf, blind, and especially, mute. Yet it is given voice on blogs all the time; here’s Richard Bennett in Radley Balko’s comments section: “Regarding the Iraq War, your position was what it was and history will be the judge.” It’s not an especially egregious instance, just one I happened to notice.

Now you can take this too far. If your best-laid predictions consistently fail to materialize, perhaps your analyses are not so shrewd as you think they are. You might just be missing something. Or not. But this should be an opportunity for reflection, not for keeping score.

We fumble in the twilight, arguing about an uncertain future with incomplete knowledge. Arguments over the future are simply differences over what Bayesian probability to assign the event. There is a respectable opposing school, frequentism, which holds that Bayesian probability does not exist, and that it makes no sense to speak of probabilities of unique events; but it has lost ground steadily for the last fifty years, and if it is right then most of us spend a great deal of time talking about nothing at all. Like Lord Keynes, one of the earliest of the Bayesian theorists, we are all Bayesians now.

This, for argument, is good news and bad news. The good news is that history won’t prove your opponent out. The bad news is that it won’t prove you out either. You thrash your differences out now or not at all. Then how do you know who won the argument? You don’t. Argument scores like gymnastics or diving, not football. It will never, for this reason, be a very popular American indoor sport.

  16 Responses to “Time Won’t Tell”

  1. Nice post Aaron, very Emersonian, in fact!

    As in:

    "God delights to isolate us every day, and hide from us the past and the future. We would look about us, but with grand politeness he draws down before us an impenetrable screen of purest sky, and another behind us of purest sky. `You will not remember,’ he seems to say, `and you will not expect.’ All good conversation, manners, and action, come from a spontaneity which forgets usages, and makes the moment great."

    Would you agree with that passage (if it could be purged of its "instinctual" connotations, that is…)?

    And of course you were right about Clemens–as Bill James wrote, in just about every issue of the Baseball Abstract, always bet on a power pitcher!


  2. No. Living in the present is fine as far as it goes, but to claim that all good action comes of disregarding the future is a typical piece of Emersonian fatuity.

    Sure I was right about Clemens. Just not because he’s still pitching.

  3. Just great Aaron! I’d like to see either school of thought try to put a number on what godofthemachine is going to come up with next. To me, it’s all fortuna rota with, as you say, no winners of the argument. If Trot Nixon goes back on Jeter’s line drive instead of up a step and, when he recovers, turns the right way instead of the wrong way and makes a play on a very catchable line drive, do they go on to hold that three run lead, and is Grady Little still manager, which could have allowed the A’s to keep Keith Foulke? In the World Series of Poker the last two years the wheel of fortune just ran over the expert players. I’m sure some terrible ‘all in’ or call or fold was made or not made that started with some trivial distraction hours earlier.

  4. Yes, Aaron, you were right about Clemens. How do we know? Because he went on to pitch well. Time, at least in this case, did tell. If he had had a few more declining seasons and then retired, time would have shown that your friend was right. If Clemens had hurt his arm, or been run over by a truck, then time wouldn’t have settled the argument.

    Time often, but not necessarily, tells. To use your own example, Chamberlain was wrong about Hitler’s intentions. How do we know? Because Hitler didn’t stop with Czechoslovakia. If Hitler had been run over by a truck, we could still argue about his intentions, but since unfortunately he wasn’t, we know for sure.

    The Bayesian argument is not about whether to assign a probability to unique future events, since EVERY event is in some sense unique, but rather what weight, if any, to give to expert opinion. Everyone, even frequentists (nice coinage, yours?), agrees we can assign the probability .5 to the event of the northernmost of two radium atoms decaying first, even though that event will be unique: THAT atom has never decayed in that way before and never will again. The same goes for whether a flipped coin will land heads, although there are more contingent circumstances that might affect the outcome than in the atomic decay case. A horse race is even more contingent, and an election, at least a fair one, more contingent still. In these cases I, and most others, still believe we can assign a probability, and, with less agreement, factor in expert opinion.

    The other end of the scale is the stock market. Both fundamentalists and technicians are frequentists, and both have yet to demonstrate that they have anything to say about what the market will do. Give me expert opinion, like Martha Stewart, any day.

  5. great scoring analogy.

  6. The more I read your blog, the more I like your father, Aaron.

  7. Well Dad, I’m going to have to answer this now that you have your own fan club here and everything. Besides, the family that Bayes together, stays together. The term "frequentist," incidentally, has been around at least since the days of R.A. Fisher.

    I suppose I got what I deserved for trying to explain Bayesian probability in a short paragraph. It is true, of course, that every event is unique in some sense. However, certain events, like radium atoms decaying or coins being flipped, are repeatable in their significant aspects, and others, like my death, are not. They are more or less contingent, as you put it. The frequentists deny that probability has any meaning in speaking of highly contingent events. It is not simply a matter of what weight to give expert opinion.

    If you are really a Bayesian, like me, then you believe that some highly contingent event, like, say, Iraq’s becoming a functioning democracy in the next decade, can be theoretically assigned a probability. Naturally Bayesians disagree over how to assign it, but my question remains: if you assign a 10% probability to such an event, and it comes to pass, how does that prove you wrong?

  8. You’re right, there is no way that future events can prove or falsify an assigned probability of other than 0 or 1. When was the last time you saw a weather report predicting a 100% chance of rain?

  9. Bayesian probability isn’t really about "expert opinions" as such. The Bayesian view is that a statement of probability is a statement about our confidence in a proposition, given the information at our disposal. That information may include "expert opinions", as well as a host of other things, including, notably, our own prior estimate of the probability before any evidence was introduced.

    The frequentist view, on the other hand, is that a statement of probability is a statement about the expectation value of a random variable. That is, if I repeat an experiment many times, the fraction of times a certain outcome appears will converge to a definite value.

    To see the difference between the two views, consider some particular statements of probability:

    1) The probability that a particle will decay within time T is 1/2. The frequentist says, "If I have a bunch of these particles, after time T (about) half of them will have decayed." The Bayesian says, "For any one of these particles, my confidence that it will decay within time T is 1/2. Therefore, if I have a whole bunch of them, I can expect half of them to have decayed after time T." In this case the Bayesian and frequentist don’t have much to disagree about, and the whole argument just seems like so much hair-splitting.

    2) The mass of the particle is x +/- y. The Bayesian’s statement reads much the same as before: "My confidence that the mass of the particle is between x-y and x+y is 95% (or whatever confidence interval he is using)." The particle has one and only one mass; moreover, in an ensemble of identical particles the particles will all have identical masses, so the frequentist is forced to resort to an ensemble of measurements: "If I make a bunch of measurements of the mass of this particle, 95% of them will be between x-y and x+y."

    This view is basically serviceable, but it suffers from a few defects. First, if the measurement is unrepeatable, then one has to appeal to a hypothetical ensemble of measurements that could have been made, but happened not to be. Second, when we quote an error bar, what we really want is something to tell us how good the measurements are; however, this may have little to do with the distribution of repeated measurements. Outliers can occur far more frequently than the error bar says they should, and crude equipment can produce measurements that are too consistent and hence useless for estimating uncertainty (this latter is the bane of first-year physics students everywhere). The Bayesian view explicitly names the probability as a statement of confidence, which is how we were thinking about it all along.

    3) I will die before the end of the year. Here the frequentist has to throw up his hands in despair. His only recourse is to some sort of ensemble of worlds, in some of which I die, and others I do not, and even that is not very convincing. The Bayesian, on the other hand, is on firmer ground because given whatever evidence he has about me he is quite capable of forming an opinion about the probability of my death. Exactly how that evidence is to be distilled into a number may be a matter of some debate, but at least it makes sense to talk about a number coming out of the process; whereas, the frequentist first has to convince me that the thing to which he is trying to assign a number even exists at all.
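
    The frequentist reading of example 1) can be checked by simulation. The following is a minimal sketch, not anything from the discussion above: the function name `fraction_decayed`, the fixed seed, and the choice to model each particle as an independent coin flip with probability 1/2 are all assumptions made for illustration.

```python
import random


def fraction_decayed(n_particles, p_decay=0.5, seed=0):
    """Simulate n_particles independent particles (hypothetical model).

    Each particle decays within time T with probability p_decay.
    Returns the fraction that decayed.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same answer
    decayed = sum(1 for _ in range(n_particles) if rng.random() < p_decay)
    return decayed / n_particles


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, fraction_decayed(n))
```

    With a large ensemble the fraction settles near 1/2, which is the frequentist statement. Example 3) has no analogue here, since there is no ensemble to simulate for a one-off event.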

    Aaron’s view that "Time won’t tell" whether a Bayesian probability is right or wrong is a little strange, but I think it is correct, at least to some extent. A Bayesian probability is fundamentally a statement of opinion, and opinions are never right or wrong; they are merely convincing or unconvincing. Nevertheless, observed events do constitute evidence for or against a hypothesis, and a true Bayesian will alter his probabilities (that is to say, his opinions) as new evidence comes to light.

    Bayesian probabilities are, therefore, in some sense subjective. Two Bayesians with different information will assign different probabilities to the same event. And since that information includes the notorious "prior probabilities", critics of Bayesian probability theory charge, not without justification, that you can come up with any number you want, if you start with the right prior. I don’t see that as a problem, though, because on some level it reflects a truism in human behavior; people sufficiently convinced of a proposition at the beginning of an argument will not be convinced by any amount of evidence to the contrary. Bayesian probability isn’t so much about assigning numbers to propositions as it is about codifying probabilistic inference. If I believe something with confidence x, and I am presented something that by itself would give me a confidence of y, then what should my new confidence (i.e. the new probability) be after considering all of the evidence? This is a well-defined and potentially useful question to ask, even if x and y are just WAGs.
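
    The updating question at the end of the comment above can be sketched in a few lines, under assumptions the comment does not state. The sketch assumes that the confidence y was formed from a neutral 50/50 starting point and that the new evidence is independent of whatever produced x; under those assumptions the odds simply multiply. The name `combine_confidences` is hypothetical.

```python
def combine_confidences(x, y):
    """Combine a prior confidence x with evidence that alone would give y.

    Assumptions (illustrative, not from the original discussion):
      * y was reached from a neutral 50/50 baseline, so y / (1 - y)
        acts as the likelihood ratio of the new evidence;
      * the new evidence is independent of what produced x.
    """
    if not (0.0 < x < 1.0 and 0.0 < y < 1.0):
        raise ValueError("confidences must be strictly between 0 and 1")
    prior_odds = x / (1.0 - x)
    evidence_odds = y / (1.0 - y)
    posterior_odds = prior_odds * evidence_odds
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    print(combine_confidences(0.9, 0.9))  # two agreeing sources reinforce each other
    print(combine_confidences(0.9, 0.1))  # equal and opposite evidence cancels
```

    Under these assumptions two confidences of 0.9 combine to about 0.988, and 0.9 against 0.1 comes out at even odds. The arithmetic works the same whether x and y are careful estimates or rough guesses; it only tells you how they should be combined.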


  10. rpl:

    Opinions, just like statements of fact, are always "right," "wrong" or presently impossible to determine. The only way to be "convincing" with opinion is demonstrate its rightness or wrongness (or our present inability to determine.) Probabilities are always determined in a context of relative ignorance–otherwise we’d have a probability of 1 or 0. Statements of present fact can have such a probability, but so can opinions…

  11. Jim,

    If I understand you correctly, you are saying that in many cases there is some truth, unknowable though it may be to us, against which opinions can be judged "right" or "wrong". I suppose I would concede that, but in cases where truth is truly unknowable that concept is not very useful. When it comes time to make a decision about what you believe, all you have to guide you is the evidence, which you will weigh and find either convincing or unconvincing.

    The point is, in the case of statements of probabilities about nonrepeatable events, the posterior evidence from observing the one and only occurrence of the event will never be enough to reject the original probability assessment with high confidence. In those cases the outcome of the event is almost irrelevant, and the strength (or lack thereof) of the prior evidence is all you have to go on.
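
    A small worked example shows how little one outcome moves the needle. This is a sketch under stated assumptions: two hypothetical forecasters, A and B, assign different probabilities to the same one-off event, we start with no preference between them (`prior_a=0.5`), and we ask how much the single observed outcome should shift our confidence that A's assessment was the better one. The function name and the specific numbers are illustrative only.

```python
def posterior_for_forecaster_a(p_a, p_b, occurred, prior_a=0.5):
    """Posterior weight on forecaster A after one observed outcome.

    p_a, p_b : probabilities A and B assigned to the event
               (assumed strictly between 0 and 1)
    occurred : True if the event happened, False otherwise
    prior_a  : our confidence in A before seeing the outcome
    """
    like_a = p_a if occurred else 1.0 - p_a
    like_b = p_b if occurred else 1.0 - p_b
    weighted_a = prior_a * like_a
    weighted_b = (1.0 - prior_a) * like_b
    return weighted_a / (weighted_a + weighted_b)


if __name__ == "__main__":
    # A said 10%, B said 30%, and the event happened.
    print(posterior_for_forecaster_a(0.10, 0.30, occurred=True))
```

    In this example the outcome favors B by 3 to 1, leaving a weight of 0.25 on A. That is a nudge, not a refutation: for forecasts this far from 0 and 1, a single observation cannot settle which assessment was sounder.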


  12. No, "truth" is precisely and only known reality. Nothing else qualifies. Without a class of certain knowledge, there can be no class of "probable" or "possible." I agree that "all we have is the evidence" to guide us, but that evidence provides us with a great deal about reality and has enormous consequence: moon trips, electricity, vaccines, etc., and growing…We apprehend "truth" just fine in the ordinary sense of the words.

  13. An admirable piece of sophistry. Especially the Hitler example. Chamberlain was wrong about Hitler of course. But your judgement that Hitler was bound to wage war is justified only in hindsight. It is impossible to enter the Summer-of-1939 mind from this distance, knowing what we know of what came after. Facts that confirm a view vindicated by events always appear obvious in hindsight; but they are less obvious in the uncertain present. Hitler is a case in point. It is impossible to see the 1939-Hitler today as anything but a monster-in-waiting. But few saw it that way at the time. Recall that even George "the power of facing unpleasant facts" Orwell believed into the late 30’s that Hitler was an overbearing, but essentially harmless, blowhard.

    I understand your point that history does not necessarily prove anything. But it is also true that judgements of historical persons and events are tainted by our knowledge of what came after. Churchill can claim to have been right about Hitler, in other words, but you can’t — not without acknowledging events after Sept 1939.

    Regarding Roger Clemens, the rightness of your prediction — if one wants to be at all scientific about it — cannot be judged without reference to his subsequent performance, because a theory (power pitchers last longer, in this case) is never actually proved or disproved, only increasingly certain or uncertain. Your judgment that Clemens would last is reasonable based on an interpretation of accumulated evidence, but it is not right — only statistically likely according to the theory that power pitchers last longer than other pitchers.

    If Clemens had tanked you say you would have been right anyway. Well, maybe. But you might have been wrong too. It might have been that your theory was incomplete (perhaps only power-pitchers with good change-ups last longer), but that you explained away evidence that should have told you this.

    You might, in other words, have pulled a Neville.

  14. It is true that we can only imperfectly read our way back to the summer of 1939. But if we couldn’t read our way back at all, as you seem to be saying, then history would be impossible to write. Evidence that Hitler was more than a harmless blowhard had accumulated rapidly: Mein Kampf, Kristallnacht, the Reichstag fire, and the annexation of the Sudetenland, just off the top of my head. I don’t claim that I would have been right about Hitler, only that Churchill was and Chamberlain wasn’t, not because Hitler actually did start the war, but because the evidence pointed in that direction.

    In the same way I don’t claim I was right about Clemens because he pitched well after 1999. I claim I was right because my theory applied better to the available evidence than my friend’s theory did. My theory may be proved wrong in the future, as you say. Unfortunately I doubt we will see enough pitchers in Clemens’ class in my lifetime to find out.

  15. Let’s try not to be too literal in our reading of the blogs. The phrase "time will tell" means "we don’t have enough information right now to evaluate your position, but I’ll get back to you when we do."

    It strikes me that this is a fairly reasonable way to pause certain debates when they’ve become bogged down in overheated rhetoric, or to underscore the fact that the party who’s demonstrably lost the debate refuses to concede.

    I’ll continue to employ the phrase, if you don’t mind.

  16. I don’t mind the phrase itself, or its variants, all that much. What I mind is the distressingly common belief that time actually will tell, or history actually will judge.

    To pause certain debates when they’ve become bogged down in overheated rhetoric I generally prefer silence.
