Markov Chains


I thought I’d try to explain what a Markov Chain is here.  Hopefully without too much math.  Anyone reading this, please correct me on anything I say here that isn’t right.  It’s been a while since I did this in school.


Basically, a Markov chain is a model of a process that has a certain number of “states” it can be in, and changes from state to state at discrete intervals (every dice roll, or every day, or every at-bat) with certain fixed probabilities.


For instance, consider moving around a Monopoly board (ignoring jail and doubles and Chance cards and such).  There are 36 squares on the board.  These can be the “states” – call them state 0 to state 35. 


To get from “Go” to “Baltic Avenue”, you have to roll a 3.  The chance of that happening is 1 in 18, and so the transition probability from Go (state 0) to Baltic Avenue (state 3) is .0555.  With 36 states, there are 36 times 36 = 1296 different transition probabilities.  Most of these are zero – it’s impossible to move from “Go” to “Boardwalk” in one roll. 


That’s not too hard, to write down all 1,296 probabilities.  But it gets complicated when there’s more than one dice roll.  What is the chance of transitioning from Go to Marvin Gardens (29 squares) in three rolls?  Well, you could roll 10-10-9, or you could roll 9-10-10, or you could roll 12-6-11 … and once you’ve figured out all the combinations, you could add up all the probabilities to get your answer.  But there are a lot of combinations, and this is only three rolls – what happens when you want to figure it out for 10 rolls, or 20, or 1,000? 


As it turns out, if you have a computer and the first set of 1,296 probabilities, it’s easy to figure out the probabilities for 10 rolls, just by writing the 1,296 probabilities in a matrix, and multiplying that matrix by itself 10 times (a linear algebra thing that computers can do quickly).  So if you want to know what the probability is of winding up on Park Place after exactly 87 rolls, that can actually be calculated pretty easily.


The concept extends pretty easily to baseball.  Suppose you’re wondering how many runs the Seattle Mariners would score in a game if you put together a batting order alphabetically (so Beltre bats first and [Ichiro] Suzuki bats last). 


To do that, you start by defining the states.  There are going to be a lot more than 36 of them, though.  You’ve got combinations of inning (9 possibilities), runners on base (8), how many outs (3), and who’s batting (9 batting slots).  That’s 9 times 8 times 3 times 9, or 1944 states.  Multiply that by itself, and you get almost 4 million probabilities.  (Or more -- you might need to consider the number of runs scored, too.)


Most of those probabilities are zero – you can’t go from two outs in the fifth to one out in the seventh in one plate appearance.  The others, you can calculate using reasonable methods.  If the state is “third inning, one out, runner on first and second, Ibanez at bat,” what’s the transition probability to “third inning, one out, nobody on, Lopez at bat”?  It’s pretty much just the probability of Ibanez hitting a home run, which is maybe 20/685, going by his 2005 stats.  Basically, from any state, you can only get to a couple of handfuls of other states, and the probabilities aren’t that hard to figure out. 


You do have to make a few assumptions.  For instance, suppose the chance that Ibanez hits a double is 5%.  But does the runner on first score?  If you figure the runner on first will score half the time, you can give a 2.5% chance of transitioning to “third inning, one out, runners on second and third, Lopez at bat” and the other 2.5% to “third inning, one out, runner on second, Lopez at bat.”  But the point is that you only have to worry about the single plate appearance, not the whole game.  The matrix multiplication takes care of figuring the game.


That is, the Markov Chain technique says, that (a) if you can come up with a bunch of states; and (b) if you can figure the probabilties of any possible single change of state; then (c) a good computer can tell you all kinds of stuff about what happens over a whole bunch of state changes instead of just one.


And for the lineup question, (a) we know all the states, which consist of inning/outs/bases/batter; (b) we can make reasonable assumptions, like an APBA game, of what all the single probabilities are for an at-bat; which means that (c) the computer can tell us the probabilities of what happens over a whole game.


The master of applying the Markov technique is Mark Pankin.  Mark has published numerous studies on the topic, and the Philadelphia Daily News recently ran an article on his recommendations for the Phillies batting order (including rebuttal comments from Manager Charlie Manuel).


By the way, I don’t think Mark actually uses the “4 million transition probabilities” method – I think he found a way to simplify it into something more manageable.