# Re-reinventing the wheel

A little while ago, I wrote about a more sophisticated method of evaluating in-game strategy.  In a nutshell, I want to move beyond using the well-known expected runs matrix (XRM) as a method of decision making.  Simply determining the number of runs a team is likely to score doesn't paint the whole picture.  Sure, in the first inning, you want as many runs as possible.  But in baseball's endgame, you don't always need a lot of runs.  Sometimes you just need one or two runs to put a game away.  In other words, Win Expectancy is the variable you want to maximize, not XRM.

In a previous article, I showed (by way of example) that using Win Expectancy as a utility function can lead to situations where a strategy that decreases XRM can actually increase WinExp.  These are, of course, isolated, high-leverage game situations, but the lesson is that sometimes it makes sense to give up an out to advance a runner.  (Somewhere, Ozzie Guillen and Phil Garner are hyperventilating.)

This may sound like heresy to BtB readers given the SABR-heavy bent of the site, but the lesson is an important one: the use of numbers to analyze baseball will sometimes give unexpected results.  And sometimes, chicken soup really is good for the flu.

Anyway, that's enough prelude.  I'm going to use this little toy to analyze something that has bothered me for a long time: the runner on first in the bottom of the ninth.  Nothing is more aggravating than watching the tying or winning run sitting on first base while the closer K's the next two batters and gets the last one on a popout.  Well, actually, it is more aggravating when the runner is nailed trying to steal second base.

Performance analysts often say that the breakeven stolen base success rate is somewhere between 70% and 80%.  But as Gary Huckabay said a few weeks ago:

> The "break even" point is something of a fallacy anyway. Unless you're dealing with the specific probabilities of victory in a given game, it's a waste of time. If someone stole 22 bases in 30 attempts, you don't know squat about the actual value of that unless you know the specific instances of each attempt, and each action's impact on P(Win). It's a meaningless and futile exercise when looking at an aggregate stat line.... You can control when you attempt a steal, as opposed to HR, doubles, singles, etc, which you can't control.

Huckabay, I couldn't have said it better myself.

I used the method described above and in my prior article to look at this situation the way Gary would want me to.  Here's the situation: runner on first, bottom of the ninth.  You're the manager - do you let the runner attempt to steal or not?  In other words, what is the breakeven stolen base success rate for the runner in this specific situation?  Here's an example calculation:

| Scenario | Inning | Outs | Base situation | Deficit | WinExp |
|---|---|---|---|---|---|
| Initial situation | 9 | 0 | On first | 1 run | 31.5% |
| Outcome 1: Stolen base (50% probability) | 9 | 0 | On second | 1 run | 41.2% |
| Outcome 2: Gunned down (50% probability) | 9 | 1 | Empty | 1 run | 10.4% |

Expected value: 25.8%

Data are from walkoffbalk.com and baseballprospectus.com.  The expected value is the probability-weighted sum of the WinExp of the two outcomes.  In this example, a 50% stolen base success rate results in a net decrease in WinExp of 5.7 percentage points (from 31.5% to 25.8%).  If, however, the stolen base attempt succeeds 68.5% of the time, the WinExp before and after the attempt will be the same.  That's the breakeven point.
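For anyone who wants to check the arithmetic without Excel, here is a minimal Python sketch of the calculation above. The function names (`expected_win_exp`, `breakeven_rate`) are my own labels for illustration, and the only inputs are the three WinExp values from the example; the breakeven formula simply solves p × WE(safe) + (1 − p) × WE(caught) = WE(now) for p.

```python
def expected_win_exp(p_success, we_safe, we_caught):
    """Probability-weighted WinExp after a steal attempt."""
    return p_success * we_safe + (1 - p_success) * we_caught


def breakeven_rate(we_now, we_safe, we_caught):
    """Success rate at which attempting the steal leaves WinExp unchanged.

    Solves p * we_safe + (1 - p) * we_caught = we_now for p.
    """
    return (we_now - we_caught) / (we_safe - we_caught)


# WinExp values from the example: bottom of the ninth, down by one run.
WE_NOW = 0.315     # runner on first, nobody out
WE_SAFE = 0.412    # runner on second, nobody out
WE_CAUGHT = 0.104  # bases empty, one out

ev = expected_win_exp(0.50, WE_SAFE, WE_CAUGHT)
print(f"Expected WinExp at 50% success: {ev:.1%}")
print(f"Change vs. holding the runner:  {ev - WE_NOW:+.1%}")
print(f"Breakeven success rate:         {breakeven_rate(WE_NOW, WE_SAFE, WE_CAUGHT):.1%}")
```

Swapping in the WinExp values for a different inning, out count, or deficit gives the breakeven rate for that situation.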

The point of this exercise is to investigate what the breakeven stolen base success rate is in various permutations of this situation.  I repeated this calculation several times, and the results are presented in my favorite form (graphical) below:

What have we learned?  The fewer outs you have left, the less it makes sense to risk swiping second base, since the breakeven rate in all three data sets (tie game, down by one run, down by two runs) increases monotonically with the number of outs already recorded.  You will sometimes see managers give a runner the green light with two outs, but how often do you see it in the ninth inning?  I don't remember seeing it very often - and if you do see it, it's the wrong decision.

If you're down by two runs, you'd better be damn sure that the runner gets a good jump, since you need an 87% success rate just to break even, even with nobody out.  If there are two outs, the breakeven stolen base success rate is an astounding 98%!  At the other extreme, the breakeven point is only 60% in a tie game with nobody out.  This is well below the gross-sabermetric-average breakeven rate of 70%-80%.

The results are surprising for the "Down by 1 run" data set; the breakeven point rises only marginally as the number of outs increases.  One can explain the situation for one out by considering the possibility of a double play ending the game.  In a tie game, the double play is not as bad as you might think, since the home team still has the opportunity to bat again in extra innings.  The trend for two outs is harder to explain.  Honestly, I expected the two-out breakeven point to converge to a very high value, as it did in the tie game and down-by-two scenarios.  My explanation is that moving the runner to second is far more important when down by one run because a single - the most common type of hit - will often score the runner and tie the game.  When down by two runs, plating one run is not as important, and in a tie game there is less need to gamble because the home team always has the opportunity to bat again.

As I said in my previous article, there are limitations to this approach, based on right/left splits, quality of pitcher/batter, etc.  As with all statistical reasoning, there is a delicate balance between the specificity of the situation and the sample size.  But conceptually, I believe the ideas here are the future of in-game decision making.  One modification I plan to make is to analyze scenarios by considering consecutive events.  This will help take into account the quality of the batter, since it is easy to imagine that the decision to send the runner will be different when Lance Berkman is at the plate as compared to Brad Ausmus.  It's something that destroys the generality of the situation but increases the accuracy of the results.  As it stands, however, I think we have a greater understanding of how to evaluate the stolen base breakeven rate in the context of a specific situation.

If there's interest, and you can let me know in the comments, I'll start developing a program that will allow anybody to do this type of analysis without wrestling with Excel.

Please be sure to read my ranting and raving in the post directly below this one.