Stolen base prevalence has waxed and waned over time. Baseball-Reference data shows the incidence of steal attempts from 1950-2013:
This chart shows the number of times teams either stole a base or were caught stealing as a percentage of opportunities, which B-R defines as "plate appearances through which a runner was on first or second with the next base open." In 2013 teams attempted to steal a base 5.5% of the time, a very significant drop from what appeared to be a brief stolen base renaissance in 2011-2012.
Smarter people than me can explain why a success rate of around 70 percent is necessary to justify stolen base attempts. "The Book" by Tom Tango, Mitchel Lichtman and Andrew Dolphin (available on Kindle or iTunes) provides an explanation (Chapter 11: Running Wild), but the math isn't nearly as important as the logic. Tango's Run Expectancy Matrix shows two possibilities for a runner on first and zero outs (all numbers taken from the second table on the page):
Runner on | 0 Outs | 1 Out | Difference |
---|---|---|---|
1st | 44.1% | 17.2% | -26.9% |
2nd | 63.7% | | |
Difference | 19.6% | | |
With a runner on first, stealing second increases the odds of scoring a run by 19.6 percentage points. If he's thrown out, the odds of scoring a run decrease by 26.9 percentage points. This is entirely logical--a person can't steal a base when he's not on base, and more outs make it more difficult to score. Leaving aside all the factors that go into stealing a base (score of game, inning, pitcher, pitcher handedness, whether it's Paul Konerko or Michael Bourn on base, catcher, and more), it's a very clear distinction--the cost of being caught stealing exceeds the benefit of stealing.
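The trade-off can be put into a few lines of arithmetic. This is a minimal sketch that uses only the three scoring chances from the table above; the function and variable names are my own, and it is not the calculation from "The Book." It finds the success rate at which the expected gain from an attempt equals the expected loss.

```python
# Chance of scoring at least one run, in percent, copied from the table above.
P_FIRST_0_OUT = 44.1   # runner stays on first, no outs
P_SECOND_0_OUT = 63.7  # runner on second after a successful steal, no outs
P_AFTER_CAUGHT = 17.2  # runner thrown out, one out


def expected_change(success_rate, stay, success, caught):
    """Expected change in scoring chance (percentage points) from one attempt."""
    gain = success - stay   # benefit when the steal works
    loss = stay - caught    # cost when the runner is thrown out
    return success_rate * gain - (1 - success_rate) * loss


def break_even_success_rate(stay, success, caught):
    """Success rate at which expected_change() is zero."""
    gain = success - stay
    loss = stay - caught
    return loss / (gain + loss)


rate = break_even_success_rate(P_FIRST_0_OUT, P_SECOND_0_OUT, P_AFTER_CAUGHT)
print(f"Break-even success rate: {rate:.1%}")
```

With these three figures alone, the break-even comes out near 58 percent. That need not agree with the roughly 70 percent threshold cited earlier, since this sketch looks only at the chance of scoring at least one run in the three situations shown and ignores everything else about the game state.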
With this background, I wanted to measure how often a stolen base directly led to a run scoring. My criteria:
- A stolen base was necessary if it had to occur in order for a run to score. For example, assume Jacoby Ellsbury gets on base, steals second and is driven in on a single by Derek Jeter and no other runs score in the inning. It is highly unlikely Ellsbury would have scored without stealing second.
- Using the same example, if Jeter hit a double, I would argue Ellsbury's stolen base was not necessary. This is subjective in that runners on first score on doubles only around 50% of the time (see table), but I'd argue the steal was not necessary. Besides, it's my criteria.
- If Jeter or anyone else like Carlos Beltran or Mark Teixeira hit a home run, or if runs score after Ellsbury scores, I would argue the stolen base was not necessary--he would've been driven in regardless of whether he stole the base or not.
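For anyone who wants to apply the three criteria above to play-by-play data, here is a rough sketch of them as a rule for steals of second. The record layout and field names are hypothetical, not Baseball-Reference's format. Treating a triple like a double is also my assumption, since the criteria only mention singles, doubles and home runs.

```python
from dataclasses import dataclass


@dataclass
class StealOfSecond:
    """Hypothetical record for one successful steal of second."""
    runner_scored: bool        # did the base stealer eventually score?
    scoring_play: str          # "single", "double", "triple", "home_run" or "other"
    later_runs_in_inning: int  # runs scored in the inning after the stealer scored


# Hits on which the runner is assumed to score from first anyway.
# Counting "triple" here is an assumption; the criteria do not mention it.
SCORES_FROM_FIRST_ANYWAY = {"double", "triple", "home_run"}


def steal_was_necessary(steal: StealOfSecond) -> bool:
    """Apply the criteria listed above to one steal of second."""
    if not steal.runner_scored:
        return False  # no run, so the steal cannot have led to one
    if steal.later_runs_in_inning > 0:
        return False  # later runs suggest he would have been driven in regardless
    if steal.scoring_play in SCORES_FROM_FIRST_ANYWAY:
        return False  # an extra-base hit scores him without the steal
    return True       # e.g. steals second, scores on a single, no more runs


# The Ellsbury and Jeter examples from the criteria:
print(steal_was_necessary(StealOfSecond(True, "single", 0)))    # True
print(steal_was_necessary(StealOfSecond(True, "double", 0)))    # False
print(steal_was_necessary(StealOfSecond(True, "home_run", 0)))  # False
```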
I understand this is entirely retrospective, done with 100 percent hindsight. I need to make clear I'm not debating base stealing as an effective strategy as much as measuring how often it works. Argue amongst yourselves whether my criteria are fair, but they're what I chose. Here's how the data looks for the 2009-2013 seasons:
Using 2009 to explain, there were 2,540 steals of second, and 929 of those runners eventually scored. Of those, 581 (the "Nec," or necessary column) directly led to the only run scored in the inning (or the last--my criteria only state that no further runs are scored in the inning). By my criteria, 22.9 percent of the steals of second were necessary to score.
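The 2009 arithmetic is easy to check. The short sketch below uses only the three counts quoted above; the variable names are mine, and the two extra ratios are simply other ways of slicing the same counts.

```python
# 2009 counts for steals of second, as quoted in the text.
steals_of_second = 2540
runner_scored = 929
necessary = 581


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


necessary_share = share(necessary, steals_of_second)    # the "necessary" rate
scored_share = share(runner_scored, steals_of_second)   # stealers who scored at all
necessary_of_scored = share(necessary, runner_scored)   # necessary among those who scored

print(f"Necessary, share of all steals of second: {necessary_share:.1f}%")
print(f"Stealer scored, share of all steals of second: {scored_share:.1f}%")
print(f"Necessary, share of steals where the runner scored: {necessary_of_scored:.1f}%")
```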
Steals of third are different since it's easier to bring the runner home with a hit, sacrifice fly, or productive out, depending on the number of outs in the inning, but the numbers don't increase dramatically. I show the steal attempts of home for informational purposes, noting attempts fell dramatically in 2013 for reasons I can't explain.
When the numbers suggest around 80% of stolen bases don't directly lead to a run being scored, the tendency is to paint with a broad brush and state "never steal," which is not even remotely close to what I'm suggesting. Down one run in the ninth with a right-handed pitcher on the mound and Rajai Davis on first with zero outs, it's a pretty safe bet what's going to happen. Turn that Tiger into Victor Martinez and the discussion is entirely different, probably even moot.
Stolen base attempts often run countercyclical to offense--when offense increases, stolen base attempts decrease. It's too early to definitively state if the decrease in offensive production since 2008 is a blip or a long-term trend, but stolen base attempts have not risen with this corresponding decrease. There could be numerous reasons for this, but one of them might be the understanding that the cost of an out made in an attempted steal is greater than the potential benefit of a stolen base, particularly when paired with the understanding of just how infrequently a steal was truly necessary.
The proper use of Big Data in baseball isn't identifying macro trends and blithely dismissing strategies out of hand, but instead using it to influence the 5-10 games a year that come down to a game decision and seeing if the data can provide a competitive advantage or novel insight. With five teams making the playoffs in each league, the importance of even one game takes on additional significance, and it makes sense to see if time-honored strategies like base stealing actually translate into runs, and by extension wins.
...
Data adapted from play-by-play data at Baseball-Reference.com and the Tangotiger Run Expectancy Matrix.
Scott Lindholm is a web columnist for 670 The Score in Chicago. Follow him on Twitter at @ScottLindholm.