I don't need to tell readers of this fine internet publication that strikeout rates are increasing, and have been for a while. In 2008, 17.5% of plate appearances ended in a strikeout; the rate has steadily risen since then, at 18.0% in '09, then 18.5%, 18.6%, 19.8%, 19.9%, and finally 20.5% in the (still young) current season. The reason for this is tough to nail down precisely. On the Effectively Wild podcast of April 18th, host Ben Lindbergh discussed the various potential causes with an all-star panel of guests that included Brian Bannister, Rob Neyer, Harry Pavlidis, and Dr. Alan Nathan. Possibilities discussed included an expanded (and more consistent) strike zone, pitch type frequency, rising pitch velocities, and others. I’ll leave it to other, smarter people to figure out exactly what’s behind the rise; for now, I’m going to stick to exactly what’s going on with the subsequent run values (based on linear weights) that are resulting from this drastic increase.
Linear weights as a method of run estimation dates back to very early in advanced baseball analysis; it was popularized, so to speak, by Pete Palmer by way of his Batting Runs concept. Long, detailed, and absolutely-worth-your-time explanations are available both online and in print; I’d recommend the wiki page at the TangoTiger blog and/or the FanGraphs library page. Simply put, events are assigned a run value based on the average change in run expectancy (including any runs that score) associated with that type of event. Those run values can be applied to a player's batting line in order to estimate the runs created by that player in that season. Linear weight values inform the coefficients (or weights - see the connection?) found in the equation for wOBA, one of the most common advanced metrics used these days. While it can get more complicated, certainly, what I’ll be using in this article is just that straightforward.
Using my database of baseball events as provided by Retrosheet, I determined the run expectancy of all 24 base-out states for all years with complete event logs (i.e., 1974-2013), separately for each year. Using that information, I found the average change in run expectancy (and average runs scored) for each of Retrosheet’s 23 event codes for each of the 40 years in the sample. Please note that all regular-season events are in the data set, including pitcher plate appearances, which may slightly skew values.* In the spirit of recent calls for openness in data-driven research, the SQL code used to find the weights is available on GitHub if you’d like to try this yourself, though you will almost certainly need to adjust it for your exact database structure. The repository can be found here.
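For readers who think better in code than in prose, the two steps described above can be sketched roughly as follows. This is a minimal Python illustration, not the SQL in the repository: the tuple layout, the function names `run_expectancy` and `linear_weights`, and the four-play toy inning are all assumptions made up for the example, and a real run would repeat this separately for each season.

```python
from collections import defaultdict

# Hypothetical play records (not Retrosheet's actual schema):
# (event, state_before, state_after, runs_on_play, runs_rest_of_inning)
# A state is (outs, bases); state_after is None when the play ends the inning.
# runs_rest_of_inning counts runs from this play through the end of the inning.
plays = [
    ("HR", (0, "---"), (0, "---"), 1, 1),
    ("K", (0, "---"), (1, "---"), 0, 0),
    ("K", (1, "---"), (2, "---"), 0, 0),
    ("K", (2, "---"), None, 0, 0),
]

def run_expectancy(plays):
    """Average runs scored through the end of the inning from each base-out state."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _event, before, _after, _runs, rest in plays:
        totals[before] += rest
        counts[before] += 1
    return {state: totals[state] / counts[state] for state in totals}

def linear_weights(plays, re_table):
    """Average change in run expectancy, plus runs scored on the play, per event."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for event, before, after, runs, _rest in plays:
        # An inning-ending play leaves a run expectancy of zero.
        re_after = 0.0 if after is None else re_table.get(after, 0.0)
        totals[event] += re_after - re_table[before] + runs
        counts[event] += 1
    return {event: totals[event] / counts[event] for event in totals}

re_table = run_expectancy(plays)
weights = linear_weights(plays, re_table)
```

With only one inning of data the resulting numbers are meaningless; the point is just the shape of the calculation.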
*Editor's Note: John tried to run the same analysis without pitcher PAs, but apparently, computers are imperfect and it crashed after 27 hours. If anyone thinks it's vitally important, you could probably ask him to try again.
During the past forty years, in strictly absolute terms, the value of a strikeout has stayed fairly steady, with the exception of a valley of increased negative value lasting from roughly 1995 to 2005; even accounting for that, at no point was the value outside of the range of -0.25 to -0.33 runs. What’s far more interesting is the value of a strikeout relative to any other kind of out.
I compared the value of every non-generic-out event to the generic out value in each year to get the relative value of each event. I then normalized by the sum across all weights. I also subsequently found the three-year moving average (starting in 1976) to smooth out any single-year abnormalities. What I found is that the relative value of a strikeout has steadily increased, beginning at roughly 0.1 runs worse than a normal out but in 2013 reaching only 0.02 runs worse. As strikeouts have increased, the run production penalty paid by hitters for striking out has lessened (relative to other outs).
Please note that the final of the three years for the moving averages is the point used for all graphs.
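The comparison and smoothing can be sketched in a few lines of Python. The weights below are invented placeholders rather than the values from my database, the names `relative_to_out` and `trailing_average` are hypothetical, and the normalization step is left out because its exact form depends on the full set of weights. The sketch also assumes the years are consecutive.

```python
# Hypothetical per-year weights; real values would come from the event data.
weights_by_year = {
    1974: {"K": -0.30, "OUT": -0.20},
    1975: {"K": -0.29, "OUT": -0.20},
    1976: {"K": -0.28, "OUT": -0.20},
    1977: {"K": -0.27, "OUT": -0.20},
}

def relative_to_out(weights_by_year, event, out_event="OUT"):
    """Value of an event minus the generic out value, year by year."""
    return {year: w[event] - w[out_event] for year, w in weights_by_year.items()}

def trailing_average(series, window=3):
    """Moving average keyed by the final year of each window."""
    years = sorted(series)
    smoothed = {}
    for i in range(window - 1, len(years)):
        span = years[i - window + 1 : i + 1]
        smoothed[years[i]] = sum(series[y] for y in span) / window
    return smoothed

relative_k = relative_to_out(weights_by_year, "K")
smoothed_k = trailing_average(relative_k)
```

Because each average is keyed by the last year of its window, the first smoothed point lands on 1976 when the data start in 1974.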
Looking at the same period, the relative run value of a home run has (until recently) followed the opposite trend. Using a three-year moving average again, the value of a home run steadily decreased by about a tenth of a run from 1974 to 2008, since which point its relative value has shot back up to the 1970s-era highs. These opposite trends made me think about Raul Ibanez.
2013 wasn't exactly the peak of Ibanez's career; in fact, if you check FanGraphs' WAR, he comes out at exactly replacement level. What that doesn't tell you is that he posted his highest wOBA and wRC+ since 2009, when he was worth 3.5 WAR. He accomplished this by hitting 29 home runs, fourth-most among MLB outfielders, which in turn was accomplished by taking an approach at the plate that accepted dramatically more strikeouts for additional HRs.
While his walk rate stayed in line with his career rate (8.5%), his strikeout rate jumped by ten percentage points - from 15.8% to 25.8%. At the same time, for the first time in his career he hit more fly balls than grounders and increased his FB% (39.2% to 42.9%), LD% (19.4% to 21.5%), HR/FB% (15.2% to 20.7%), and HR% (4.47% to 5.85%). In the mid-00s, given the relative run values of events, this may have been a costly decision; in the modern environment, with strikeouts not being as costly and HRs being worth more, this seems to have paid off reasonably well.
So, you might be asking - how many extra home runs are needed to make up for extra strikeouts? We should be able to figure this out numerically. On an absolute scale (normalized by the sum of weights) strikeouts in 2013 were worth about -0.25 runs, while home runs were at about +1.37; the exact ratio needed to balance out was 5.34 Ks per HR. However, this doesn't take into account the average value of whatever outcome is being sacrificed to create the Ks and HRs, so the exact ratio may shift a little if you want to look at a specific hitter.
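The arithmetic is simple enough to sketch. The function name `break_even_ks_per_hr` is hypothetical, and the inputs are the rounded "about" values quoted above, so the result will not match the 5.34 figure exactly; that figure comes from the unrounded weights.

```python
def break_even_ks_per_hr(k_value, hr_value):
    """Number of strikeouts whose combined run cost equals one home run."""
    if k_value >= 0:
        raise ValueError("strikeout run value is expected to be negative")
    return hr_value / -k_value

# Rounded 2013 values from the text: K about -0.25 runs, HR about +1.37 runs.
ratio = break_even_ks_per_hr(-0.25, 1.37)
print(round(ratio, 2))
```

The gap between this rounded result and 5.34 shows how sensitive the ratio is to the second decimal place of the strikeout value.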
Using the same three-year average as before, the exact ratio decreased gradually from 1974 to the mid-1990s (from about 5.4 to about 5.2), at which point it dove sharply to a floor of about 4.75 for the next ten years, after which it rose just as sharply back to 1970s-era values.
As an aside, you could also calculate linear weights strictly on batted ball type (as fly balls tend to rise when following this strategy), which would help in trying to look at a comparison between increased strikeouts and increased fly balls; this comes with its own set of problems, since you're relying on the judgment of some unknown stringer for classification (and so it can vary park-to-park and even game-to-game), and I won’t be discussing that here.
Now, obviously data independence is a major issue here; everything I talked about above is a part of the dataset that created the linear weights I used, so changing amounts of strikeouts and home runs will inherently change the value assigned to those events. As I mentioned before, this also includes pitcher batting. I attempted to run a version that excluded them but ran into some database issues; if I ever get results from that subset I'll share them in the comments.
Further, this isn't meant to be a wholesale strategy recommendation; the number of players who have both the ability to do this and a lack of other offensive production is probably pretty small. In fact, as it turns out, Ibanez can’t even do this; based on the ratio I found he would've needed a few more HRs just to even out. That said, I asked on Twitter for suggestions of players who have started "selling out for power," and got a few responses. Eno Sarris of FanGraphs suggested Marlon Byrd, who looks promising but is complicated by having missed most of 2012. If you look at the two-year change from 2011 to 2013, though, his increased HR% (4.14% from 1.86%) does more than make up for the jump in K% from 16.2% to 24.9%.
THT/FanGraphs writer Brad Johnson mentioned Kelly Johnson, who saw an increase in offensive value from 2009 to 2010 when he upped his K rate from 15.6% to 22.1% and his HR rate from 2.13% to 3.87%, covering that year’s ratio with room to spare. The point is, for all the concern over increasing strikeouts, there are certain batters for whom it’s maybe not a terrible situation.
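To make the player comparisons concrete, here is a rough Python check using only the rates quoted above. The helper `extra_ks_per_extra_hr` is a hypothetical name, and the sketch assumes that both rates are per plate appearance, so the ratio of the two changes approximates extra strikeouts per extra home run. It ignores the value of whatever outcomes were given up, and it compares against the 2013 ratio of 5.34 only, since the ratios for other years are not quoted in the text.

```python
def extra_ks_per_extra_hr(k_before, k_after, hr_before, hr_after):
    """Added strikeouts per added home run, from rates in percent of PA."""
    delta_hr = hr_after - hr_before
    if delta_hr <= 0:
        raise ValueError("home run rate did not increase")
    return (k_after - k_before) / delta_hr

# (K% before, K% after, HR% before, HR% after), as quoted in the article.
players = {
    "Ibanez": (15.8, 25.8, 4.47, 5.85),
    "Byrd": (16.2, 24.9, 1.86, 4.14),
    "K. Johnson": (15.6, 22.1, 2.13, 3.87),
}

BREAK_EVEN_2013 = 5.34  # Ks per HR, from the 2013 linear weights

ratios = {name: extra_ks_per_extra_hr(*rates) for name, rates in players.items()}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} extra Ks per extra HR")
```

A player comes out ahead when his figure is below the break-even ratio for the relevant season, which is why Byrd and Johnson look fine here while Ibanez falls short.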
For those interested, here's a table of the three-year average ratio for all Retrosheet years:
. . .
The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at retrosheet.org.
John Choiniere is mostly a researcher but occasionally a featured writer at Beyond the Box Score. You can follow him on Twitter at @johnchoiniere.