It's an argument that's been broached many times before (here's a compelling example written by Joe Posnanski): even with the rise of the closer, teams' winning percentages in close and late games haven't changed much. This chart shows teams' save percentage in ninth-inning "save situations" -- that is, up 1-3 runs when the inning starts. Note that if you calculate teams' winning percentage, the graph will change slightly, since a blown lead can be recovered in extra innings. But we can see that, after an initial rise following World War II, save percentage in these situations levels off from the start of the expansion era onward.

That's not to say the closer is entirely useless. We all know the game has become more difficult over time: hitters have improved as the pool of available players has grown. One could interpret this chart as an argument for closers: the increase in specialization (e.g., closers) lets teams bring in fresh, live arms against increasingly talented batters, keeping the percentage of close and late games won more or less stable.
Two weeks ago, I looked for evidence of the closer mentality by observing closers in non-save situations, but I couldn't find a reliable difference between their performance in save situations and in non-save situations. The natural follow-up is to look at "non-closers" in save situations: Can we observe other relievers performing below expectations when the game is on the line? Contrary to my expectations, I found that these non-closers underperformed in save situations relative to their performance in other games, although there are some obvious selection bias issues.
The most difficult part of this entire study is defining who is and is not a closer. For purposes of this article, we define closers as anyone with more than 30 saves in a season, and non-closers as anyone with fewer than 10 saves in a season; for simplicity, we ignore anyone with more than 10 but fewer than 30 saves. There are obvious selection biases plaguing both groups, of course. The closer group is entirely made up of those pitchers who performed well in save situations. At the same time, any would-be closers who underperformed or were injured get lumped in (unfairly) with the non-closer category.
Given that caveat, it's probably unsurprising that our closer group wins a higher percentage of games than our non-closer group. In this section, we looked at all games from 2010 through 2013 where a new closer or non-closer started the ninth inning with his team leading by one to three runs. This table shows the winning percentage and number of games pitched by year for each group. We observe a significant difference between the two groups, as confirmed by a two-proportion z-test.
Year | Closer Apps | Closer WP | Non-Closer Apps | Non-Closer WP |
---|---|---|---|---|
2010 | 453 | 0.938 | 142 | 0.866 |
2011 | 663 | 0.931 | 153 | 0.837 |
2012 | 537 | 0.927 | 193 | 0.839 |
2013 | 693 | 0.925 | 112 | 0.866 |
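
For readers who want to check the z-test themselves, here is a minimal sketch in Python. It makes two assumptions that may not match how the original test was run: win totals are reconstructed by rounding appearances times winning percentage, and the four seasons are pooled into a single comparison. The function names are purely illustrative.

```python
import math

# (appearances, winning percentage) by season, copied from the table above.
closers = {2010: (453, 0.938), 2011: (663, 0.931), 2012: (537, 0.927), 2013: (693, 0.925)}
non_closers = {2010: (142, 0.866), 2011: (153, 0.837), 2012: (193, 0.839), 2013: (112, 0.866)}

def pooled_counts(group):
    """Return (approximate wins, appearances) summed over all seasons.

    Assumption: wins are reconstructed by rounding apps * WP, so they may
    differ slightly from the true game counts.
    """
    apps = sum(n for n, _ in group.values())
    wins = sum(round(n * wp) for n, wp in group.values())
    return wins, apps

def two_proportion_z(wins_a, n_a, wins_b, n_b):
    """Two-sided two-proportion z-test using the pooled standard error."""
    p_a, p_b = wins_a / n_a, wins_b / n_b
    p_pool = (wins_a + wins_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z, p_value = two_proportion_z(*pooled_counts(closers), *pooled_counts(non_closers))
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

Running the same function on a single season's row instead of the pooled counts gives a year-by-year version of the comparison, at the cost of smaller samples.
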
This effect does not appear to be driven by team or opponent strength. The following table shows the expected winning percentage for the teams employing the closers and the non-closers, as calculated from the expected Pythagenpat winning percentages for each game. Both groups sit close to .500 in every season.
Year | Closer Apps | Exp Closer WP | Non-Closer Apps | Exp Non-Closer WP |
---|---|---|---|---|
2010 | 453 | 0.509 | 142 | 0.481 |
2011 | 663 | 0.508 | 153 | 0.503 |
2012 | 537 | 0.521 | 193 | 0.507 |
2013 | 693 | 0.503 | 112 | 0.512 |
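
To give a sense of how a per-game expected winning percentage like this can be built, here is a rough Python sketch. The Pythagenpat exponent constant of 0.287 is one commonly used value, and combining the two teams' percentages with the log5 formula is my assumption about how a single-game expectation might be formed, not a description of the exact calculation behind the table. The run totals in the usage lines are hypothetical.

```python
def pythagenpat(runs_scored, runs_allowed, games, exponent_constant=0.287):
    """Expected winning percentage from runs scored and allowed.

    The exponent varies with the run environment: x = (runs per game) ** 0.287.
    The 0.287 constant is a commonly cited value, assumed here.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    x = runs_per_game ** exponent_constant
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)

def log5(p_team, p_opponent):
    """Chance that a team with true winning percentage p_team beats p_opponent.

    Using log5 to combine the two Pythagenpat figures is an assumption.
    """
    numerator = p_team * (1 - p_opponent)
    return numerator / (numerator + p_opponent * (1 - p_team))

# Hypothetical season totals for two teams (not taken from the study).
team = pythagenpat(runs_scored=750, runs_allowed=700, games=162)
opponent = pythagenpat(runs_scored=680, runs_allowed=720, games=162)
print(f"team {team:.3f}, opponent {opponent:.3f}, matchup {log5(team, opponent):.3f}")
```

Averaging the matchup values over every game in a group would give a figure comparable in spirit to the expected percentages in the table.
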
The closers also performed better than the non-closers in terms of OBP against. We can adjust for pitcher and batter quality using the log-odds ratio in a binary logistic regression. Using this regression, we produced the following table, which gives the expected OBP for an average batter facing an average pitcher in each of the four seasons in the study.
Year | League OBP | Avg Batter OBP | Avg Pitcher OBP | Non-Closer OBP | Closer OBP |
---|---|---|---|---|---|
2010 | .325 | .328 | .289 | .246 | .214 |
2011 | .321 | .323 | .287 | .244 | .213 |
2012 | .319 | .321 | .283 | .242 | .211 |
2013 | .318 | .319 | .282 | .240 | .209 |
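
The idea behind the log-odds adjustment can be sketched in a few lines of Python. This shows the standard odds-ratio way of combining a batter's OBP, a pitcher's OBP against, and the league OBP; the `situation_effect` term is a hypothetical stand-in for a fitted regression coefficient (closer vs. non-closer, for instance), not a value taken from the regression described above.

```python
import math

def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))

def matchup_obp(batter_obp, pitcher_obp, league_obp, situation_effect=0.0):
    """Expected OBP for a batter-pitcher matchup on the log-odds scale.

    situation_effect is a hypothetical additive log-odds term standing in
    for a fitted coefficient; it defaults to zero (no adjustment).
    """
    log_odds = logit(batter_obp) + logit(pitcher_obp) - logit(league_obp)
    return inv_logit(log_odds + situation_effect)

# 2010 inputs from the table above, with no situation adjustment.
baseline = matchup_obp(0.328, 0.289, 0.325)
print(f"baseline matchup OBP: {baseline:.3f}")
```

A negative `situation_effect` pushes the expected OBP below this baseline and a positive one pushes it above, which is how a group-level gap would show up after controlling for batter and pitcher quality.
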
At first blush, a large difference is expected not only because of the selection bias but also because of the difference in pitcher quality: The closer is ordinarily the best reliever on a team, and thus likely to outperform his bullpen counterparts. But the binary logistic regression allows us to control for pitcher quality, and the resulting difference of 30-40 points of OBP is larger than I personally feel comfortable writing off as solely caused by selection bias.
As a final step, we compare the performance of our non-closers in these late-game save situations to their performance in other situations. Here too, we expect the group's performance to be slightly worse in save situations because of those relievers who started the year as the closer but lost the job. And once again, there is the same 30-40 point gap between the non-closers' performance in save situations and the same pitchers' performance in all other situations, even after accounting for differences in batter and pitcher quality.
Year | League OBP | Average Batter | Average Pitcher | Non-Save OBP | Save OBP |
---|---|---|---|---|---|
2010 | .325 | .328 | .300 | .224 | .260 |
2011 | .321 | .322 | .293 | .218 | .254 |
2012 | .319 | .321 | .314 | .232 | .269 |
2013 | .318 | .320 | .307 | .227 | .265 |
Because of the issues with selection bias, these findings are not in and of themselves evidence of a "closer mentality" absent in non-closers. However, I believe these results are worth sharing because I expected the difference between these groups to be small at best, even in this "worst-case" grouping. I certainly didn't expect the appreciable gap between non-closers in close and late games and those same pitchers in other situations: Why would it be more stressful to pitch in front of 30,000 people up one run in the ninth than in front of 30,000 people up one run in the eighth?
I'd love to hear other potential definitions of "non-closers" that would allow me to capture only those pitchers forced into a rare save opportunity by a quirk of scheduling or bullpen usage; I'd love to look for these effects in a less suspect group of non-closers. And I feel confident there are enough of these emergency relief appearances each season to support a meaningful analysis.
. . .
All statistics courtesy of Retrosheet and Baseball-Reference.
Bryan Cole is a newly-minted Featured Writer at Beyond the Box Score. You can follow him on Twitter at @Doctor_Bryan.