Reliever and starter reliability in the playoffs

A further defense of giving the ball to a reliever in the most important moments.


It's October! It's the playoffs! Say what you like about the fairness of the one-game Wild Card playoff, but no one can deny the excitement it generates. The new format has also created a minor cottage industry out of manager criticism, since many of them don't seem to treat a single-elimination game with the panic it deserves.

It's become a late-September tradition for Dave Cameron at FanGraphs to positively implore certain managers to be more aggressive with their bullpens, with this year's edition focusing on the Astros. The argument is probably familiar to you, but a quick review: starters are generally better than relievers, but they get worse each subsequent time they face the same batters. By the second or third time through the order, fresh relievers will almost certainly outperform the starter, and with multiple days of rest before and after the game, there's not much cost to pulling him early. If the starter is on short rest, or not very good, the hook should be even quicker, perhaps after only two or three innings. A truly progressive manager might even open with a reliever, say a righty specialist, before bringing in his starter in the second or third inning, throwing the other manager's lineup plans into disarray, but that's probably too radical to happen anytime soon.

On Tuesday, Yankees manager Joe Girardi came pretty close to what Dave asked for, pulling Masahiro Tanaka after five innings and two runs, preemptively trying to avoid disaster where most managers would've reacted only after the game had been broken open. It worked, in that it kept the game close, even if it didn't win the Yankees the game, and the logical argument behind the strategy seems to make a lot of sense. Shockingly, some internet commenters disagree. "Nish" wrote the following on Dave's article:

One issue with this strategy is that there is a higher chance of a [reliever] "not having it" and melting down and costing his team the game. Assuming that each pitcher has say a 10% chance of just being completely awful, bringing in more pitchers increases the likelihood that one of the pitchers melts down...

Now, the math works out, but it relies on a pretty big assumption, namely, that a pitcher who throws five or six innings has the same chance of "melting down" as a pitcher who throws one or two. Intuitively, I don't think that's right; I think "meltdown likelihood" is probably a function of batters faced rather than appearances. We don't stop at intuition at Beyond the Box Score, though, so I turned to the data.
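To make the comment's scenario concrete, here's a toy calculation using its hypothetical 10% meltdown rate (an assumption from the comment, not real data). If meltdown risk is a fixed per-appearance probability, five relievers look far riskier than one starter; if the risk accrues per batter faced, covering the same hitters with five relievers carries exactly the same total risk.

```python
# Toy comparison of the two meltdown models, using the commenter's
# hypothetical 10% rate. Not real data, just the arithmetic.

# Model 1: meltdown risk is per appearance, regardless of length.
p_appearance = 0.10
p_five_relievers = 1 - (1 - p_appearance) ** 5  # ~41% that at least one melts down
p_one_starter = p_appearance                    # 10%

# Model 2: risk accrues per batter faced. If the 10% belongs to a
# ~22-batter start, the implied per-batter risk is much smaller.
batters = 22
p_batter = 1 - (1 - p_appearance) ** (1 / batters)

# Five relievers splitting those same 22 batters carry identical total risk.
p_relievers = 1 - (1 - p_batter) ** batters  # = 10% again

print(f"Per-appearance model: relievers {p_five_relievers:.0%}, starter {p_one_starter:.0%}")
print(f"Per-batter model:     relievers {p_relievers:.0%}, starter {p_appearance:.0%}")
```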

I set the window at the last five years of the playoffs and pulled all pitching performances. The obvious way of structuring this would be to compare performance in long appearances against short appearances, but that runs headlong into a major issue: pitchers who are doing a terrible job don't get left in the game, especially during the playoffs, and therefore never make the "long" group. Instead of looking at innings pitched, then, I looked at whether the pitcher started the game or came on in relief. This should be at least a decent proxy for intent, that is, for whether he was supposed to throw a few innings or many, regardless of what he actually ended up doing.
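For illustration, here's a minimal sketch of how that split and the rate stats could be computed; the file name and column names are hypothetical stand-ins, not the actual data source.

```python
import pandas as pd

# Hypothetical per-appearance playoff log; file and columns are assumptions.
apps = pd.read_csv("playoff_appearances_2011_2015.csv")

# Role is the proxy for intended length: did the pitcher start the game?
apps["role"] = apps["is_starter"].map({True: "Starter", False: "Reliever"})

totals = apps.groupby("role")[["outs", "runs", "walks", "strikeouts", "homers"]].sum()
totals["IP"] = totals["outs"] / 3

# Standard per-nine rates: events per inning, scaled to nine innings.
for rate, col in [("RA9", "runs"), ("BB/9", "walks"), ("K/9", "strikeouts"), ("HR/9", "homers")]:
    totals[rate] = totals[col] / totals["IP"] * 9

print(totals[["IP", "RA9", "BB/9", "K/9", "HR/9"]].round(2))
```

The actual numbers came out like this: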

Role      IP      RA9   BB/9  K/9   HR/9
Reliever  1046.3  3.32  3.20  9.69  0.82
Starter   1998.7  4.07  2.76  7.89  0.93

Relievers have been significantly better than starters in the playoffs over the last five years, without much doubt. They give up more walks, but fewer home runs, and strike out almost two additional batters per nine innings, all of which adds up to their substantially lower rate of runs allowed per nine. This is not shocking; throwing one inning is easier than throwing four or more.

The commenter's point was about consistency, however, not just skill, so I coded each appearance as a "success" or a "disaster." I made two versions, one for RA9 and one for FIP, and if an appearance was among the worst 20% of the sample (i.e., the highest RA9 or FIP), it was coded as a disaster. For appearances in which the pitcher didn't record an out, leaving RA9 and FIP undefined, I coded the RA9 version as a disaster if he allowed a run, and the FIP version if he allowed a walk or a home run. Rigorous? Nope! Good enough? Probably!
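Continuing the hypothetical dataframe from the sketch above (per-appearance RA9 and FIP columns are assumed), the coding might look something like this:

```python
# Flag the worst 20% of appearances as disasters. Cutoffs come from
# appearances with at least one out, since RA9 and FIP are undefined
# at zero outs; those get the special-case rules instead.
with_outs = apps[apps["outs"] > 0]
ra9_cut = with_outs["RA9"].quantile(0.80)
fip_cut = with_outs["FIP"].quantile(0.80)

no_outs = apps["outs"] == 0
apps["ra9_disaster"] = (apps["RA9"] >= ra9_cut) | (no_outs & (apps["runs"] > 0))
apps["fip_disaster"] = (apps["FIP"] >= fip_cut) | (
    no_outs & ((apps["walks"] > 0) | (apps["homers"] > 0))
)

print(apps.groupby("role")[["fip_disaster", "ra9_disaster"]].mean().round(2))
```

Coded that way, the disaster rates came out like this: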

Role      Appearances  FIP Disaster Rate  RA9 Disaster Rate
Reliever  1017         16%                18%
Starter   354          35%                25%
Total     1371         21%                20%

The story here fits with the first table: relievers in the playoffs were very bad less often than starters were. If I understand the commenter's argument correctly, it asserts that the stats for relievers as a group are misleading, because the group mixes excellent relievers with very bad ones, and the excellent ones pull the average down while hiding a high number of disaster performances. This table suggests that simply isn't true; the distribution of reliever skill is pretty similar to that of starter skill, relievers are just better.

But! Perhaps there is one last potential savior for this idea. The first table also showed that relievers were generally better than starters in the playoffs; maybe we shouldn't be comparing these pitchers to each other, but to themselves, i.e., to their RA9 and FIP from the regular season. This controls for the fact that playoff relievers seem to be generally better than playoff starters. If a pitcher's RA9 in an appearance was at least 50% worse than his regular-season RA9, it was coded as a disaster; same standard for FIP, and I kept the same rules as above for pitchers who didn't record an out.
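In code, again continuing the same hypothetical dataframe, and assuming regular-season baselines joined in as season_RA9 and season_FIP columns (also assumptions):

```python
# Version 2: a disaster is an appearance at least 50% worse than the
# pitcher's own regular-season mark, with the same zero-out rules.
no_outs = apps["outs"] == 0
apps["ra9_disaster_v2"] = (apps["RA9"] >= 1.5 * apps["season_RA9"]) | (
    no_outs & (apps["runs"] > 0)
)
apps["fip_disaster_v2"] = (apps["FIP"] >= 1.5 * apps["season_FIP"]) | (
    no_outs & ((apps["walks"] > 0) | (apps["homers"] > 0))
)

print(apps.groupby("role")[["fip_disaster_v2", "ra9_disaster_v2"]].mean().round(2))
```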

Role      Appearances  FIP Disaster Rate V2  RA9 Disaster Rate V2
Reliever  1017         22%                   21%
Starter   354          37%                   34%
Total     1371         26%                   24%

This is more evidence that, if anything, relievers can be counted on to perform near their seasonal averages more reliably than starters can. That's three different ways of looking for any indication that relievers are more inconsistent than starters, or that five one-inning appearances are more likely to yield runs than one five-inning appearance, and none of them turned anything up.

Much of this is probably a function of the playoffs. Memory is an unreliable tool: when thinking of relievers, it's easy to recall a catastrophic appearance, while the numerous boring innings of two groundball outs and a strikeout are forgotten. That one reliever is made to stand in for all of them, but the relievers who pitch in the playoffs are much better than the general regular-season pool. With more days of rest between games, managers can concentrate innings among their best relievers to a greater degree, some of whom were probably starters during the regular season and thus unavailable for late-inning work then.

The other reason I think this argument resonates with so many people? The same reason I couldn't use performance by innings: when pitchers implode, it is usually in a short appearance, because they get pulled before it can become a long one. The problem is that this doesn't imply the converse; short appearances aren't therefore more likely to end in an implosion. Is it possible that they are, and I'm just not looking the right way? Sure. Is it likely? Looking at the tables above, no. Someday, someone will go full bullpen, with lots of short appearances, and it seems like it'll be a good idea.

. . .

Henry Druschel is a Contributor at Beyond the Box Score. You can follow him on Twitter at @henrydruschel.