All of us know by this point that it benefits us to use as many stats as we can when evaluating pitching performance. We cannot simply stop at ERA, FIP, ERA+, ERA- or even WAR and call it a day. We instead have to consider the whole spectrum of data that is available to us.
This isn't to say, however, that we've all stopped using these metrics on their own as quick and instant evaluators. It seems almost impossible not to lean heavily on one singular statistic in a post-Twitter world, where every argument is scaled down into a rudimentary, bite-sized 140-character format. But when consumed just one bit at a time, this form of abbreviated evaluation can be very misleading.
Generally it seems that a plurality of people prefer to use ERA+ as their quick go-to stat for pitchers (though I'm just guessing here), and I'm certainly just as guilty as any of us of doing so on occasion. Some of us have moved on to FIP and its league-adjusted, park-adjusted brethren, but today I want to focus on one of the major pitfalls we can fall into when we lean too heavily on ERA+.
This same topic was addressed over at High Heat Stats this weekend, perhaps indirectly, when Adam Darowski re-evaluated the career of the under-appreciated Rick Reuschel. In listing off the reasons why Reuschel might have been undervalued by the general baseball fan throughout the course of the right-hander's career, Adam reveals this very enlightening, very telling bit of information:
He played in front of some horrendous defenses. Phil Niekro must sympathize with him. If you think about how bad his defenses were and how many unearned runs they must have led to, you realize that the actual number of unearned runs Reuschel allowed was probably unbelievably low. He helped this by not walking many players and not allowing many home runs. Those are easy ways to be beaten and Reuschel didn’t let his opponents do it.
Reuschel's career ERA+ is a modest 114 (88 ERA-) and not quite worthy of a Hall of Famer. And it's a good bet that writers and fans alike remember this about Reuschel. But the longtime Cub fares surprisingly well when it comes to WAR estimates. Why is this?
One of the many great features of Baseball-Reference's WAR is that, in addition to other factors, it adjusts for quality of defense. In fact, if we turn to xRA_def_pitcher, the component of WAR that attempts to quantify the number of runs for which a pitcher's defense was responsible, we find that since 1900 poor glovework has been more damaging to just five pitchers other than Reuschel:
WORST Career Defenses, since 1900
# | Name | Debut | IP | xRA_def |
---|---|---|---|---|
1 | Phil Niekro | 1964 | 5404.0 | -109.3 |
2 | Tom Candiotti | 1983 | 2725.0 | -77.5 |
3 | Larry Dierker | 1964 | 2333.3 | -77.0 |
4 | Ned Garver | 1948 | 2477.3 | -70.4 |
5 | Wilbur Wood | 1961 | 2684.3 | -69.6 |
6 | Rick Reuschel | 1972 | 3548.3 | -69.5 |
7 | Turk Farrell | 1956 | 1705.0 | -67.8 |
8 | Pedro Ramos | 1955 | 2355.3 | -66.2 |
9 | Kenny Rogers | 1989 | 3302.7 | -61.9 |
10 | Kevin Gross | 1983 | 2487.7 | -60.4 |
How has this affected Rick Reuschel's ERA+, then? How badly is ERA+ misrepresenting the run-preventing abilities of Adam's underdog?
Well, in order to determine this, we would need to adjust the formula to include Reuschel's expected Runs Allowed if he had a league-average defense, rather than his actual Runs Allowed.
Oh man, here comes xRA9+
Because we cannot presume to know how many of these theoretical runs would have been earned or not, we must first drop ERA+ in favor of RA9+. Conveniently, Reuschel's RA9+ is the same as his ERA+ at 114.
So the next step is to add his (negative) xRA_def to his RA totals from each season.
xRA9+ = [ (lgRA / lgIP * 9) / ((RA + xRA_def) / IP * 9) ] * park factor
This would give Reuschel a career xRA9+ of 120, significantly higher than his ERA+ of 114. Enough to change your opinion on his Hall of Fame worthiness? Perhaps not, but it is important enough to consider when conducting serious inquiries into the merit of Reuschel's career.
For individual seasons the difference between his ERA+ and xRA9+ (denoted as 'X-E' below) proves to be quite large at times:
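For anyone who wants to play along at home, here is a minimal Python sketch of the formula above. The function and parameter names are my own, the park factor is assumed to be expressed as a ratio (1.00 = neutral), and the result is multiplied by 100 to land on the familiar index scale, which the formula as written leaves implicit. Innings are assumed to be true decimals (240 2/3 innings is 240.667, not 240.7). The inputs are made up purely for illustration and are not Reuschel's real totals.

```python
def ra9(runs: float, innings: float) -> float:
    """Runs allowed per nine innings."""
    return runs / innings * 9


def xra9_plus(lg_ra: float, lg_ip: float, ra: float, xra_def: float,
              ip: float, park_factor: float = 1.0) -> float:
    """Defense-adjusted RA9+ following the formula in the text.

    xra_def is negative when the defense cost the pitcher runs, so adding
    it to RA lowers the pitcher's run total. The factor of 100 and the
    ratio-style park factor are assumptions of this sketch.
    """
    league_ra9 = ra9(lg_ra, lg_ip)
    adjusted_ra9 = ra9(ra + xra_def, ip)
    return 100 * league_ra9 / adjusted_ra9 * park_factor


# Hypothetical league and pitcher totals (illustration only).
plain = xra9_plus(lg_ra=450, lg_ip=900, ra=100, xra_def=0.0, ip=200)
adjusted = xra9_plus(lg_ra=450, lg_ip=900, ra=100, xra_def=-10.0, ip=200)
print(round(plain), round(adjusted))  # prints: 100 111
```

In this toy case a league-average pitcher who lost ten runs to his fielders over 200 innings moves from 100 to roughly 111, which is the same kind of shift we see in Reuschel's numbers.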
Rick Reuschel season by season xRA9+
Year | IP | ERA+ | RA9+ | xRA9+ | X-E |
---|---|---|---|---|---|
1972 | 129.0 | 127 | 131 | 138 | 11 |
1973 | 237.0 | 130 | 122 | 132 | 2 |
1974 | 240.7 | 89 | 90 | 108 | 19 |
1975 | 234.0 | 101 | 96 | 109 | 8 |
1976 | 260.0 | 110 | 107 | 114 | 4 |
1977 | 252.0 | 155 | 161 | 177 | 22 |
1978 | 242.7 | 115 | 122 | 125 | 10 |
1979 | 239.0 | 114 | 119 | 129 | 15 |
1980 | 257.0 | 114 | 111 | 121 | 7 |
1981 | 85.7 | 110 | 101 | 118 | 8 |
1981 | 70.7 | 136 | 132 | 123 | -13 |
1983 | 20.7 | 97 | 110 | 115 | 18 |
1984 | 92.3 | 74 | 79 | 80 | 6 |
1985 | 194.0 | 160 | 153 | 164 | 4 |
1986 | 215.7 | 97 | 97 | 93 | -4 |
1987 | 177.0 | 150 | 143 | 126 | -24 |
1987 | 50.0 | 92 | 87 | 86 | -6 |
1988 | 245.0 | 106 | 115 | 112 | 6 |
1989 | 208.3 | 114 | 117 | 108 | -6 |
1990 | 87.0 | 93 | 98 | 103 | 10 |
1991 | 10.7 | 80 | 89 | 86 | 6 |
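The X-E column is simply xRA9+ minus ERA+. As a rough sketch (the variable and function names are mine), three rows copied from the table above are enough to show the arithmetic and to pull out the seasons where the defense hurt or helped the most:

```python
# (year, IP, ERA+, xRA9+) copied from three rows of the table above.
seasons = [
    (1974, 240.7, 89, 108),
    (1977, 252.0, 155, 177),
    (1987, 177.0, 150, 126),
]


def x_minus_e(era_plus: int, xra9_plus: int) -> int:
    """Gap between the defense-adjusted index and plain ERA+."""
    return xra9_plus - era_plus


# Key by (year, IP) because split seasons repeat the year in the table.
gaps = {(year, ip): x_minus_e(era, xra) for year, ip, era, xra in seasons}

most_hurt = max(gaps, key=gaps.get)    # ERA+ undersells him the most
most_helped = min(gaps, key=gaps.get)  # ERA+ flatters him the most
print(gaps[most_hurt], gaps[most_helped])  # prints: 22 -24
```

A positive gap means ERA+ is underselling the pitcher because of poor defensive support; a negative gap means the gloves behind him were doing him a favor.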
But I can hear the younger generation of Beyond the Box Score's readership already: "Hey, what about me? I'm young and 19 years old and I party all the time and use questionable slang terms to describe my chicken nuggets. What do I care about Rick Reuschel?"
Well, for contemporary leaders, xRA_def generates some interesting leaderboards with some names you'd likely expect:
WORST Career Defenses, since 2000
# | Name | Debut | IP | xRA_def | xRA_def /200 IP |
---|---|---|---|---|---|
1 | Zack Greinke | 2004 | 1492.0 | -47.6 | -6.4 |
2 | Ricky Nolasco | 2006 | 1113.3 | -29.9 | -5.4 |
3 | Mark Buehrle | 2000 | 2679.0 | -27.1 | -2.0 |
4 | Brian Bannister | 2006 | 667.3 | -26.0 | -7.8 |
5 | Carlos Zambrano | 2001 | 1959.0 | -25.2 | -2.6 |
6 | Chien-Ming Wang | 2005 | 765.3 | -24.4 | -6.4 |
7 | Anibal Sanchez | 2006 | 869.0 | -24.2 | -5.6 |
8 | John Danks | 2007 | 971.3 | -24.1 | -5.0 |
9 | Josh Johnson | 2005 | 916.7 | -22.1 | -4.8 |
10 | Dontrelle Willis | 2003 | 1221.7 | -21.3 | -3.5 |
11 | Luke Hochevar | 2007 | 771.0 | -20.8 | -5.4 |
12 | Brian Lawrence | 2001 | 963.0 | -20.5 | -4.3 |
13 | Jose Contreras | 2003 | 1168.0 | -20.3 | -3.5 |
14 | D.J. Carrasco | 2003 | 493.7 | -20.2 | -8.2 |
15 | Kyle Davies | 2005 | 768.0 | -19.4 | -5.0 |
16 | Gavin Floyd | 2004 | 1127.0 | -18.8 | -3.3 |
17 | Nick Blackburn | 2007 | 818.7 | -18.0 | -4.4 |
18 | Paul Maholm | 2005 | 1332.7 | -17.8 | -2.7 |
19 | Jae Weong Seo | 2002 | 606.3 | -17.4 | -5.7 |
20 | Chris Volstad | 2008 | 695.3 | -17.1 | -4.9 |
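The last column just rescales each career total to a 200-inning workload, which makes pitchers with very different career lengths easier to compare. A small sketch of that arithmetic, with a hypothetical helper name and three rows copied from the table above (IP is treated as a plain decimal here, which is close enough at this precision):

```python
def per_200_ip(xra_def: float, innings: float) -> float:
    """Scale a career xRA_def total to a 200-inning workload."""
    return xra_def / innings * 200


# (name, IP, xRA_def) copied from the table above.
rows = [
    ("Zack Greinke", 1492.0, -47.6),
    ("Mark Buehrle", 2679.0, -27.1),
    ("D.J. Carrasco", 493.7, -20.2),
]

for name, innings, xra_def in rows:
    print(name, round(per_200_ip(xra_def, innings), 1))
```

Run on these rows, the rates match the table: Buehrle's large career total shrinks once it is spread over his innings, while Carrasco's smaller total turns out to be the steepest rate of the three.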
Greinke at the top of this list should surprise no one, as he's been one of the more famous E-F abusers in recent seasons, with a good chunk of that discrepancy owed to some terrible defensive support (namely one Yuniesky Betancourt).
For his career, Greinke breaks down like this:
Name | ERA+ | RA9+ | xRA9+ |
---|---|---|---|
Zack Greinke | 114 | 115 | 123 |
A difference of almost 10 'points' between his expected Runs Allowed and his actual Earned Runs Allowed is an important omission. It certainly goes a long way in beginning to explain the chasm of talent between Greinke and Chad Billingsley:
Name | ERA+ | RA9+ | xRA9+ |
---|---|---|---|
Chad Billingsley | 110 | 110 | 112 |
I think that we sometimes assume that variations in the quality of a pitcher's defense behind him will 'cancel out' over the course of a long career like Reuschel's or even Greinke's, but that is clearly not always the case. It's comparable to expecting a pitcher's park factors to neutralize over the course of a career, which is not even remotely true. We have to remember that these factors don't always stabilize. Park factors are accounted for in intermediate-level stats like ERA+ and ERA-, but defense is not. It is certainly something we ought to keep in mind when applying those metrics so definitively.
I understand that we're not all comfortable using modern defensive metrics, especially in smaller one-season samples. (I even wrote about the issue myself, just last week.) But when it comes to considering a pitcher's career value, a performance evaluator like xRA9+ might be something to consider adopting, at least in these extreme cases, especially since we're using it in the rWAR formula already (in a manner of speaking).
So, what do you think?
. . .
NOTE: I am aware of the fact that xRA_def_pitcher as it's listed at Baseball-Reference may already be park-adjusted and is actually in effect being park-adjusted again in my formula. It's not going to make a significant difference in these numbers in this article, but for future reference it would be something to consider.
Thanks to Baseball-Reference and the Lahman database.
James Gentile writes about baseball at Beyond the Box Score and The Hardball Times. You can follow him on Twitter @JDGentile.