Do Defensive Metrics Disagree Too Much?

Examining the greatest single-season discrepancies between some of the more popular defensive metrics of the day.

There was significant disagreement between defensive metrics over his defensive performance in 2011.
Kim Klement-US PRESSWIRE

One of the lazier criticisms of WAR tossed around in last week's straw-manhunt was that you can get "wildly different" WAR totals depending on which version of WAR you use. As most of us are fully aware, though, having several different WAR methods is actually preferable to a single uniform calculation, because it's entirely possible that any one of the WAR systems is undervaluing or overvaluing a player.

Comparing the WAR calculated at Baseball-Reference, Fangraphs, and Baseball Prospectus allows us to build a more complete, well-rounded view of a player's value from several angles, rather than being constricted to one narrow and limited perspective.

For position players, the 'wild' differences stem mainly from the different replacement levels that each system has in place, but also from the different methods each uses to estimate a position player's defensive runs above average. For contemporary seasons, Baseball-Reference uses DRS, Fangraphs uses UZR, and Baseball Prospectus uses its own FRAA.
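To see why the defensive component matters so much, here is a minimal sketch of how a simplified WAR-style calculation shifts when everything is held fixed except the fielding estimate. The component model and the non-fielding values are invented for illustration only (no actual system works exactly this way); the three fielding figures are Nomar Garciaparra's published 2002 marks from the tables below, and ten runs per win is the common rule-of-thumb conversion.

```python
RUNS_PER_WIN = 10.0  # common rule-of-thumb conversion from runs to wins

def simple_war(batting, baserunning, fielding, positional, replacement):
    """Toy WAR: sum the run components, then convert runs to wins.
    Real systems differ in many details; this only shows the mechanics."""
    return (batting + baserunning + fielding + positional + replacement) / RUNS_PER_WIN

# Same hypothetical offense, three published fielding estimates
# (runs above average) for Nomar Garciaparra, 2002.
for source, fld in [("B-Ref", 15.9), ("UZR", -10.1), ("FRAA", 16.7)]:
    print(f"{source:5s} fielding {fld:+6.1f} -> WAR {simple_war(35, 2, fld, -7, 20):.1f}")
```

Swapping only the fielding input moves this toy WAR by about two and a half wins, which is exactly the scale of disagreement discussed below.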

Plenty has been written before regarding the frequency with which these metrics disagree with one another -- even BTBS wunderkind Glenn DuPaul wrote about it with regard to the 2011 and 2012 seasons. But I can't recall ever seeing a full list of the greatest discrepancies among the major public defensive evaluators since their inception in 2002.

So, to get an idea of just how divergent these metrics can be when evaluating the same player in the same season, let's first compare the greatest deltas between Baseball-Reference's Rfield and Fangraphs' UZR.

# Player Year UZR (Fangraphs) Rfield (B-Ref.) Diff
1 Nomar Garciaparra 2002 -10.1 15.9 26.0
2 Craig Counsell 2005 7.6 30 22.4
3 Alex Gonzalez 2010 5.1 27 21.9
4 Juan Pierre 2010 12.4 -9 21.4
5 Mark DeRosa 2008 3.2 -18 21.2
6 Austin Jackson 2011 7.9 29 21.1
7 Darin Erstad 2002 17.8 38.7 20.9
8 B.J. Upton 2010 1.4 -19 20.4
9 Mark Kotsay 2002 9.2 -10.4 19.6
10 Adrian Beltre 2002 19.6 0 19.6
11 Craig Biggio 2002 1.4 -18 19.4
12 Moises Alou 2004 18.4 -1 19.4
13 Derek Jeter 2002 0.9 -18.4 19.3
14 Ryan Braun 2010 -10.2 9 19.2
15 Alfonso Soriano 2010 5.1 -14 19.1
16 Orlando Hudson 2007 -2 17 19.0
17 Ben Zobrist 2011 10.2 29 18.8
18 Orlando Hudson 2006 -5.4 13 18.4
19 Rey Ordonez 2002 13.1 -4.8 17.9
20 Aaron Hill 2009 -4.9 13 17.9
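The "Diff" column above is simply the absolute gap between the two estimates. As a quick sketch of the ranking (the data shape here is assumed for illustration; the values are three rows from the table):

```python
# (name, year, UZR, Rfield) -- values as published in the table above
seasons = [
    ("Nomar Garciaparra", 2002, -10.1, 15.9),
    ("Craig Counsell",    2005,   7.6, 30.0),
    ("Juan Pierre",       2010,  12.4, -9.0),
]

# Rank player-seasons by the absolute disagreement between the two metrics.
ranked = sorted(seasons, key=lambda s: abs(s[2] - s[3]), reverse=True)
for name, year, uzr, rfield in ranked:
    print(f"{name} {year}: |{uzr} - ({rfield})| = {abs(uzr - rfield):.1f}")
```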

Nomar's 2002 season is the very worst-case scenario as far as disagreement between the two defensive metrics goes. Fangraphs' UZR pegged him as one of the worst defenders in baseball that year, while Baseball-Reference had him as one of the best. The 26-run difference is not a trivial matter, either -- at roughly ten runs per win, that equates to a swing of about two and a half wins.

Also of note:

  • Interestingly, seven of the top 20 greatest discrepancies occurred in the 2002 season, the first season for which the data needed to create both metrics are available (as far as I know, at least).
  • Not a single player-season from 2012 made the top 20.
  • Orlando Hudson makes the list in two consecutive seasons, once in 2006 and again in 2007, with UZR valuing him significantly below Baseball-Reference in both instances.
  • Alfonso Soriano had the greatest discrepancy in 2012 (16.8 runs).
  • Darin Erstad's 2002 total of 38.7 defensive runs above average ranks as the best defensive season ever at Baseball-Reference.

When we add Baseball Prospectus to the mix, some of the arguments get a bit rowdier:

# Player Year UZR (Fangraphs) Rfield (B-Ref.) FRAA (B-Pro.) Diff
1 Ben Zobrist 2011 10.2 29 -5.6 34.6
2 Hank Blalock 2003 14.9 27 -6.4 33.4
3 Craig Counsell 2005 7.6 30 -2.2 32.2
4 Clint Barmes 2006 8.4 26 -5.4 31.4
5 Ken Griffey Jr. 2007 -29.1 -14 2.1 31.2
6 Chase Utley 2008 19.4 31 1 30
7 Nomar Garciaparra 2002 -10.1 15.9 16.7 26.8
8 Jack Wilson 2005 14.4 32 6.1 25.9
9 Brian Schneider 2004 9 24 -0.9 24.9
10 Jeff Francoeur 2010 3.8 17 -7.7 24.7
11 Gary Sheffield 2005 -26 -14 -1.9 24.1
12 Jack Wilson 2009 15.1 28 3.9 24.1
13 Adrian Beltre 2008 11 27 3.3 23.7
14 Ian Kinsler 2007 -8.6 4 15 23.6
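With a third system in the mix, the "Diff" column becomes the spread between the highest and lowest estimate rather than a pairwise gap. A small sketch, checked against the Zobrist row above:

```python
def spread(*estimates):
    """Spread across any number of run estimates: max minus min."""
    return max(estimates) - min(estimates)

# Ben Zobrist, 2011: UZR 10.2, Rfield 29, FRAA -5.6 (from the table above)
print(round(spread(10.2, 29, -5.6), 1))  # the table's Diff of 34.6
```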


What leads these systems to such dramatically different estimates of value for the same players in the same season? That's not something I can answer entirely. I imagine you would need in-depth knowledge of what happens under the hood of both zone-based metrics to make sense of the discrepancies, and we know FRAA is built from play-by-play data only.

I imagine some percentage of the readership may see these differences and interpret them as evidence that defensive metrics still have a ways to go before they can be deemed reliable. And I certainly won't argue with that interpretation. But I believe this should also reinforce exactly why we should not limit ourselves to just one universal WAR metric, and why we should continue to allow ourselves more than one way of evaluating player performance in any regard.

. . .

James Gentile writes about baseball at Beyond the Box Score and The Hardball Times. You can follow him on twitter @JDGentile.