Kevin Liles-US PRESSWIRE
Many analysts in the sabermetric community have reservations about the complex statistics used as part of the defensive component of the wins above replacement framework. Is there a better way?
According to FanGraphs, Darwin Barney was an excellent defensive player in 2012. Barney, the everyday second baseman for the Chicago Cubs, was worth about 13 runs, or 1.3 wins above replacement, due to his defensive contributions at the pivot. This is a very good number.
According to Baseball-Reference, Darwin Barney was the most excellent defensive player in baseball in 2012, along with Brendan Ryan. In terms of wins above replacement, B-R had Barney at about three-and-a-half defensive WAR. This is a tremendous amount of value ascribed to a single player -- especially one who does not play shortstop. It can mean the difference between a good starter and an All-Star level of performance.
These two comparisons of value are just a bit different.
This is a symptom of a greater problem in sabermetric circles these days: can we trust single-season WAR values when the defensive component of WAR is, let's just say, "debatable"? For a while now, there's been a lot of discussion about how useful the defensive metrics that go into WAR are, especially on a year-by-year basis.
The most common refrain I hear is that the sample sizes used to judge a fielder's performance in a given year are simply too small to ascribe much real weight to these numbers. Nevertheless, a good (or bad) defensive season can mean a gigantic swing in wins above replacement. While the WAR metrics aren't used as catch-alls by everybody, a dramatic swing in a player's perceived defensive value can show up in plenty of articles and analyses about how much that player is worth, and how useful their performance has been or can be.
Perhaps what we need to do is *deep breath* rely less on the UZR / DRS / TZ / FRAA numbers, and more on a scouting perspective.
I can just see some of my Beyond the Box Score colleagues shaking their heads and writing me off. Next thing you know I'll be advocating for Jack Morris in the Hall of Fame.
But hear me out. There's a very real issue with the way the saber community is providing a numerical value to a player's fielding contribution, and converting it into hypothetical runs (and wins). This isn't meant to be a dig at the phenomenal work that Mitchel Lichtman has done with UZR, or that Baseball Info Solutions has done with DRS, or any other fielding metric. We'd still be worse off if we were going by fielding percentage.
But nevertheless, there's a very real need (well, want) to create a runs / wins metric for fielding performance, as we attempt to numerically identify the whole of a player's contributions. So what's the solution?
What if we were to find a reputable scout, or team of scouts, or group of people that can function almost like a team of scouts, and develop a baseline expectation for a player's defensive performance during a season? Then we might be able to use this as a balance against the vagaries of single-season defensive statistical noise.
I'll give you an example.
David Wright is an interesting third baseman. I'm not a pro scout, but when I look at David, I see a third baseman who winds up being about average, defensively, at his position. This is due to a combination of good instincts, good range, okay hands, and a questionable throwing arm. Again, if you put a gun to my head (don't, please), I would say that he's probably average at his position, maybe just a hair below-average. But in 2012, he looked a little better than usual.
Advanced stats seem to tell a different story -- well, in practice, at least. UZR has pegged Wright as a bad defender over the period of 2009-2011. But it was consistent, as David was worth just about -10 runs defensively to his team in each of those seasons. In 2012, David was worth a positive 15 runs defensively. That's a two-and-a-half-win swing between 2011 and 2012, and a very different perception than the previous year. DRS saw something very similar: a combined -34 defensive runs saved from '09 to '11, but 16 runs in the black for 2012.
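The arithmetic behind that swing is easy to sketch. This assumes the usual rule of thumb of roughly 10 runs per win, which is also the ratio implied by the 13 runs / 1.3 wins figure for Barney above. The function and constant names are mine, purely for illustration, and the UZR values are the rounded ones quoted in this article.

```python
# Assumption: about 10 runs equal one win (a rule of thumb, not an exact constant).
RUNS_PER_WIN = 10.0


def runs_to_wins(runs, runs_per_win=RUNS_PER_WIN):
    """Convert a run value into wins using a flat runs-per-win rate."""
    return runs / runs_per_win


# Rounded UZR figures for David Wright as quoted in the text.
uzr_2011 = -10.0
uzr_2012 = 15.0

swing_runs = uzr_2012 - uzr_2011        # 25.0 runs
swing_wins = runs_to_wins(swing_runs)   # 2.5 wins

print(f"Swing: {swing_runs:+.0f} runs, or about {swing_wins:.1f} wins")
```

A different runs-per-win rate would move the answer a little, but not enough to change the point: that is a huge jump for one year of glovework.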
Now, I believe that there is a chance that David Wright's baseline defensive value changed dramatically from 2011 to 2012. Maybe Wright refocused his energy on taking grounders, or he bought a new mitt, or his back felt better, or his offensive outburst helped his confidence and that showed on defense. But I also believe that the small samples, both in 2012 and in previous seasons, didn't actually give us an accurate mathematical representation of his true defensive value to the team (not to mention his elusive "true talent level").
I don't think that David Wright was a god-awful defender in 2011. And I'm not sure he was a transcendent defender in 2012. But I'm also not a scout, and I didn't watch every Mets game last season, and it might not have mattered if I did. But I think it's just as likely that a respected scouting opinion, or something worthwhile like a robust Fans' Scouting Report, could give us a good view, at least from a scouting perspective, of how good a fielder a player is.
And I think that maybe we should use this to help build a new defensive WAR with the advanced metrics.
Here's my hypothetical: say that Joe Respected Scout (or Fans' Scouting Report) gives Wright a set of scores similar to what's in the Fans' Scouting Report now -- something we can boil down to a run value. Might it be a good idea to take this number and create an average (or some sort of weighted average) between this and the raw UZR or DRS number?
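As a back-of-the-envelope sketch, the blend might look something like this. Everything here is an assumption on my part: the function name, the 50/50 default weight, and especially the idea that a scouting grade has already been boiled down to a run value. The scout figure of 0 runs just stands in for "about average at third."

```python
def blended_fielding_runs(metric_runs, scout_runs, scout_weight=0.5):
    """Weighted average of a metric's run value and a scouting-based run value.

    scout_weight is the share given to the scouting number (0 = all metric,
    1 = all scouting). The 0.5 default is an arbitrary starting point.
    """
    if not 0.0 <= scout_weight <= 1.0:
        raise ValueError("scout_weight must be between 0 and 1")
    return scout_weight * scout_runs + (1.0 - scout_weight) * metric_runs


# Wright in 2012: UZR says +15 runs; a hypothetical scout says average (0 runs).
wright_2012 = blended_fielding_runs(metric_runs=15.0, scout_runs=0.0)

# Wright in 2011: UZR says about -10 runs; same hypothetical scouting view.
wright_2011 = blended_fielding_runs(metric_runs=-10.0, scout_runs=0.0)

print(wright_2012, wright_2011)  # 7.5 -5.0
```

With an even split, the 25-run swing shrinks to 12.5 runs. Whether an even split is the right weight is exactly the part I can't answer yet.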
I like to think of this not as "less stats" or "less precision" but as something more like regression. If smart, savvy talent evaluators think that David Wright is a scratch defender at third, but the data produces blips from year to year, maybe the data isn't right. Maybe when determining the player's overall value, even for a season, we need to examine something beyond these (really cool) advanced fielding statistics.
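To make the regression idea a little more concrete, here is one way the weight could depend on sample size instead of being fixed. This is a sketch under loud assumptions: the scouting run value plays the role of the baseline being regressed toward, and the `regression_chances` constant of 400 is a number I made up for illustration, not an estimate of how quickly fielding metrics stabilize.

```python
def regressed_fielding_runs(metric_runs, scout_runs, chances,
                            regression_chances=400.0):
    """Pull a metric's run value toward a scouting baseline.

    The metric earns more weight as fielding chances pile up:
        metric_weight = chances / (chances + regression_chances)
    regression_chances is a hypothetical tuning constant, not a known value.
    """
    if chances < 0 or regression_chances <= 0:
        raise ValueError("chances must be >= 0 and regression_chances > 0")
    metric_weight = chances / (chances + regression_chances)
    return metric_weight * metric_runs + (1.0 - metric_weight) * scout_runs


# Same +15 metric reading against an average (0 runs) scouting baseline.
small_sample = regressed_fielding_runs(15.0, 0.0, chances=400)    # 7.5
large_sample = regressed_fielding_runs(15.0, 0.0, chances=1200)   # 11.25

print(small_sample, large_sample)
```

The more chances behind the number, the more we believe it; with only a season's worth, the scouting view keeps a sizable vote.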
In truth, I'm not a savvy enough number-cruncher to come up with a statistically-sound way to do this on my own. But for now, I'm going to try and use this back-of-the-envelope methodology as a personal tool when evaluating a player's seasonal performance. Maybe it will be a complete disaster, as the opinions I seek are going to be too slanted, too out-of-date, or just too wrong. Maybe there's a better way to do this.
But what I've had to do before now, even as I learn more and more about how range factors and Good Fielding Plays and all the other things go into "advanced defensive stats," is ignore that defensive number when thinking about WAR, because it either didn't pass a smell test, or the sample size made it a flawed assessment tool.
Sometimes the numbers tell us too little about a player's performance, even when they're meant to tell us everything. Let's see if we can get closer.