Gary Sheffield and the war on WAR

Is Gary Sheffield really the worst defender ever? And are the numbers reliable enough to offset his historic bat?

White Sox v Yankees Photo by Ezra Shaw/Getty Images

Hall of Fame season, and the arguing it inflicts upon the baseball world and Baseball Twitter, is over. The very long fight for Tim Raines’ induction came to a happy conclusion this year, as he finally received the call to Cooperstown.

Then all was relatively quiet, until someone brought up Omar Vizquel, and... well, nobody needs that, not so soon after the heated debates of 2017 have ended. This, instead, is a puzzle. My hope is, if we talk about this now, ahead of the pressure and anger that Hall of Fame season can bring, we might calmly solve it together.

Here it is:

By the numbers, Gary Sheffield is the 9th-most productive offensive player since 1961 – and the 25th-most productive hitter ever. Ever!

By the numbers, Gary Sheffield is the least productive defender in baseball history.

In the just-completed Hall of Fame balloting, he received just 59 votes, a measly 13.3 percent, while a similarly productive hitter (Jeff Bagwell) was elected with a comfortable 86.2 percent of the vote. Our ever-increasing understanding of advanced stats and hidden contributions also lifted Raines into his rightful spot.

If you didn’t just read the breakdown above, you might peruse the career WAR or JAWS leaderboards and conclude that Sheffield’s total doesn’t seem odd at all. And I think you’d be missing something.

Our assessment of players, and the methods we use to make those assessments, are getting better and better. Our subjects, though, are seemingly getting more and more complicated.


The numbers referred to above are FanGraphs’ offensive and defensive WAR components. If you do the simple type of sorting that, be honest, we all love to do in search of trivia, fun facts, and perspective, Sheffield was more bad in the field than Andruw Jones was good. (It’s not just FanGraphs, though; by Baseball-Reference’s WAR, Sheffield is a close second-worst to Adam Dunn.)

You know how this goes. We all acknowledge, intellectually, that it isn’t this simple, but we skip across to the final column on the right and sort for WAR. The servers whirr and crunch the offensive and defensive (and baserunning) numbers – supernaturally, really, considering how much trouble we have mixing the similarly dissimilar oil and water – and spit out a number that we have become very confident in using to assign value to a player, to his legacy in the wider history of the sport.

In its infancy, it represented a huge leap forward, and warranted a fight to convince the game’s more stubborn scribes of its value. That, um, war, is won, but the lingering tribal mentality has probably seeped into our mental processes and created a type of blind belief – a deserved pride in the thing we correctly defend that has nonetheless depressed our powers of self-evaluation.

It’s not the worst thing in the world, and it’s certainly better than judging players based on RBI, or waiting for them to eclipse a number that happens to end in two or three zeros. However, our understanding of the game is not contained within those formulas. It’s in flux, hopefully expanding more than contracting, and Hall of Fame voting is one of the ways we wrestle it in one direction or the other.

Sheffield’s career offers us the chance to grapple with fascinating issues – how concrete can defensive ratings be, how should defensive value relate to offensive value – while perhaps finding a more accurate view of a player’s historical significance.

This isn’t the beginning of a Sheffield-to-the-Hall case (unless you end up reading it that way). It is an attempt to create reasonable doubt about the story his WAR value is telling us. Indeed, it’s an attempt to create reasonable doubt about the stories that a lot of players’ WAR values are telling us. Because many of them could use a new day in the court of informed baseball opinion.

New York Yankees v Chicago White Sox Photo by Jonathan Daniel/Getty Images

Offensive production is a relative breeze to explain and quantify. Sheffield was 41 percent better than the league-average hitter, by wRC+, over the course of a career that spanned 2,576 games. He finished with more walks than strikeouts. He hit 509 home runs. He stole 253 bases.

He provided the most menacing exhibition of bat speed the world has ever seen.

There are a lot of ways to slice and dice it, but he was an on-base monster who never struck out. His power was not quite as prodigious as that of some of his contemporaries, and his peak was less cohesive, but his overall offensive profile ended up as one of the era’s most incredible.

His offensive pinnacle came in 1996. After his 1995 season was cut short by an injury – and his recovery was interrupted by a gunshot wound – Sheffield produced one of the 10 most potent offensive seasons of the ’90s. It was a Barry Bonds season in that era’s terms. In fact, Bonds also produced a Bonds season that year, and Sheffield was nonetheless slightly better.

Bonds: .308/.461/.615, 42 HRs, 179 wRC+

Sheffield: .314/.465/.624, 42 HRs, 185 wRC+

Sheffield walked in 21 percent of his plate appearances that season while striking out in just nine percent of them, making a mockery of BB:K. By adjusted stats, it was just a tad less dominant than Bryce Harper’s 2015.

He also achieved longevity, sustaining his performance at the plate so well that his career 141 wRC+ ended up matching Alex Rodriguez’s career mark and surpassing Reggie Jackson’s, clearly superior players who endured less graceful declines.

Going back to his place among the game’s most productive offensive players for a moment, Sheffield’s modern neighbors in this metric are Manny Ramirez (22nd all-time), Bagwell (23rd), Chipper Jones (28th), Mark McGwire (29th) and Jim Thome (30th).

Ramirez has well-known issues outside of his performance that affect voters’ view of him, but he is nearly Sheffield’s equal in negative defensive value. Meanwhile, Thome, who played a total of 55 innings in the field after 2005, will arrive on the ballot next year with 600 homers and a very good chance at the Hall. Just ahead of that group is Hall of Famer Frank Thomas, the 19th-most productive offensive player of all-time. He played no more than 50 games in the field in any season after 1997.

Are the complete stories of these players different for some reason of skill? Did the stalwart right fielder actually provide less to his team than the longtime designated hitters? Or are we glazing over extreme readings and incongruent circumstances that, if presented plainly, might attract our attention?

ALDS: New York Yankees v Los Angeles Angels of Anaheim - Game 5 Photo by Stephen Dunn/Getty Images

One of WAR’s stated goals is to provide a way to compare players across positions and skillsets and eras. To do that, it has to estimate the relative value of being able to play shortstop vs. first base, or catcher vs. right field. Or rather, it has to estimate the relative weight of responsibility that being able to convince a manager you should play shortstop bestows upon you.

This has some pitfalls. Sheffield probably should have never set foot on a Major League infield. But, when he came up with the Brewers, he stepped right into the shortstop role. That went poorly! Then, he slid to third base. That also went poorly! Still, he managed to stick there for four seasons – accumulating several bad seasons and one of the worst defensive seasons ever – before the Marlins saw enough after acquiring him in the middle of 1993. In 1994, he switched permanently to right field. Once the positional adjustment was factored in, his defensive value steadied itself at “really bad” instead of continuing to plummet toward the earth’s core.

And this is where WAR’s usage becomes tricky. It is extremely helpful in comparing the values of a competent shortstop who can hit and a not-very-good right fielder who can mash. But it is a formula, and it can’t fully account for the many factors that lie outside those players’ control, or for the logical breakdowns that might occur in the course of reality.

After all, pre-2002 defense is measured by Total Zone – a stat that is generated entirely from play-by-play logs. In 2002, UZR began its reign, as the data it requires was finally collected on a consistent basis. It is a much, much more advanced metric that uses batted ball type and location categorizations to produce run values above or below positional averages. But it still relies on the subjective observations of human stringers to create that data.

Here’s the possible breakdown reality has dealt us: Sheffield, while playing the outfield, had six seasons where he supposedly cost his teams more runs on defense than Thomas did in ANY season where he primarily served as DH. The extreme nature of Sheffield’s numbers is tough to overstate. Yet he was never shoved into the DH role, and at this point, it’s worth wondering whether his continued effort to play the field is casting an undue shadow over our impression of his career.

To get an idea of how this was intended to work, I went back to what UZR creator Mitchel Lichtman wrote in explaining how to use the stat:

Personally, no matter what size sample of data I look at, I always do a mental regression. For a one-year UZR, I mentally regress UZR halfway toward the mean, which means basically to “cut it in half” since the mean is defined more or less as zero. If you want to refine that “rule of thumb” a little, you can regress a player’s UZR (per 150 games) toward +2 for a fast player, -2 for a slow player, and zero for anyone in between.

Now, he was undoubtedly offering that advice to keep people from placing too much stock in small sample sizes. Sheffield provided a very large sample that said he was a very bad outfielder. However, it’s worth reading that again for the larger point: Extreme data points shouldn’t be taken ultra-literally.
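Lichtman’s rule of thumb is simple enough to write down. What follows is a minimal Python sketch of it, not anything taken from UZR itself: the function name, the speed labels and the `weight` parameter are illustrative assumptions, while the halfway regression and the +2 / 0 / -2 targets come straight from the quote.

```python
# Hypothetical helper illustrating the one-year "mental regression"
# described in the quote. Names and structure are assumptions for
# illustration, not part of any real UZR implementation.

# Regression targets (UZR per 150 games) from the rule of thumb.
SPEED_PRIOR = {"fast": 2.0, "average": 0.0, "slow": -2.0}


def regress_uzr(uzr_per_150, speed="average", weight=0.5):
    """Pull a one-year UZR/150 toward a prior mean.

    weight is the share of the gap to the prior that gets removed;
    0.5 is the "cut it in half" case for a single season.
    """
    prior = SPEED_PRIOR[speed]
    return prior + (1.0 - weight) * (uzr_per_150 - prior)


# A made-up -20 season, read literally vs. regressed.
print(regress_uzr(-20.0))                # average runner
print(regress_uzr(-20.0, speed="slow"))  # slow runner
```

With the default weight, a -20 reading for an average runner is treated as -10, and the same reading for a slow player lands at -11, a touch worse because the target is -2 rather than zero.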

FanGraphs’ explanation of its broader formulations has fine print in the same vein:

For position players, the largest point of contention comes in measuring defense and estimating the positional adjustment. Our measures of both are more uncertain than our measures of offense, so players who get a good amount of their value through their defensive ratings likely have more uncertainty around their WAR value than players who have defensive value closer to average.

Sheffield clearly does not deserve the benefit of the doubt Lichtman might offer players with only one or two years of data. But in 2004, Sheffield moved to the Yankees and saw his defensive runs above average drop to -21.1 from just -4.8 the previous year in Atlanta. The next year, it cratered at -34.1, which is either exaggerated or evidence of managerial malpractice.

He also had a -21.4 season wedged between slightly below average seasons, for instance. If we regress his extreme seasons (just in the outfield, since we can all acknowledge he was badly miscast in the infield) even by half of the half that Lichtman suggested, Sheffield is suddenly moving from good-but-borderline Cooperstown case to should-be lock (aside from the steroid whispers that won’t be going away).

The math problem then amounts to this: If you’re willing to consider that Sheffield’s negative defensive numbers might be overblown by 15 to 25 percent, you’re looking at a case that ranges from, “Well, he’s definitely better than Tony Gwynn,” to “Um, yes, we need to enshrine turn-of-the-millennium Reggie Jackson.”
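To get a feel for the size of that adjustment, here is a rough Python sketch that applies a 15 and a 25 percent discount to the three seasons quoted above (-21.1, -34.1 and -21.4 runs). It is an illustration under stated assumptions, not a recalculation of his career: the list covers only the seasons named in this piece, and the function names are hypothetical.

```python
# Only the seasons quoted in this article (defensive runs above average).
# This is NOT Sheffield's full defensive record.
QUOTED_SEASONS = [-21.1, -34.1, -21.4]


def discount_negative(runs, fraction):
    """Shrink negative values toward zero by `fraction`; leave the rest alone."""
    return [r * (1.0 - fraction) if r < 0 else r for r in runs]


def runs_recovered(runs, fraction):
    """Total runs handed back to the player by the discount."""
    return sum(discount_negative(runs, fraction)) - sum(runs)


for fraction in (0.15, 0.25):
    gained = runs_recovered(QUOTED_SEASONS, fraction)
    print(f"{fraction:.0%} discount: {gained:+.1f} runs over three seasons")
```

Even limited to those three seasons, the discount moves his ledger by double-digit runs, which is why a modest amount of doubt about the defensive numbers changes the shape of the case.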

Andruw Jones makes the catch Photo by Jonathan Daniel/Getty Images

That sounds easy enough, and in Sheffield’s extreme case, I’m personally ready to go there. Scroll through those same leaderboards, though, and you’ll see we are encountering a whole cohort of excellent players whose Cooperstown bids will rest – in one way or another – on how much weight we give to their exceptionally good or exceptionally bad defense or baserunning. Soon enough we will be considering exceptional pitch framers, or game callers, and their unknowingly terrible contemporaries.

If the voters take the steroid issue off the front burner, defense would be the only argument against Manny Ramirez. Vladimir Guerrero was a comically bad baserunner, by the metrics. The Edgar Martinez conversation will reach its peak a few short years before David Ortiz reaches the ballot. And on and on.

We can’t even see the edges of this puzzle we’re trying to solve, no matter how close it feels to completion. Consider this: Sheffield’s blip of defensive competence (barely below average!) occurred in 2002 and 2003, when he joined the Braves, and when UZR debuted. On that team, he played right field beside Andruw Jones – who was coming off a string of prime seasons in center that, today, help him stand as the 8th most productive defender ever. Jones posted still-excellent but slightly lower defensive ratings for the two years he played with Sheffield.

When Sheffield hopped to the Yankees in 2004, his numbers tanked alongside his new teammate, an aging Bernie Williams.

It’s possible we should just see these players as separate pieces of a puzzle that happened to fall together. It’s also possible they are intertwined in some way we don’t fully understand. It’s certain that we will make judgments about them, though, as Sheffield and Jones jockey for plaques and eternal recognition. And we owe it to our own interested minds, if not to them, to try and understand.

. . .

Zach Crizer is a featured writer at Beyond the Box Score. You can follow him on Twitter at @zcrizer.