

WAR: Pros, Cons, and Alternatives

Do you think WAR is flawed? Are you tired of the same-old context-independent player contribution metrics? Do you want to support a local sabermetric business rather than evil giant corporations like FanGraphs and Baseball-Reference? Then try mhWAR! (Patent pending)

Jesse Johnson-US PRESSWIRE

I love WAR. WAR is amazing. Honestly, if you don't think WAR is amazing, I don't think you're looking hard enough. Yes, it has flaws. Yes, it is often inaccurate and misleading. But we should not let that prevent us from seeing WAR for what it really is: a great tool that allows us both to compare players across teams, leagues, and eras and to gauge their value to their team.

I like to think of WAR like weather forecasts. It would be really easy to completely dismiss weather forecasts for their inaccuracy. Because they are wrong. A lot. In fact, they are almost always at least slightly wrong, and often they are extremely wrong! This is frustrating, and I can see why it would make people completely distrust forecasts.

And yet, where would we be without weather forecasts? We would have no idea what the weather will be like in 12 hours, let alone a week. We would be clueless.

Does that mean that weather forecasts can't be improved? Of course not. They are inaccurate and meteorologists should always be working to make them as accurate as possible. In the same way, sabermetricians should always be working to improve metrics like WAR to make them as accurate as possible. I think that this happens, though it may not happen enough. But that's beside the point.

Another issue with WAR lies not with those who created and maintain it, but with those who use it. Jon Heyman, for example, has recently been tweeting out some "WAR mysteries of the week", pointing out some seemingly problematic WAR comparisons. And he's right that some of them are problematic!

Yet the problems that Heyman has with WAR are similar to the problems that some people might have with weather forecasts. He's taking an amazing metric, one that does far more than any other single metric can do, and complaining about specific inaccuracies. That's fine, just as we have the right to complain about a forecast of 80 and sunny when it snows instead, as long as we acknowledge that, as a whole, WAR and weather forecasts both do a pretty damn good job.

But I'm getting away from the topic that I really want to talk about. Unlike weather forecasts, we have the tools to fix, or at least to change, the way that we use WAR. We know the components that go into WAR, and we can adjust them and ignore them as we see fit.

Let me explain. Generally, at this point in the season, I don't use WAR. I'll glance at it for fun, but if I want to compare the performance of the top players, I don't look at WAR. I make my own WAR. I make my own WAR out of the components that we already have publicly available to us.

Note: As will become obvious, I'm only looking at position player WAR. Pitching WAR is a whole different animal that I won't even pretend to touch here.

My Version of WAR

First of all, here are the basic components of WAR, specifically the one that FanGraphs uses:

Batting Runs + Baserunning Runs + Replacement Level Runs + Positional Adjustment + Fielding Runs

There are two core things that I do with my WAR. First, I include context. As it stands now, every version of WAR is context-independent. It looks at the outcome of every play, but doesn't consider the situation from which that outcome came about. I think that context is important, not for evaluating a player's true talent, but for evaluating a player's contribution to the team.

For that reason, I remove Batting Runs and replace it with RE24. I've talked a lot about RE24 here and at The Hardball Times, and I haven't been shy about expressing my love for the metric. It's beautifully simple, yet extremely descriptive, and it answers a question that we all want answered: how many runs did the player produce relative to what an average player would be expected to produce?

Or, more specifically, what was the difference between how many runs the team was expected to score before the play and how many it was expected to score after the play, plus any runs that scored on the play itself? That difference, summed up over every plate appearance, is RE24. It's fantastic, and I use it instead of context-independent Batting Runs.
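As a rough sketch of that bookkeeping, the snippet below sums the change in run expectancy, counting any runs that score on the play, over a handful of plate appearances. The run-expectancy values and the three sample plays are made up for illustration and do not come from any real table, and the names `RUN_EXPECTANCY` and `re24` are my own.

```python
# Hypothetical run-expectancy values keyed by (outs, bases occupied).
# Real tables cover all 24 base-out states; these numbers are illustrative only.
RUN_EXPECTANCY = {
    (0, "empty"): 0.50,
    (0, "1st"): 0.90,
    (1, "empty"): 0.27,
    (1, "1st"): 0.53,
    (2, "empty"): 0.10,
}

def re24(plays):
    """Sum (RE after - RE before + runs scored) over a list of plate appearances."""
    total = 0.0
    for before, after, runs_scored in plays:
        # A play that ends the inning leaves no state, so its run expectancy is 0.
        re_after = RUN_EXPECTANCY[after] if after is not None else 0.0
        total += re_after - RUN_EXPECTANCY[before] + runs_scored
    return total

sample_plays = [
    ((0, "empty"), (0, "1st"), 0),   # leadoff single
    ((1, "1st"), (1, "empty"), 2),   # two-run homer with one out
    ((2, "empty"), None, 0),         # third out of the inning
]
print(round(re24(sample_plays), 2))
```

The homer earns far more credit than the single because it both scores runs and is measured against the situation it came in, which is the context that Batting Runs leaves out.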

Second, I remove fielding (UZR, DRS, FRAA, etc.). Especially at this point in the season, defensive metrics are very unstable. They're not useless, and they tell us more than nothing, but they can be very deceptive, and I prefer to just leave them out of my equation.

After those two changes, here are my new components:

RE24 + Baserunning Runs + Replacement Level Runs + Positional Adjustment

Before I move on, note that I did not say that I ignore fielding. I remove it from my equation, but I do not ignore it. Fielding is important - very important - and to ignore it would be to ignore a crucial part of baseball. But instead of including the defensive metrics in the equation, I calculate everyone's mhWAR (my own version of WAR), and then consider how fielding could affect these numbers.

To explain this further, I'm going to give you my top 20 leaderboard based on my WAR alternative, and explain my thought process for how to include defense.

Name                RE24   BsR  Rep   Pos  mhRAR  mhWAR
Miguel Cabrera     34.19   0.6  6.2   0.7  41.69    4.4
Shin-Soo Choo      23.00   1.4  6.4   0.7  31.50    3.3
Mike Trout         19.84   2.8  6.4  -0.6  28.44    3.0
Chris Davis        25.54  -0.3  5.6  -3.5  27.34    2.9
Joey Votto         21.22   1.1  6.5  -3.6  25.22    2.7
Carlos Santana     19.34  -1.0  5.1   1.1  24.54    2.6
Evan Longoria      17.78  -0.6  5.8   0.3  23.28    2.5
David Wright       12.37   4.1  5.4   0.6  22.47    2.4
Alex Gordon        17.05   1.3  5.8  -2.0  22.15    2.3
Paul Goldschmidt   18.72   0.8  5.9  -3.6  21.82    2.3
Jean Segura        13.09   1.1  5.5   1.9  21.59    2.3
Robinson Cano      14.51   0.2  5.9   0.6  21.21    2.2
Andrew McCutchen   11.85   2.8  5.6   0.7  20.95    2.2
Starling Marte     13.64   3.1  6.0  -1.9  20.84    2.2
Troy Tulowitzki    15.08  -1.1  4.8   1.7  20.48    2.2
Brandon Phillips   15.07  -1.3  6.0   0.7  20.47    2.2
Justin Upton       14.31   2.3  5.7  -2.0  20.31    2.1
Josh Donaldson     13.83  -0.2  5.7   0.7  20.03    2.1
Prince Fielder     19.96  -2.7  6.1  -3.5  19.86    2.1
Buster Posey       12.13  -1.1  5.2   2.5  18.73    2.0

To translate runs to wins, I just divided by 9.5, as FanGraphs does to translate RAR to WAR.
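If you want to reproduce the last two columns yourself, a minimal sketch might look like the following. The function name `mh_war` is my own invention, and the flat 9.5 runs-per-win divisor is simply the figure used in this article.

```python
RUNS_PER_WIN = 9.5  # the runs-to-wins divisor used in this article

def mh_war(re24, bsr, rep, pos, runs_per_win=RUNS_PER_WIN):
    """Return (mhRAR, mhWAR) from the four offense-side components."""
    mh_rar = re24 + bsr + rep + pos
    return mh_rar, mh_rar / runs_per_win

# Two rows copied from the leaderboard: RE24, BsR, Rep, Pos.
for name, *components in [
    ("Miguel Cabrera", 34.19, 0.6, 6.2, 0.7),
    ("Mike Trout", 19.84, 2.8, 6.4, -0.6),
]:
    rar, war = mh_war(*components)
    print(f"{name}: mhRAR {rar:.2f}, mhWAR {war:.1f}")
```

Swapping in a different component, or a different runs-per-win value, is a one-line change.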

As you can see, when we use RE24, Miguel Cabrera blows everyone else out of the water. He has been an absolutely ridiculous hitter this year, and even more ridiculous in the clutch.

But this table doesn't include fielding, as I explained above. Instead of just adding a defensive metric to my WAR, I consider the differences between the players and what sort of fielding value would be needed to make up those differences. To do so, I do, of course, look at defensive metrics like UZR, DRS, and FRAA to get a sense of each player's defense, but I also adjust with my own perception of their defensive value. For reference, here is what the defensive metrics say about the above players:

Name                 UZR  DRS  FRAA    Avg
Miguel Cabrera        -6   -7  -3.3  -5.43
Shin-Soo Choo       -0.8   -9  -3.2  -4.33
Mike Trout           0.7   -6  -0.3  -1.87
Chris Davis          3.8   -3  -1.2  -0.13
Joey Votto          -8.5    5   4.9   0.47
Carlos Santana       1.7   -8  -0.6  -2.30
Evan Longoria          4    2   1.9   2.63
David Wright         0.4    1   1.9   1.10
Alex Gordon         -2.3    4  -0.4   0.43
Paul Goldschmidt     1.7    7  -2.7   2.00
Jean Segura          2.5    1   2.9   2.13
Robinson Cano       -1.7   -3   5.1   0.13
Andrew McCutchen     0.4    2     0   0.80
Starling Marte      -0.8    8     0   2.40
Troy Tulowitzki      2.6    5   4.2   3.93
Brandon Phillips     0.7    3     0   1.23
Justin Upton        -5.7    1   1.1  -1.20
Josh Donaldson       0.6    0   2.1   0.90
Prince Fielder       6.4   -4     2   1.47
Buster Posey         2.3   -1   0.2   0.50

*Keep in mind that for catchers, the calculations are much different in each of these metrics.

**I just realized that these numbers are completely wrong. Ignore them until I have a chance to create a new table. Sorry!

Looking at the two tables, it seems pretty likely that no one has enough defensive value to catch Cabrera. Choo would need to be about 10 runs better on defense, and Trout about 13 runs better, to make up for Cabrera's lead on offense. Choo clearly has not performed well in center, so he is out, but is it possible that Trout has been more than 13 runs better on defense? Sure! If Cabrera has been worth, say, -7 defensive runs and Trout has been worth +7, Trout would be more valuable! Is this likely? No. But it's possible, and we need to consider that possibility.
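The arithmetic behind those gaps is easy to check. Here is a small sketch, with helper names (`fielding_gap_needed`, `total_runs`) that are mine rather than anything standard, using the mhRAR totals from the leaderboard:

```python
# Offense-only run totals (mhRAR) from the leaderboard.
MH_RAR = {"Miguel Cabrera": 41.69, "Shin-Soo Choo": 31.50, "Mike Trout": 28.44}

def fielding_gap_needed(leader, challenger, mh_rar=MH_RAR):
    """Runs by which the challenger's fielding must beat the leader's to pull even."""
    return mh_rar[leader] - mh_rar[challenger]

def total_runs(player, fielding_runs, mh_rar=MH_RAR):
    """Offense-only runs plus an assumed fielding value."""
    return mh_rar[player] + fielding_runs

print(round(fielding_gap_needed("Miguel Cabrera", "Shin-Soo Choo"), 2))  # 10.19
print(round(fielding_gap_needed("Miguel Cabrera", "Mike Trout"), 2))     # 13.25

# The what-if from the text: Cabrera at -7 fielding runs, Trout at +7.
print(total_runs("Mike Trout", 7) > total_runs("Miguel Cabrera", -7))    # True
```

The fielding values passed to `total_runs` are whatever you believe about each player's defense, which is exactly the judgment call the leaderboard leaves open.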

And this is one of the main points I want you to draw from this surprisingly lengthy article: WAR is not precise. WAR is probably best represented as a range, or at the very least a rough estimate, of a player's contribution. Using my calculations, Cabrera is far ahead of the field when you don't include defense, but your final evaluation completely depends on how you view the players' defensive contributions. You should use the metrics we have available, but you should also use whatever other resources you have.

If you get anything out of this piece, I want it to be these points:

1. WAR is extremely useful, but also an estimate, and often an inaccurate estimate.

2. If you have issues with WAR, or you think it should be calculated differently, do something about it! We have all the data at our fingertips, and most sites have all the components of WAR clearly listed out. If you don't like the defensive aspect of it, just take it out! If you want to include context, use RE24 or WPA or whatever else suits your mood!

3. Don't view WAR as a set-in-stone number, but as one value in a range of possible values. If we always keep in mind, especially at this point in the season, that there are many things we don't know about a player's contribution to his team, then we can have a more realistic view of WAR, and use it much more to our advantage.
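One way to put the "range, not a number" idea into practice is to carry a low and high fielding estimate through the calculation. This is only a sketch: the function name `war_range` is hypothetical, and the fielding span of -6 to +7 runs for Trout is purely illustrative, not a measured value.

```python
def war_range(offense_runs, fielding_low, fielding_high, runs_per_win=9.5):
    """Return (low, high) WAR for a span of plausible fielding run values."""
    if fielding_low > fielding_high:
        raise ValueError("fielding_low must not exceed fielding_high")
    return (
        (offense_runs + fielding_low) / runs_per_win,
        (offense_runs + fielding_high) / runs_per_win,
    )

# Trout's 28.44 offense-only runs with an assumed fielding span of -6 to +7 runs.
low, high = war_range(28.44, -6, 7)
print(f"{low:.1f} to {high:.1f}")
```

How wide you make the span is up to you, and it should probably be wider early in the season when the defensive numbers have had less time to settle.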