Are fWAR and rWAR on Different Scales?

Twice while working on last week's articles about double plays, I came across large differences in career WAR for some players, depending on whether you use Fangraphs' WAR (fWAR) or Rally's WAR (rWAR, which is used at Baseball-Reference.com). In the first example, Jim Rice had 41.5 rWAR but 56.1 fWAR. That's a pretty big difference. The second example was bigger still: Brooks Robinson scored an impressive 69.1 rWAR but boasted a staggering 94.6 fWAR.

This baffled me. I'm well aware of the differences in the pitching inputs used for each WAR metric. The two also use different defensive metrics for hitters. But for older players, they both use the same defensive metric (Rally's Total Zone, since UZR was not yet available). I had been under the impression that both systems calculate offensive production similarly. Apparently I was wrong. Very wrong.

One early observation I made was that Fangraphs' WAR tended to be higher. Today's graph shows how many more position players have 30+ career WAR in the Fangraphs system than in Rally's.

[Graph: number of position players with 30+ career WAR, rWAR vs. fWAR]

Yikes. This has me wondering if fWAR is on a completely different scale than rWAR. Does Fangraphs think offense is simply worth more runs than Rally does? Is Fangraphs merely top-heavy and this all evens out in the end? I don't have answers to these questions (yet), but I did take a look at which players were most affected by this difference.

I pulled the Top 200 position players by career rWAR and compared that with their fWAR. First, I'll start with the shorter list—those who actually have a lower fWAR than rWAR:

Player              rWAR   fWAR   Diff
Cap Anson           99.5   88.7  -10.8
Bid McPhee          57.9   51.4   -6.5
Johnny Damon        48.3   41.8   -6.5
Sam Thompson        46.7   41.2   -5.5
Ichiro Suzuki       55.2   50.7   -4.5
George Davis        90.7   86.4   -4.3
Buck Ewing          51.8   47.5   -4.3
Dan Brouthers       83.7   80.1   -3.6
King Kelly          48.5   45.0   -3.5
Albert Pujols       83.8   80.6   -3.2
Will Clark          57.6   54.4   -3.2

Cap Anson takes a serious hit. Otherwise, it calms down pretty quickly. What strikes me is that Anson, McPhee, Thompson, Davis, Ewing, Brouthers, and Kelly all played most or all of their careers in the 1800s. I'm also seeing some active players (Damon, Suzuki, Pujols) and the somewhat recently active Will Clark. I'm not sure if there's anything to that.
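
For anyone who wants to reproduce the Diff column, it is just fWAR minus rWAR, split by sign and sorted. Here is a minimal Python sketch using four players already mentioned in this post; the helper names `war_diff` and `split_by_sign` are made up for illustration and are not part of either site's tooling.

```python
# (player, rWAR, fWAR) rows copied from the figures quoted in this post.
rows = [
    ("Cap Anson", 99.5, 88.7),
    ("Albert Pujols", 83.8, 80.6),
    ("Brooks Robinson", 69.1, 94.6),
    ("Jim Rice", 41.5, 56.1),
]


def war_diff(rwar, fwar):
    """Return fWAR minus rWAR, rounded to one decimal to hide float noise."""
    return round(fwar - rwar, 1)


def split_by_sign(rows):
    """Split players into (fWAR lower, fWAR higher), biggest gaps first."""
    lower, higher = [], []
    for player, rwar, fwar in rows:
        diff = war_diff(rwar, fwar)
        (lower if diff < 0 else higher).append((player, diff))
    lower.sort(key=lambda r: r[1])                 # most negative first
    higher.sort(key=lambda r: r[1], reverse=True)  # most positive first
    return lower, higher


lower, higher = split_by_sign(rows)
print(lower)   # players whose fWAR is below their rWAR
print(higher)  # players whose fWAR is above their rWAR
```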

Here's the meaty list, where fWAR is greater than rWAR. Robinson's 25.5-win (not run!) difference takes the cake:

Player              rWAR   fWAR   Diff
Brooks Robinson     69.1   94.6   25.5
Carl Yastrzemski    88.7  108.7   20.0
Jimmie Foxx         94.1  112.3   18.2
Harmon Killebrew    61.1   78.4   17.3
Tony Perez          50.5   67.8   17.3
Pete Rose           75.3   91.4   16.1
Max Carey           50.6   66.6   16.0
Luke Appling        69.3   84.7   15.4
Honus Wagner       134.5  149.8   15.3
Joe Torre           55.6   70.8   15.2
Sherry Magee        59.1   74.1   15.0
Al Simmons          63.6   78.5   14.9
Ted Williams       125.3  139.8   14.5
Lou Boudreau        56.0   69.8   13.8
Luis Aparicio       49.9   63.6   13.7
Willie Stargell     57.5   70.9   13.4
Bob Johnson         53.2   66.4   13.2
Jimmy Sheckard      51.8   65.0   13.2
Bobby Doerr         47.7   60.9   13.2
Joe Tinker          49.2   62.2   13.0
Joe Cronin          62.5   75.4   12.9
Ron Santo           66.4   79.3   12.9
Billy Williams      57.2   69.7   12.5
George Sisler       50.4   62.8   12.4
Joe Gordon          54.9   67.2   12.3
Zack Wheat          57.8   70.0   12.2
Eddie Murray        66.7   78.8   12.1
Norm Cash           52.9   64.8   11.9
Stan Musial        127.8  139.3   11.5
Orlando Cepeda      46.8   58.3   11.5
Hank Greenberg      56.8   68.2   11.4
Al Kaline           91.0  101.9   10.9
Fred McGriff        50.5   61.3   10.8
Ted Simmons         50.4   61.1   10.7
Willie McCovey      65.1   75.7   10.6
Andruw Jones        59.9   70.5   10.6
Darrell Evans       57.3   67.8   10.5
Johnny Bench        71.3   81.5   10.2
Graig Nettles       61.6   71.8   10.2

Some of these differences are just enormous. This is only the list of players with a ten-win difference. A total of 110 of the top 200 players by rWAR have a difference of five wins or more.
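
Counting how many players clear a given gap is a one-liner once the differences are in hand. The sketch below is only an illustration on six rows taken from the two tables, not the full Top 200, and it assumes that "difference" means the absolute gap in either direction. The function name `count_large_gaps` is hypothetical.

```python
# (player, rWAR, fWAR) for a handful of rows from the tables in this post.
sample = [
    ("Cap Anson", 99.5, 88.7),
    ("Bid McPhee", 57.9, 51.4),
    ("Ichiro Suzuki", 55.2, 50.7),
    ("Albert Pujols", 83.8, 80.6),
    ("Brooks Robinson", 69.1, 94.6),
    ("Graig Nettles", 61.6, 71.8),
]


def count_large_gaps(rows, threshold):
    """Count players whose |fWAR - rWAR| is at least `threshold` wins.

    Assumption: the gap is measured in absolute terms, so a player who
    loses wins under fWAR counts the same as one who gains them.
    """
    return sum(
        1 for _, rwar, fwar in rows
        if round(abs(fwar - rwar), 1) >= threshold
    )


print(count_large_gaps(sample, 5))   # gaps of five wins or more
print(count_large_gaps(sample, 10))  # gaps of ten wins or more
```

Running the same function over all 200 rows would be the way to check the count quoted above.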

Often, we use these metrics interchangeably, and I've even read quite a few articles where they are averaged together. Should we be doing this? Are they on different scales? Are my personal rWAR classifications (70+ is a sure HOFer, 50-70 are interesting cases, and below 50 you need a damn good reason to be a HOFer) applicable to fWAR?
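
To see why the scale question matters for those classifications, here is a rough Python sketch that applies the same three tiers to both numbers for four players from the list above. The `hof_tier` function is hypothetical, and treating exactly 70 as "sure" and exactly 50 as "interesting" is my assumption about where the boundaries fall.

```python
def hof_tier(war):
    """Map a career WAR total onto the three rough Hall of Fame tiers.

    Assumption: 70.0 counts as the top tier and 50.0 as the middle tier.
    """
    if war >= 70:
        return "sure HOFer"
    if war >= 50:
        return "interesting case"
    return "needs a damn good reason"


# (player, rWAR, fWAR) copied from the table of fWAR > rWAR players.
players = [
    ("Brooks Robinson", 69.1, 94.6),
    ("Harmon Killebrew", 61.1, 78.4),
    ("Tony Perez", 50.5, 67.8),
    ("Bobby Doerr", 47.7, 60.9),
]

for name, rwar, fwar in players:
    print(f"{name}: {hof_tier(rwar)} by rWAR, {hof_tier(fwar)} by fWAR")
```

Three of these four players land in a different tier depending on which WAR goes in, which is exactly the worry with reusing rWAR cutoffs on fWAR.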

How about the fact that we tend to use rWAR when discussing Hall of Fame cases but fWAR when talking about the current season? If these produce very different results over time, at what point does that become a problem?

I don't have the answers to these questions yet, but my interest has been piqued.

And, gosh—we haven't even looked at pitchers yet…