Does better data lead to better MVP selections?

With the explosion of data in the past 15-20 years, are MVP choices more aligned with the "best" player? The results may surprise you.

This is not how Clayton Kershaw celebrated winning both the Cy Young and MVP
Jeff Gross/Getty Images

In 2014, the Most Valuable Player Awards in both leagues went to the players with the best FanGraphs Wins Above Replacement (fWAR), something that hadn't happened since 2003, as this table shows:

Year  Lg  MVP                fWAR Rank  Lg  MVP               fWAR Rank
2014  AL  Mike Trout         1          NL  Clayton Kershaw   1
2013  AL  Miguel Cabrera     3          NL  Andrew McCutchen  1
2012  AL  Miguel Cabrera     4          NL  Buster Posey      1
2011  AL  Justin Verlander   5          NL  Ryan Braun        3
2010  AL  Josh Hamilton      1          NL  Joey Votto        2
2009  AL  Joe Mauer          4          NL  Albert Pujols     1
2008  AL  Dustin Pedroia     6          NL  Albert Pujols     1
2007  AL  Alex Rodriguez     1          NL  Jimmy Rollins     7
2006  AL  Justin Morneau     39         NL  Ryan Howard       8
2005  AL  Alex Rodriguez     1          NL  Albert Pujols     2
2004  AL  Vladimir Guerrero  6          NL  Barry Bonds       1
2003  AL  Alex Rodriguez     1          NL  Barry Bonds       1
2002  AL  Miguel Tejada      26         NL  Barry Bonds       1
2001  AL  Ichiro Suzuki      6          NL  Barry Bonds       1
2000  AL  Jason Giambi       5          NL  Jeff Kent         5

Generally speaking, I won't complain if a player with a top 5 fWAR wins the MVP, because there's often no meaningful separation between those players, and other non-numerical influences can (and should) play a role. Given sites like FanGraphs, Baseball-Reference and Baseball Prospectus, and the ease with which players can now be compared on an objective basis, I wondered whether voters make better selections when they have more information.

To test this, I took the top 10 in the MVP vote for each league and added up their fWAR ranks. I wanted to know if the sum of the fWAR ranks for the top MVP candidates was lower in recent years than in prior decades, which would suggest that players with higher fWAR were receiving the votes. I went back to the 1930s (1931, specifically), because the MVP awards given before 1931 carried several conditions that didn't necessarily allow the best player to win.
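The approach described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the spreadsheet's actual formulas: the record layout, the function name and the sample rows are all hypothetical, and the sample values are placeholders rather than real voting results.

```python
from collections import defaultdict

# Hypothetical records: (year, league, MVP vote finish, fWAR rank in league).
# These values are placeholders for illustration, not real voting results.
SAMPLE = [
    (1985, "AL", 1, 4), (1985, "AL", 2, 1), (1985, "AL", 3, 30),
    (2005, "AL", 1, 1), (2005, "AL", 2, 6), (2005, "AL", 3, 11),
    (2005, "AL", 11, 2),  # finished outside the top 10 in voting, so ignored
]


def rank_summary_by_decade(records, top_n=10):
    """Return {(league, decade): (sum of fWAR ranks, average fWAR rank)}
    for players who finished in the top `top_n` of the MVP vote."""
    ranks = defaultdict(list)
    for year, league, vote_finish, fwar_rank in records:
        if vote_finish <= top_n:
            decade = year // 10 * 10  # 1985 -> 1980, 2005 -> 2000
            ranks[(league, decade)].append(fwar_rank)
    return {key: (sum(vals), sum(vals) / len(vals)) for key, vals in ranks.items()}


summary = rank_summary_by_decade(SAMPLE)
print(summary[("AL", 2000)])  # (18, 6.0)
```

A lower sum or average for a decade means the voters' top 10 sat closer to the top of the fWAR leaderboard.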

I used some simple statistical tests, and the results (and much more) can be viewed in this Google Docs spreadsheet for those interested. The table below shows the average fWAR rank for the players in the top 10 in MVP voting by decade:

League 2000s 1990s 1980s 1970s 1960s 1950s 1940s 1930s
American 18.5 20.6 25.9 25.9 17.1 17.5 13.1 12.6
National 17.0 17.0 24.5 20.2 17.6 13.8 13.9 17.7
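One crude way to read the decade table is to average the four most recent columns and the four oldest. The sketch below does only that, using the figures from the table. It is not one of the statistical tests in the spreadsheet, it weights every decade equally, and the function name and the four-and-four split are my own assumptions.

```python
from statistics import mean

# Average fWAR rank of the top 10 MVP vote-getters, copied from the table above.
DECADES = ["2000s", "1990s", "1980s", "1970s", "1960s", "1950s", "1940s", "1930s"]
AVG_RANK = {
    "American": [18.5, 20.6, 25.9, 25.9, 17.1, 17.5, 13.1, 12.6],
    "National": [17.0, 17.0, 24.5, 20.2, 17.6, 13.8, 13.9, 17.7],
}


def split_means(values, n_recent=4):
    """Unweighted mean of the first `n_recent` columns and of the rest."""
    return mean(values[:n_recent]), mean(values[n_recent:])


for league, values in AVG_RANK.items():
    recent, older = split_means(values)
    print(f"{league}: 1970s-2000s {recent:.1f}, 1930s-1960s {older:.1f}")
```

The older decades come out no higher than the recent ones in either league (roughly 15.1 versus 22.7 in the AL and 15.75 versus 19.7 in the NL).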

I was extremely surprised to find that there was little difference. Generally speaking, the voters from the days of old were just as adept at selecting the "right" player as today's voters, and they did it without the explosion of data we've seen. Granted, they had access to the same basic stats we have today, but they didn't have the technology or ability to see much beyond the teams they covered and those teams' opponents, and they had hardly any opportunity to see players from the other league.

There are numerous reasons why the player with the highest fWAR doesn't receive the MVP, several of which I happen to agree with. Sometimes that player is a pitcher, and pitchers have their own award. Unless a pitcher is overwhelmingly dominant or, as in Clayton Kershaw's case this year, no position player dominates, chances are he won't win the MVP. In addition, some bias toward players on playoff teams might occur. I'm sympathetic to both of these sentiments, which is why I had no problem with Miguel Cabrera winning the AL MVP in 2012 and 2013 even though Mike Trout had a better fWAR: in both years, Cabrera's team made the playoffs and Trout's didn't. Having written that, I'm still surprised Cabrera won in 2013, given that he was hurt in September and Trout had a very strong final month, which is another thing that can affect the voting.

There were exceptions in the 1970s and 1980s. The introduction of the closer and the designated hitter provided a novelty factor that received voter attention and likely contributed to the following players winning the MVP award:

Year Lg Player fWAR Rank
1984 AL Willie Hernandez 61
1981 AL Rollie Fingers 40
1979 AL Don Baylor 41

Conversely, in the 1930s and 1940s there were fewer players, and therefore fewer opportunities to make "bad" choices. It was almost impossible to vote for an MVP candidate whose fWAR rank was 100th or worse, because there were barely that many eligible players in the league.

There are shortcomings in this analysis, the primary one being my extreme unease in lumping relief pitchers in with other players. No baseball measure tells the entire story of a player's performance, but fWAR comes very close to giving an accurate portrayal of a player's contribution in all facets of the game. Relievers will not have as high an fWAR as position players, which is why the seasons in which a reliever won the MVP skewed the data (the same happened with Dennis Eckersley in 1992).

Generally speaking, the voters got it right, as this table of MVP winners grouped by fWAR rank shows:

fWAR Rank AL NL
1 31 27
2-5 26 33
6-10 14 14
11-20 3 7
21+ 10 4
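To put the counts above in percentage terms, here is a short sketch that uses only the numbers in the table. The dictionary layout and the function name are my own choices.

```python
# MVP winners by fWAR rank bucket: (AL count, NL count), copied from the table above.
MVP_BY_RANK = {
    "1": (31, 27),
    "2-5": (26, 33),
    "6-10": (14, 14),
    "11-20": (3, 7),
    "21+": (10, 4),
}


def top5_share(table, league_index):
    """Return (top-5 winners, all winners, share) for one league column."""
    total = sum(counts[league_index] for counts in table.values())
    top5 = table["1"][league_index] + table["2-5"][league_index]
    return top5, total, top5 / total


for name, idx in (("AL", 0), ("NL", 1)):
    top5, total, share = top5_share(MVP_BY_RANK, idx)
    print(f"{name}: {top5} of {total} MVPs ranked in the top 5 by fWAR ({share:.0%})")
# AL: 57 of 84 MVPs ranked in the top 5 by fWAR (68%)
# NL: 60 of 85 MVPs ranked in the top 5 by fWAR (71%)
```

In both leagues, roughly two-thirds or more of the MVP winners in the table ranked in the top 5 in fWAR.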

There will always be non-numerical issues such as market size, pitcher vs. position player, a reluctance to give the same great player the award year after year even when he deserves it, and a number of others. I'm not advocating changing the name to the #1 fWAR Award, but it's encouraging when the best players are acknowledged as such.

. . .

All data from FanGraphs and Baseball-Reference

Scott Lindholm lives in Davenport, IA. Follow him on Twitter @ScottLindholm.

To see this data in a different format, view these Tableau data vizzes. There are a total of five different sheets that each have a variety of ways to filter and view the data.