I submitted this piece to the BP Idol contest, but apparently they have already emailed the winners, and I wasn't one of the lucky ten. So I thought that I would post my piece here. [Sky: you're good enough for us! I really enjoyed reading this piece, and I respect how you solved the problem of BPro not making historical third-order wins available to the public.]
There has been an age-old argument about who should win the MVP each year. Some writers, like Sky, will simply endorse the best player by WPA or WAR, while other, generally more mainstream, writers will go to the "playoff argument". Simply put, they say something like "Ryan Howard should be the MVP, despite being about 2.5 times worse a player than Albert Pujols, because without him, the Phillies wouldn't have made the playoffs".
While that is true (by most estimates Howard was a little better than a 3 win player last year, and his team won the division by exactly 3 games), it is a very limited way to look at it. For example, you could say the same thing about Utley, Werth, Victorino, Hamels and maybe even Burrell last year, and those are just guys on the Phillies who meet that requirement. To differentiate between candidates, it is necessary to figure out to what extent those players contributed to their team making the playoffs.
So to go about doing this, we first have to figure out the chances of any given team making the playoffs. As you all know, to make the playoffs you either have to win more games than anyone in your division, or more games than any non division winner in the league. So obviously a team that wins 90 games will have a much greater chance of making the playoffs than a team that wins 75 games.
Running a logistic regression on the win totals of each team over the past 11 seasons (since the Rays have been in the league) allows us to estimate the probabilities of that 90 win team and that 75 win team making the playoffs in the AL and the NL:
A logistic regression basically works by inputting a set of numbers (wins) and flagging which of them met a certain outcome (a playoff berth). Then, by using a program to model the data, we get a curve showing, for each win total in the range, the probability of reaching that outcome.
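For anyone who wants to try this at home, here is a minimal sketch of that kind of fit in plain Python. It is not the program I used, and the win totals and playoff flags in it are made-up placeholders rather than real team seasons; the function name `fit_logistic` and its learning-rate settings are likewise just illustrative choices.

```python
import math

def fit_logistic(wins, made_playoffs, lr=0.1, steps=5000):
    """Fit P(playoffs) = 1 / (1 + exp(-(a + b * z))) by gradient ascent,
    where z is the win total centered and scaled for numerical stability."""
    n = len(wins)
    mean = sum(wins) / n
    scale = (sum((w - mean) ** 2 for w in wins) / n) ** 0.5
    z = [(w - mean) / scale for w in wins]
    a = b = 0.0
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for zi, yi in zip(z, made_playoffs):
            p = 1.0 / (1.0 + math.exp(-(a + b * zi)))
            grad_a += yi - p          # gradient of the log-likelihood w.r.t. a
            grad_b += (yi - p) * zi   # gradient of the log-likelihood w.r.t. b
        a += lr * grad_a / n
        b += lr * grad_b / n

    def predict(w):
        """Estimated playoff probability for a team with w wins."""
        return 1.0 / (1.0 + math.exp(-(a + b * (w - mean) / scale)))

    return predict

# Made-up example data: one entry per team season (1 = made the playoffs).
toy_wins = [66, 72, 75, 79, 81, 84, 86, 88, 89, 90, 92, 95, 97, 100]
toy_made = [0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 1, 1]

predict = fit_logistic(toy_wins, toy_made)
for w in (75, 90, 100):
    print(w, round(predict(w), 3))
```

With real data you would feed in one row per team season for a single league, since the AL and NL get separate curves.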
As you can see, it is much harder to make the playoffs in the AL than in the NL. I find it interesting, given the fact that there are two more teams in the NL. However, when you think about it, it isn't that surprising. The AL has the DH, the Yankees and the 116 win Mariners team in 2001 (that high win total might be skewing the regression a bit). The NL, on the other hand, features the NL West which was won by the Dodgers last year with just 84 wins.
So a team that wins 90 games will make the playoffs 46.2% of the time in the AL and 56.2% of the time in the NL, whereas a team that wins 75 games will never make the playoffs in the AL and has only a .4% chance of making it in the NL. So if you took 1 win away from the 75 win team, their playoff odds would decrease by only .1% in the NL and wouldn't decrease at all in the AL, as they were already at 0%. However, if you took that 1 win away from the 90 win team, their playoff probability would decrease by 9.6% in the NL and 15% in the AL. That is a substantial difference in the value of a win to those two ball clubs.
While that is pretty interesting, it still doesn't help us value players' contributions that much. For one, real wins can't be measured by WAR, WARP, VORP, or any other acronym that I know of. Those kinds of stats only measure how many context neutral runs a player adds to his team, and then convert those runs to wins based on the historical value of runs to a team's win total (around 10 runs per win). So in other words, they measure theoretical runs and then convert those theoretical runs into theoretical wins. When a player provides a certain amount of WAR to his team, that simply raises his team's true talent level so that they can be expected to win a certain number of games. However, there are many variables that allow a team to under or over perform their true talent level:
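To see why a win is worth so much more to the 90 win team, here is a rough sketch of the "take one win away" calculation on a logistic curve. The `midpoint` and `slope` values are placeholders I picked for illustration, not the coefficients from my regressions, so the printed numbers will not match the percentages above.

```python
import math

def playoff_prob(wins, midpoint=90.0, slope=0.5):
    """Placeholder logistic curve: midpoint and slope are assumed values,
    not fitted coefficients."""
    return 1.0 / (1.0 + math.exp(-slope * (wins - midpoint)))

def value_of_last_win(wins, prob=playoff_prob):
    """Drop in playoff probability if the team had one fewer win."""
    return prob(wins) - prob(wins - 1)

for w in (75, 90):
    print(f"{w} wins: last win worth {value_of_last_win(w):.2%}")
```

The curve is nearly flat out at 75 wins and steepest around the middle, which is the whole reason one win moves the contender's odds so much and the also-ran's odds hardly at all.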
* Luck
* Run distribution
* Clutchness
* Strength of schedule
* Brad Lidge
Fortunately, Baseball Prospectus offers 3rd Order Wins, which take into account the context neutral performance of teams via EqA (which is what is used for hitter WARP) and strength of schedule. That effectively takes out most of the variables in real wins, and it allows us to make a direct comparison between a team's true talent level and context neutral player value. To make use of that, I had to find out the probability of a team making the playoffs with a given number of 3rd Order Wins. So I used the Internet Archive's Wayback Machine to look at BP's adjusted wins for each team over the past three seasons. I then ran a logistic regression using 3rd Order Wins instead of real wins, which produced a broader curve in both leagues:
So, while that 75 win team will rarely ever make the playoffs in either league, a team with a true talent level of 75 wins will sneak in around 5% of the time in either league. In fact, the 2006 World Champion Cardinals had only 75.8 3rd Order Wins.
So now that we have an accurate depiction of a team's playoff chances given their true talent level, we can measure a player's contribution to his team's chances of making the playoffs. Using FanGraphs' WAR to value players, I created two spreadsheets showing every single player's playoff probability added (I disregarded pitcher hitting).
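The calculation behind playoff probability added boils down to comparing the team's playoff odds with and without the player's WAR. Here is a hedged sketch of it; the curve coefficients, the function names, and the three example players are all hypothetical stand-ins, not values from my spreadsheets.

```python
import math

def playoff_prob(third_order_wins, midpoint=88.0, slope=0.25):
    """Placeholder logistic curve over 3rd Order Wins; midpoint and slope
    are assumed for illustration, not the fitted league values."""
    return 1.0 / (1.0 + math.exp(-slope * (third_order_wins - midpoint)))

def playoff_prob_added(team_wins, player_war, prob=playoff_prob):
    """Team playoff odds with the player minus the odds without his WAR."""
    return prob(team_wins) - prob(team_wins - player_war)

# Hypothetical players: (label, team 3rd Order Wins, player WAR).
players = [
    ("Star on a contender", 89.0, 6.0),
    ("Star on a cellar dweller", 70.0, 6.0),
    ("Below-replacement bench bat", 89.0, -1.5),
]

ranked = sorted(
    ((name, playoff_prob_added(wins, war)) for name, wins, war in players),
    key=lambda row: row[1],
    reverse=True,
)
for name, ppa in ranked:
    print(f"{name}: {ppa:+.2%}")
```

Note that the same 6 WAR is worth far more on the contender than on the last place team, and that a negative WAR player on a contender comes out with a negative playoff probability added.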
Here are the top ten in each league:
NL:
1. David Wright (41.84%)
2. Chase Utley (41.11%)
3. Carlos Beltran (38.86%)
4. Jose Reyes (35.08%)
5. Albert Pujols (34.00%)
6. Jimmy Rollins (30.60%)
7. Jayson Werth (30.60%)
8. Derek Lowe (30.53%)
9. Johan Santana (29.30%)
10. Russell Martin (28.19%)
AL:
1. Roy Halladay (41.44%)
2. Alex Rodriguez (33.60%)
3. A.J. Burnett (33.16%)
4. Alex Rios (32.17%)
5. Mike Mussina (30.42%)
6. Andy Pettitte (25.94%)
7. Evan Longoria (25.22%)
8. John Danks (22.99%)
9. B.J. Upton (22.86%)
10. Derek Jeter (22.18%)
As you can see, a lot of the people who supposedly vaulted their clubs into the playoffs (CC Sabathia, Manny Ramirez, Ryan Howard) aren't even in the top ten in that category. CC was #15 (which is really incredible when you consider that he was only in the league for a couple of months); however, you have to scroll all the way down to #25 to find Manny and #27 to find Howard.
Pujols was obviously a defensible choice. He wasn't blessed by being on one of those sweet 89 win teams, on which each win is incredibly important. However, simply by his sheer dominance of the league, he was able to crack the top 5.
Pedroia, on the other hand, might not have been a great choice. He had a very good year (although probably not the best in the AL), but his team had a true talent level of over 102 wins, meaning that they would make the playoffs nearly 95% of the time. If you took Pedroia off of the Red Sox, they would still have had over a 78% chance of making the playoffs, so he really wasn't that important to the team last year.
Roy Halladay blew away the competition in the AL. Not only was he the best player by WAR, but he also contributed the most to his team's chances of making the playoffs. He might not have deserved the Cy Young due to Cliff Lee's dominance, but he definitely deserved some MVP votes.
The most unfortunate player last year was Nick Markakis, who racked up a 6.2 WAR season. However, given that the Orioles had virtually no shot of making the playoffs, he just wasn't that valuable in comparison to some lesser players on better teams.
The least valuable player was Mark Sweeney of the Dodgers. He was "worth" -1.5 WAR, despite only having 104 plate appearances. His playoff probability added was -7.92%, meaning that if he hadn't been on the team, they would have had nearly an 8% better chance of making the playoffs.
You can access the full spreadsheets for each league here: