Rating Players Using Playoff Probability Added
I submitted this piece to the BP Idol contest, but apparently they have already emailed the winners, and I wasn't one of the lucky ten. So I thought that I would post my piece here. [Sky: you're good enough for us! I really enjoyed reading this piece, and I respect how you solved the problem of BPro not making historical third-order wins available to the public.]
There has been an age-old argument about who should win the MVP each year. Some writers, like Sky, will simply endorse the best player by WPA or WAR, while other, generally more mainstream, writers will go to the "playoff argument". Simply put, they say something like "Ryan Howard should be the MVP, despite being about 2.5 times worse of a player than Albert Pujols, because without him, the Phillies wouldn't have made the playoffs".
While that is true (by most estimates Howard was a little better than a 3 win player last year, and his team won the division by exactly 3 games), it is a very limited way to look at it. For example, you could say the same thing about Utley, Werth, Victorino, Hamels and maybe even Burrell last year, and those are just guys on the Phillies who meet that requirement. To differentiate between candidates, it is necessary to figure out to what extent those players contributed to their team making the playoffs.
So to go about doing this, we first have to figure out the chances of any given team making the playoffs. As you all know, to make the playoffs you either have to win more games than anyone in your division, or more games than any non division winner in the league. So obviously a team that wins 90 games will have a much greater chance of making the playoffs than a team that wins 75 games.
Creating a logistic regression using the win totals of each team in the past 11 seasons (since the Rays have been in the league) allows us to estimate the probabilities of that 90-win team and that 75-win team making the playoffs in the AL and the NL:

A logistic regression basically works by inputting a set of numbers (wins) and showing which of them met a certain outcome (a playoff berth). Then, by using a program to model the data, we can see a curve showing, for each value in that range, the probability that it reaches the specific outcome.
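A minimal sketch of what fitting such a curve involves, in Python. The rows below are invented for illustration and are not the seasons used in this article. The centering constants and the plain gradient-ascent fit are also assumptions, since the article does not say which program or fitting method was used.

```python
import math

# Hypothetical (wins, made_playoffs) rows, invented purely for illustration.
SEASONS = [(70, 0), (75, 0), (80, 0), (84, 1), (86, 0), (88, 0),
           (89, 1), (90, 0), (92, 1), (95, 1), (100, 1)]

CENTER, SCALE = 85.0, 10.0  # rescale wins so the fit is numerically stable


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, steps=5000, rate=0.5):
    """Fit P(playoffs) = sigmoid(a + b * x) by gradient ascent on the log-likelihood."""
    a = b = 0.0
    n = len(rows)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for wins, made in rows:
            x = (wins - CENTER) / SCALE
            err = made - sigmoid(a + b * x)  # observed outcome minus predicted probability
            grad_a += err
            grad_b += err * x
        a += rate * grad_a / n
        b += rate * grad_b / n
    return a, b


def playoff_prob(wins, a, b):
    """Playoff probability for a win total, using coefficients from fit_logistic."""
    return sigmoid(a + b * (wins - CENTER) / SCALE)


a, b = fit_logistic(SEASONS)
for w in (75, 85, 90, 95):
    print(w, round(playoff_prob(w, a, b), 3))
```

With real data, each row would be one team-season, and the AL and NL would be fitted separately.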
As you can see, it is much harder to make the playoffs in the AL than in the NL. I find it interesting, given the fact that there are two more teams in the NL. However, when you think about it, it isn't that surprising. The AL has the DH, the Yankees and the 116 win Mariners team in 2001 (that high win total might be skewing the regression a bit). The NL, on the other hand, features the NL West which was won by the Dodgers last year with just 84 wins.
So a team that wins 90 games will make the playoffs 46.2% of the time in the AL and 56.2% of the time in the NL, whereas a team that wins 75 games will never make the playoffs in the AL and has only a 0.4% chance of making it in the NL. So if you took 1 win away from the 75-win team, its playoff odds would decrease by only 0.1% in the NL and wouldn't decrease at all in the AL, where they were already 0%. However, if you took that 1 win away from the 90-win team, its playoff probability would decrease by 9.6% in the NL and 15% in the AL. That is a substantial difference in the value of a win to those two ball clubs.
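To see where per-win numbers like these come from, one can back-solve a logistic curve from the two NL figures quoted above (0.4% at 75 wins, 56.2% at 90 wins). The coefficients this produces are an approximation reconstructed from those two quoted numbers, not the actual fitted values, and the function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def curve_through(w1, p1, w2, p2):
    """Intercept and slope of the logistic curve passing through two (wins, prob) points."""
    slope = (logit(p2) - logit(p1)) / (w2 - w1)
    intercept = logit(p1) - slope * w1
    return intercept, slope


def playoff_prob(wins, intercept, slope):
    return 1.0 / (1.0 + math.exp(-(intercept + slope * wins)))


# Assumption: NL curve reconstructed from the two probabilities quoted in the text.
NL = curve_through(75, 0.004, 90, 0.562)


def value_of_last_win(wins, curve=NL):
    """Drop in playoff probability if the team had one fewer win."""
    return playoff_prob(wins, *curve) - playoff_prob(wins - 1, *curve)


print(round(value_of_last_win(90), 3))  # 90-win team
print(round(value_of_last_win(75), 4))  # 75-win team
```

The reconstructed curve gives roughly 9.6 points for the 90-win team's last win and roughly 0.1 points for the 75-win team's, in line with the NL figures above.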
While that is pretty interesting, it still doesn't help us value players' contributions that much. For one, real wins can't be measured by WAR, WARP, VORP, or any other acronym that I know of. Those kinds of stats only measure how many context-neutral runs a player adds to his team, and then convert those runs to wins based on the historical value of runs to a team's win total (around 10 runs per win). In other words, they measure theoretical runs and then convert those theoretical runs into theoretical wins. When a player provides a certain amount of WAR to his team, that simply raises his team's true talent level so that it can be expected to win a certain number of games. However, there are many variables that allow a team to under- or over-perform its true talent level:
* Luck
* Run distribution
* Clutchness
* Strength of schedule
* Brad Lidge
Fortunately, Baseball Prospectus offers 3rd Order Wins, which take into account the context-neutral performance of teams via EqA (which is what is used for hitter WARP) and strength of schedule. That effectively takes out most of the variables in real wins, and it allows us to make a direct comparison between a team's true talent level and context-neutral player value. To make use of that, I had to find the probability of a team making the playoffs with a given number of 3rd Order Wins. So I used the Internet Archive's Wayback Machine to look at BP's adjusted wins for each team over the past three seasons. I then made a logistic regression using 3rd Order Wins instead of real wins, which allowed for a broader curve in both leagues:

So, while that 75 win team will rarely ever make the playoffs in either league, a team with a true talent level of 75 wins will sneak in around 5% of the time in either league. In fact, the 2006 World Champion Cardinals had only 75.8 3rd Order Wins.
So now that we have an accurate depiction of a team's playoff chances given its true talent level, we can measure a player's contribution to his team's chances of making the playoffs. Using FanGraphs' WAR to value players, I created two spreadsheets showing every single player's playoff probability added (I disregarded pitcher hitting).
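Written out, the calculation looks roughly like this. The symbols a and b (the fitted intercept and slope of the league's 3rd Order Wins curve) and the abbreviation PPA are notation introduced here for illustration; the "without the player" team is assumed to be the team's 3rd Order Wins minus the player's WAR, with no credit given to whoever would have replaced him.

```latex
P(w) = \frac{1}{1 + e^{-(a + b\,w)}}, \qquad
\mathrm{PPA} = P\left(W_{\text{team}}\right) - P\left(W_{\text{team}} - \mathrm{WAR}_{\text{player}}\right)
```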
Here are the top ten in each league:
NL:
1. David Wright (41.84%)
2. Chase Utley (41.11%)
3. Carlos Beltran (38.86%)
4. Jose Reyes (35.08%)
5. Albert Pujols (34.00%)
6. Jimmy Rollins (30.60%)
7. Jayson Werth (30.60%)
8. Derek Lowe (30.53%)
9. Johan Santana (29.30%)
10. Russell Martin (28.19%)
AL:
1. Roy Halladay (41.44%)
2. Alex Rodriguez (33.60%)
3. A.J. Burnett (33.16%)
4. Alex Rios (32.17%)
5. Mike Mussina (30.42%)
6. Andy Pettitte (25.94%)
7. Evan Longoria (25.22%)
8. John Danks (22.99%)
9. B.J. Upton (22.86%)
10. Derek Jeter (22.18%)
As you can see, a lot of the people who supposedly vaulted their clubs into the playoffs (CC Sabathia, Manny Ramirez, Ryan Howard) aren't even in the top ten in that category. CC was #15 (which is really incredible when you consider that he was only in the league for a couple of months); however, you have to scroll all the way down to #25 to find Manny and #27 to find Howard.
Pujols was obviously a defensible choice. He wasn't blessed by being on one of those sweet 89 win teams, on which each win is incredibly important. However, simply by his sheer dominance of the league, he was able to crack the top 5.
Pedroia, on the other hand, might not have been a great choice. He had a very good year (although probably not the best in the AL), but his team had a true talent level of over 102 wins, meaning that they would make the playoffs nearly 95% of the time. If you took Pedroia off of the Red Sox, they would still have had over a 78% chance of making the playoffs, so he really wasn't that important to the team last year.
Roy Halladay blew away the competition in the AL. Not only was he the best player by WAR, but he also contributed the most to his team's chances of making the playoffs. He might not have been the Cy Young due to Cliff Lee's dominance; however, he definitely deserved some MVP votes.
The most unfortunate player last year was Nick Markakis, who racked up a 6.2 WAR season. However, given that the Orioles had virtually no shot of making the playoffs, he just wasn't that valuable in comparison to some lesser players on better teams.
The least valuable player was Mark Sweeney of the Dodgers. He was "worth" -1.5 WAR, despite only having 104 plate appearances. His playoff probability added was -7.92%, meaning that if he hadn't been on the team, they would have had nearly an 8% better chance of making the playoffs.
You can access the full spreadsheets for each league here:
Comments
Great piece
It’s quite interesting to see, once again, how perception and statistical analysis differ and make you question the sanctioned perceivers.
"Swinging and missing to me is like 'Jesus, what happened?'" Scott Hatteberg
by Razr on May 15, 2009 11:23 PM EDT
The only reason why I'm still up
is because I totally dig this post. That and I’m a Cardinals’ fan too. But seriously, one of the best baseball reads I’ve had in a while. Awesome as always.
"If Bowden was a general contractor, he'd build houses with nine bedrooms, six garages, no bathrooms, and half a roof."
by DyeLongJustice on May 16, 2009 3:10 AM EDT
Great job
Recommended. I’ll have to read it a couple more times to totally get it.
Also: Roy Halladay rules
I'm not a sabermetrician, but I do play one at Driveline Mechanics.
by devil_fingers on May 16, 2009 10:54 AM EDT
Neato
It’s fun to see a serious, statistically-backed attempt to tackle the “playoff” perspective of the value question. I don’t agree that it is the best way to pick an MVP winner, for example, but this is still a great idea.
by mattybobo on May 16, 2009 12:33 PM EDT
Sabathia and Manny thoughts.
“As you can see, a lot of the people who supposedly vaulted their clubs into the playoffs (CC Sabathia, Manny Ramirez, Ryan Howard) aren’t even in the top ten in that category. CC was #15 (which is really incredible when you consider that he was only in the league for a couple of months); however, you have to scroll all the way down to #25 to find Manny and #27 to find Howard.”
Using Sabathia as an example, he was good for 4.7 WAR in his half-season with the Brewers. Now, I don’t remember whom he replaced, but I suspect that it wasn’t a guy who is likely to put up a 2.7 WAR in a half-season. Even if he was, that would have theoretically left the Brewers 1.0 game short of the Wild Card berth.
Manny Ramirez gets the same kind of argument. He was the “final piece” for the Dodgers and was worth 3.4 WAR in 1/3 of a season. The Dodgers won the NL West by only 2 games.
The marginal measurement of playoff probability added only measures the value of the player as the “final piece.” Since your method re-figures WAR into a marginal playoff probability calculation, an actual “final piece” – like Manny or CC – suffers from a skewed perspective.
Yes, Manny Ramirez and CC Sabathia contributed less to their teams’ win totals than several other players, but it can easily be argued that both in fact vaulted their teams to the playoffs.
I enjoyed the article, but the paragraph about Sabathia and Ramirez might have cost you some Idol points.
I liked your use of 3rd Order Wins. Have you thought of any ways to improve upon this? I’m not saying it can be improved, but what would your next step be?
by NoNameOnCard on May 16, 2009 5:00 PM EDT
I'm not sure I understand your criticism
Maybe I’m oversimplifying things, but the way I see it, without Sabathia, the Brewers would have had about 82 3rd order wins (maybe more depending on who he replaced, but I don’t consider that in this method), which, according to my calculations, gives them roughly a 25% chance of making the playoffs. With Sabathia, their true talent level goes up to about 86.5 wins, which gives them roughly a 52% chance of making the playoffs. Despite the fact that he was indeed the final piece and one that probably did “push” them into the playoffs, it doesn’t change the fact that he didn’t add as many total marginal playoff points as other candidates. Maybe I’m looking at it the wrong way, but it all seems good in my head.
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 16, 2009 5:26 PM EDT
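The Sabathia arithmetic in the comment above can be sketched in a few lines of Python. The NL 3rd Order Wins curve here is back-solved from the two figures quoted in that comment (about 25% at 82 wins, about 52% at 86.5 wins), so it is an approximation rather than the fitted curve, and the function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def curve_through(w1, p1, w2, p2):
    """Intercept and slope of the logistic curve passing through two (wins, prob) points."""
    slope = (logit(p2) - logit(p1)) / (w2 - w1)
    return logit(p1) - slope * w1, slope


def playoff_prob(wins, intercept, slope):
    return 1.0 / (1.0 + math.exp(-(intercept + slope * wins)))


# Assumption: curve reconstructed from the two probabilities quoted in the comment.
NL_3RD_ORDER = curve_through(82.0, 0.25, 86.5, 0.52)


def playoff_prob_added(team_wins, player_war, curve=NL_3RD_ORDER):
    """Playoff probability with the player minus playoff probability without him."""
    return playoff_prob(team_wins, *curve) - playoff_prob(team_wins - player_war, *curve)


sabathia = playoff_prob_added(86.5, 4.5)            # 0.52 - 0.25 by construction
talent_75 = playoff_prob(75.0, *NL_3RD_ORDER)       # compare with "around 5%" in the article
print(round(sabathia, 2), round(talent_75, 3))
```

As a loose consistency check, the same reconstructed curve puts a 75-win true-talent team at about 5%, which agrees with the figure given in the article.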
I think NoName is giving credence to the fact that Manny and CC were added later than the other players.
There’s a psychological effect to voting, where bringing in a player is a lot more exciting than saying “oh my gosh, Ryan Braun is STILL on the Brewers.” That was expected. CC was not.
Not saying I agree with that point of view at all (I hate it), but it might be tough to convince people who think that way to vote by following your methodology, because their line of thought isn’t in total agreement with “yours”.
Looking forward to you using this approach on 2009 data eventually.
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on May 16, 2009 6:09 PM EDT
I think I should change the way I calculate playoff odds
to account for strength of the division. Because obviously the Dodgers had a much greater chance of making the playoffs than the logistic regression gives them credit for, simply because their division was so bad. Do you have any tips on how to do that?
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 16, 2009 6:21 PM EDT
In order to keep the approach similar...
you need to keep things probabilistic. I.e., you don’t want to use a function that says winning 84 games gives 100% chance of making playoffs and 82 games gives 0% chance of making playoffs, if the second place team actually wins 83 games.
Could you do something like create a wins probability function by taking each team’s actual wins as their true talent and using the binomial distribution to show their likelihood of winning X number of games, and then combining all other teams in a division together to produce a probability distribution for the most wins in a division other than the team you’re looking at, and use that as how many games they need to win to make the playoffs and figure their chances of doing that? Uh, did that make sense? It barely does to me re-reading it.
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on May 16, 2009 6:48 PM EDT
And it sounds really hard.
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on May 16, 2009 6:49 PM EDT
That makes sense
I understand how to do that for winning the division. Would the same approach work for the wild card?
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 16, 2009 9:06 PM EDT
Also
I don’t like using real wins as their true talent level because those don’t correlate directly to WAR. How would I include 3rd order wins in that equation? Would Bayesian inference work?
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 16, 2009 9:23 PM EDT
Combining opportunities to make the playoffs and/or win Wild Card seems really tough.
A simulation might actually be the best route, taking each team’s 3rd order winning percentage as their true talent level.
An “easy” simulation would be simply to spit out a win total for every team using the binomial distribution, note that total for every team, and note if they made the playoffs or not. I think that would be feasible in Excel with some creative use of functions. Then, rinse and repeat, oh, a gazillion times (or whatever the experts suggest). That seems harder, although maybe a simple macro could handle it.
A much harder simulation would be to actually simulate a 162 game season for every team, so that a win for one team is a loss for another and the teams average 81 wins. I’d use the actual schedule, the third-order win% as true talent and something like odds ratio or log5 (not exactly sure the details of each).
Now, not only would you then have a curve that fits each team, but you could actually repeat the simulation with and without a player. For example, sim the league a gazillion times with the Dodgers at a 90.5 win talent level (or whatever third order says) and then remove Manny’s 5 WAR to make them an 85.5 true talent level and see how much less often they make the playoffs with a gazillion more simulated seasons.
I’m not a programmer, but that actually doesn’t sound all THAT difficult, once you got the schedules set up.
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on May 16, 2009 10:10 PM EDT
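The "easy" simulation described in this comment might look something like the Python sketch below. Everything about the league is hypothetical: the team names, talent levels, two four-team divisions, and single wild card are made up for illustration, ties are broken at random, and each team's season is drawn independently, so the league does not average 81 wins the way the harder schedule-based sim would. Only the 90.5 and 85.5 talent levels come from the comment.

```python
import random

GAMES = 162


def draw_wins(talent_wins, rng):
    """One simulated season: 162 independent games at the team's true-talent win rate."""
    p = talent_wins / GAMES
    return sum(rng.random() < p for _ in range(GAMES))


def playoff_teams(wins, divisions, rng):
    """Division winners plus one wild card; ties are broken at random (an assumption)."""
    key = {team: (w, rng.random()) for team, w in wins.items()}
    winners = {max(teams, key=key.get) for teams in divisions.values()}
    rest = [team for team in wins if team not in winners]
    return winners | {max(rest, key=key.get)}


def playoff_rate(team, talents, divisions, n_sims=1000, seed=0):
    """Share of simulated seasons in which `team` reaches the playoffs."""
    rng = random.Random(seed)
    made = 0
    for _ in range(n_sims):
        wins = {t: draw_wins(tw, rng) for t, tw in talents.items()}
        made += team in playoff_teams(wins, divisions, rng)
    return made / n_sims


# Hypothetical league; "A1" stands in for the 90.5-win-talent team in the comment.
talents = {"A1": 90.5, "A2": 84, "A3": 82, "A4": 78,
           "B1": 88, "B2": 85, "B3": 80, "B4": 74}
divisions = {"A": ["A1", "A2", "A3", "A4"], "B": ["B1", "B2", "B3", "B4"]}

with_player = playoff_rate("A1", talents, divisions)
without_player = playoff_rate("A1", {**talents, "A1": 90.5 - 5.0}, divisions)
print(with_player, without_player, round(with_player - without_player, 3))
```

The difference between the two rates is the simulated playoff probability added for a 5 WAR player on that team. Ranking every player would mean repeating the "without" run once per player.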
The simulation idea sounds good
I could probably even do it on Baseball Mogul :). But then I’m not sure how I could generate an ordered list of who adds the most marginal playoff value. I’ll think of a way to do it, probably just using the binomial distribution. It would take awhile, but it should work out right.
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 16, 2009 11:54 PM EDT
Oh I get what you're saying
that makes a lot of sense.
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 17, 2009 12:01 AM EDT
Sky
I’ve rethought this and it doesn’t sound that hard to do. Basically, I use 3rd order wins to estimate the team’s true talent level. Then, by using a binomial distribution, I create a curve showing the probabilities of the team winning a certain number of games. Then I figure out the probability that the other teams in the division don’t win as many games as the team.
So for example, if you say that the Rays, based on their third order record, had a 4 percent chance of winning 100 games, and the rest of the teams in the division had an 80% chance of not winning 100 games, that would mean the Rays had a 3.2% chance of winning 100 games and making the playoffs. Then I would repeat that for each number of wins that the Rays could have, add those up, and then do the same process for every team… god that sounds hard.
Does that sound mathematically robust? It isn’t as precise as a sim, but I wouldn’t really know how to do one of those.
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 28, 2009 4:49 AM EDT
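A rough Python sketch of the binomial approach described in this comment, for the division title only. It assumes each team's win total is an independent binomial draw at its true-talent rate and that the team must finish with strictly more wins than every rival (ties count as a miss). The talent levels in the example call are hypothetical.

```python
from math import comb

GAMES = 162


def win_pmf(talent_wins):
    """P(team wins exactly k games) for k = 0..162, binomial at the true-talent rate."""
    p = talent_wins / GAMES
    return [comb(GAMES, k) * p**k * (1 - p) ** (GAMES - k) for k in range(GAMES + 1)]


def prob_fewer_than(pmf):
    """below[k] = P(team wins fewer than k games)."""
    below, running = [], 0.0
    for prob in pmf:
        below.append(running)
        running += prob
    return below


def division_title_prob(team_talent, rival_talents):
    """Sum over k of P(team wins k) * P(every rival wins fewer than k)."""
    team = win_pmf(team_talent)
    rivals = [prob_fewer_than(win_pmf(t)) for t in rival_talents]
    total = 0.0
    for k, p_k in enumerate(team):
        all_rivals_below = 1.0
        for below in rivals:
            all_rivals_below *= below[k]
        # e.g. 0.04 * 0.80 = 0.032 for the 100-win term in the comment's example
        total += p_k * all_rivals_below
    return total


# Hypothetical true-talent levels for a five-team division.
print(round(division_title_prob(95, [92, 88, 80, 72]), 3))
```

Multiplying the rivals' probabilities together is where the independence assumption enters. The wild card is not handled here.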
Sounds like we're on the same page.
It’s not great because it’s possible for all the teams in the division to win 90+ games (a win for one team isn’t necessarily a loss for another team) but yeah, it’s a lot easier than a sim. Not sure how you’d handle Wild Cards, either. Maybe if the team doesn’t win the division, what’s the probability no team from another division wins more games than them?
Beyond the Boxscore // Calling BJ Upton lazy is lazy.
by Sky Kalkman on May 28, 2009 12:39 PM EDT
It's not really a criticism.
It seemed as though you were implying that the contributions of CC and Manny were almost inconsequential when in reality, they were, in fact, the pieces that probably really put their two teams over the top. Could those teams have accomplished it without them? Potentially, but according to the data, both players were absolutely vital to the playoff berths. Neither team likely would have made the playoffs without those two.
I understand that you’re working from a probability aspect, but your position seemed to suggest that those two players weren’t worth much. Each added significant probability – especially considering the short amount of time for which they played on those teams – and each proved to be the “final piece” of the puzzle.
As an extended example, the Mets have 4 guys in the Top 10. Each of them higher than 29% playoff probability added. It’s a very weird thing to have four guys “combine” for over 115% playoff probability added. The value given to each represents the marginal playoff probability added as though he was the last player on the roster.
If the Mets lost one of those four players, the marginal value for each of the other three changes (could go up or down based on where the Mets are on the curve). Despite the fact that each player still (theoretically) contributes the same WAR, his playoff probability added changes because the loss of one player moves the team backward/downward on the curve.
By adding a player like CC or Manny, the Brewers and Dodgers jumped up the curve by about 4 wins. This intrinsically increases the marginal playoff probability added of each of the other players on the team.
by NoNameOnCard on May 16, 2009 9:44 PM EDT
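The "last player on the roster" point in this comment can be illustrated with a short Python sketch. The team (89 3rd order wins, four stars worth 6 WAR each) is hypothetical, and the curve is back-solved from the Brewers figures quoted earlier in the thread (about 25% at 82 wins, about 52% at 86.5 wins), so none of these numbers are the Mets' actual values.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


# Assumption: NL curve reconstructed from the two Brewers probabilities in the thread.
SLOPE = (logit(0.52) - logit(0.25)) / (86.5 - 82.0)
INTERCEPT = logit(0.25) - SLOPE * 82.0


def playoff_prob(wins):
    return 1.0 / (1.0 + math.exp(-(INTERCEPT + SLOPE * wins)))


def playoff_prob_added(team_wins, war):
    return playoff_prob(team_wins) - playoff_prob(team_wins - war)


TEAM_WINS = 89.0                 # hypothetical team
STAR_WARS = [6.0, 6.0, 6.0, 6.0]  # four hypothetical stars

# Each star valued as if he were the only player removed.
one_at_a_time = sum(playoff_prob_added(TEAM_WINS, war) for war in STAR_WARS)
# All four removed together.
all_together = playoff_prob_added(TEAM_WINS, sum(STAR_WARS))
# One star's value after a teammate is already gone (team drops to 83 wins).
after_losing_teammate = playoff_prob_added(TEAM_WINS - 6.0, 6.0)

print(round(one_at_a_time, 3), round(all_together, 3), round(after_losing_teammate, 3))
```

On this hypothetical team the one-at-a-time values sum to more than 100%, while removing all four together costs less than that, which is the non-additivity the comment describes.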
It's dependent upon the team's location on the curve.
Because it’s a logistic curve, the removal of a single player can either increase or decrease the marginal value of the “next” player on the roster.
In the case of the Red Sox, who are high up on the curve where each marginal win contributes less to the playoff probability, without Pedroia, they drop to a place on the curve where marginal wins are worth more for playoff probability. This means that David Ortiz’s value goes up because Pedroia is gone.
In the same vein, for a team on the low end like the Blue Jays, the loss of Roy Halladay would have adversely affected the playoff probability added value of the “next” player on the Blue Jays roster. The loss of Halladay would drop the Blue Jays into an area on the curve where marginal wins are worth considerably less for playoff probability. This would decrease the value of Alex Rios.
by NoNameOnCard on May 16, 2009 9:51 PM EDT
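The shape argument in this comment follows from the slope of a logistic curve. With a and b standing for the intercept and slope of the fitted curve (notation assumed here, not taken from the article), the value of a marginal win is:

```latex
P(w) = \frac{1}{1 + e^{-(a + b\,w)}}
\quad\Longrightarrow\quad
\frac{dP}{dw} = b\,P(w)\,\bigl(1 - P(w)\bigr) \le \frac{b}{4},
\quad \text{with equality at } P(w) = \tfrac{1}{2}
```

So a marginal win is worth the most for a team near a 50% playoff chance and very little for teams near 0% or 100%. A team near 95% that loses a star slides toward the steep middle, and a team well under 50% that loses one slides toward the flat bottom.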
And if it's some kind of way to develop a value system for players...
It fails to acknowledge the playoff probability added to their old teams. CC had a contribution to the Indians, and Manny had a contribution to the Red Sox.
by NoNameOnCard on May 16, 2009 9:54 PM EDT
NoName
There are a couple of points that I want to address.
“It seemed as though you were implying that the contributions of CC and Manny were almost inconsequential when in reality, they were, in fact, the pieces that probably really put their two teams over the top.”
I never said that. I just said that their contributions weren’t as big as other players’, which is correct.
“Because it’s a logistic curve, the removal of a single player can either increase or decrease the marginal value of the ‘next’ player on the roster.”
That’s pretty interesting; however, it’s irrelevant to this post. This was basically a “formula” to figure out the MVP if you want to go with the playoff angle, by asking who contributed most to their team’s chances of making the playoffs. I’m pretty sure that I answered that question right (although if you’ll note Sky’s and my conversation above, there are still some kinks that need to be worked out, namely strength of division).
“It fails to acknowledge the playoff probability added to their old teams. CC had a contribution to the Indians, and Manny had a contribution to the Red Sox.”
Once again, that’s irrelevant.
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 17, 2009 12:15 AM EDT
It looks like I chose to focus more on the methodology than the problem.
Instead of reading the meat of the article several times, I should have read the introduction better.
by NoNameOnCard on May 17, 2009 3:05 AM EDT
The Mets
have an amazing core group of players.
"I dunno. I never smoked any Astroturf"
-Tug McGraw
by squid92 on May 16, 2009 9:28 PM EDT
I want Halladay.
Please, Ricciardi, please?
by bs.uf15bosox9bears23 on May 17, 2009 1:32 AM EDT
so do I
St. Louis Cardinals... defying win expectancy since 2008
by vivaelpujols on May 17, 2009 1:49 AM EDT
