When it comes to judging baseball talent, one approach is to start at the individual level and build up to team performance. This can lead to confusing or incorrect results at the team level, though, as the individual statistics often contain remnants of team-wide phenomena. Another approach is to start at the team level and divvy up credit or blame to individual players according to some algorithm. Neither approach is perfect, and the results will never agree completely. Today's box score examines some of the differences and similarities between these two approaches.
The difference between WAR (wins above replacement) and WSAB (win shares above bench) nicely encapsulates the two approaches sketched above.
WAR is a system based on wOBA and a defensive metric (FanGraphs uses UZR, whereas Rally's WAR uses Total Zone). Because wOBA uses batter linear weights and UZR rewards individual fielding regardless of team performance, WAR is a very good example of the first approach: it looks at individual performances, which can then be used to evaluate a team as a whole. Tom Tango, who deserves much of the credit for developing it, explains how WAR is calculated here.
WSAB, however, is an entirely different beast. WSAB is a modification of Bill James' Win Shares metric, which starts with the number of games a team actually won, and then distributes those wins (more specifically, "win shares," which are one-third of a win) to the players on the team. Win Shares distributes the wins at set ratios among hitting, pitching and defense. Because Win Shares do not account for replacement level (a concept we've discussed recently) in any way, Dave Studeman of The Hardball Times developed Win Shares Above Bench, which adds a replacement level modification. You can find an explanation of the WS to WSAB modification here.
So there are two approaches. How well do they match up? According to Dave Studeman, creator of WSAB, there is an R-squared value of .96, which is quite high. However, there is a caveat:
[T]he difference between WAR and WSAB can matter a lot for individual players. Also, the variance between the two is greater for individual seasons, as you would expect.
Tom Tango adds:
All to say that you can take the absolute best, and a half-decent measure, and you will end up with virtually the same thing. There’s only so much you can twist and turn performance stats to get anything truly insightful. That’s at a career level. At the individual seasonal level, things are different.
By "absolute best," Tom is referring to his own statistic, which goes a long way toward describing how dedicated people are to their own sabermetric babies. And of course, these things wouldn't be nearly so fun if they weren't. It reminds me of the old saying about academia:
The arguments are so fierce because the stakes are so small.
But of course I'm sure you all have your own favorites.
I don't have anything against our subtitle here at Beyond the Box Score ("A Saber-Slanted Baseball Community"), because it's descriptive and straightforward. But I think the prize for best subtitle of any SB Nation blog has got to go to McCovey Chronicles:
In play: Out(s)
If you don't get the reference, you aren't watching much baseball on MLB's Gameday. But the fact remains, the Giants not only have a miserable excuse for an offense, they also sport one of the best pitching staffs and team defenses. For proof, Bay City Ball graphed team FIP versus team UZR. It's a graph with four quadrants. In the top left quadrant are teams with both good defense-independent pitching and good fielding, like the Giants. At the opposite end of the chart, in the quadrant corresponding to poor defense and poor pitching, are the Orioles. It's an interesting way to visualize the shape of a team's run prevention.
What about teams that confuse different defensive metrics? Take the Dodgers, suggests Cy Morong:
Using data from ESPN, the Dodgers have allowed 452 runs this year with 422 of them being earned. So they have 30 unearned runs. The league average is 40. So that makes the Dodgers 10 runs better than average.
Then ESPN shows that the Dodgers DIPS% is 107, meaning that their pitchers would have an ERA that is 7% higher than it actually is if they allowed a league average of hits on balls in play (they are, of course, better than average). With their actual ERA being 3.61, then their DIPS ERA is 3.86. So here their fielders save .25 runs per game (that is, if the pitchers have nothing to do with batting average on balls in play). The Dodgers have played 115 games, so this is an additional 28.75 runs scored. Adding the 10 in from fewer unearned runs gives us 38.75 runs. Since it usually takes about 10 runs to win one game, a rough estimate is that the Dodgers have won close to 4 games this year with their fielding.
He also notes that by FanGraphs' team defense statistics, the Dodgers are a middle-of-the-road team (their collective UZR/150 is just -1.3, 13th in the majors). This illustrates the divergence between team-denominated and individually denominated statistics. Either the team-based statistics overvalue the Dodgers' fielding, or the individual statistics undervalue it. It is interesting to note that balls at Chavez Ravine do not carry nearly as far as balls in other parks (nearly 2% less than average, or maybe eight feet on a home run). The different environment may be affecting the team statistics.
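Morong's back-of-the-envelope estimate is easy to retrace. Here is a minimal sketch in Python that uses only the figures quoted above. The function and variable names are my own, and rounding the DIPS ERA to two decimals before subtracting is my reading of how he got from 3.61 to 3.86, so treat it as an illustration and not as his actual worksheet.

```python
# Retraces the fielding estimate quoted from Cy Morong above.
# All names here are hypothetical; only the input numbers come from the quote.

RUNS_PER_WIN = 10  # the rough "10 runs per win" rule of thumb from the quote


def fielding_wins(runs_allowed, earned_runs, league_avg_unearned,
                  dips_pct, actual_era, games_played):
    """Return the intermediate steps and the final estimate of fielding wins."""
    # Step 1: unearned runs relative to the league average.
    unearned_runs = runs_allowed - earned_runs
    unearned_saved = league_avg_unearned - unearned_runs

    # Step 2: DIPS ERA is the actual ERA scaled by DIPS% (assumption:
    # rounded to two decimals, as in the quoted figure of 3.86).
    dips_era = round(actual_era * dips_pct / 100, 2)

    # Step 3: the ERA gap is treated as runs saved per game by the fielders.
    runs_saved_per_game = dips_era - actual_era
    babip_runs_saved = runs_saved_per_game * games_played

    # Step 4: add the two pieces and convert runs to wins.
    total_runs_saved = unearned_saved + babip_runs_saved
    return {
        "unearned_saved": unearned_saved,
        "dips_era": dips_era,
        "babip_runs_saved": babip_runs_saved,
        "total_runs_saved": total_runs_saved,
        "wins": total_runs_saved / RUNS_PER_WIN,
    }


dodgers = fielding_wins(runs_allowed=452, earned_runs=422,
                        league_avg_unearned=40, dips_pct=107,
                        actual_era=3.61, games_played=115)
for step, value in dodgers.items():
    print(f"{step}: {value:.2f}")
```

Note that the whole estimate hangs on the assumption in step 3, that pitchers have no influence on batting average on balls in play. Loosen that assumption and the fielders' share shrinks accordingly.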
This analysis shows that with an r-squared of .46, about 46% of the variation in a team's closer's save total is explained by the number of games the team is above .500.
Down but not out, I decided to see what would happen if, instead of looking at a team's ACTUAL wins, I looked at a team's expected wins. For expected wins, I used PECOTA projected standings for the years 2006 through 2008 (2006 was the first year that Nate Silver included reliever leverage in his projections). Here's a chart of the results:
The r-squared, as you can see, was just .08, suggesting that projected wins above .500 have little predictive value for number of saves. If you are curious, the two outlier seasons are K-Rod's record-breaking 62-save season last year and the 2006 Cleveland Indians, who had just 24 saves. All I'm saying is that Joe Nathan is money.
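For anyone who wants to run this kind of check on their own numbers, here is a rough sketch of how an r-squared like the .46 and .08 figures above can be computed for a simple one-variable fit. The function name and argument names are hypothetical, and no real team data is included; you would supply your own lists of games above .500 and save totals.

```python
# A minimal r-squared calculation for two equal-length lists of numbers.
# For a simple linear fit with one predictor, r-squared is the square of
# the Pearson correlation between the two lists.

def r_squared(wins_above_500, saves):
    """Share of the variation in `saves` explained by a straight-line
    fit on `wins_above_500` (both arguments are plain lists of numbers)."""
    if len(wins_above_500) != len(saves) or len(saves) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")

    n = len(saves)
    mean_x = sum(wins_above_500) / n
    mean_y = sum(saves) / n

    # Sums of squared deviations and of cross-products around the means.
    ss_x = sum((x - mean_x) ** 2 for x in wins_above_500)
    ss_y = sum((y - mean_y) ** 2 for y in saves)
    ss_xy = sum((x - mean_x) * (y - mean_y)
                for x, y in zip(wins_above_500, saves))

    if ss_x == 0 or ss_y == 0:
        raise ValueError("r-squared is undefined when one list is constant")

    return ss_xy ** 2 / (ss_x * ss_y)
```

A value near 1 means the team's record tells you almost everything about the save total; a value like .08 means it tells you very little.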
Or the Adrian Beltre horror show. In case you didn't hear,
the Gold Glove infielder is out indefinitely, and could need surgery because of bleeding in a testicle.
By the way, if any of you are looking for 130 copies of Prince's album "Emancipation," (and really, who isn't?) I recommend you hit up this guy. He is willing to barter for sports cards. Forget sports cards, I think I might give him sports cars for a bounty so plentiful.
So the question of the day is: how do we decide when to start our analysis at the team level and when to start it at the individual level? It obviously makes a difference in some cases, especially for defense. But in other areas, like saves (nudge, nudge), team context matters less.
What do you think?