Sky's note: I'm excited to announce that Justin's agreed to join the BtB team. He thinks it's on a limited basis, but we'll convince him otherwise. You're probably already familiar with Justin's work from his blog, On the Reds, which is now Basement Dwellers (and my new favorite stathead pun).
If you were to try to rank all MLB teams, how would you do it? W/L records? Runs scored and runs allowed? WAR? I decided to put together a power ranking of MLB teams, and Sky has been kind enough to agree to publish it here at BtB.
Detailed methodology follows the rankings and commentary, but very briefly: I estimated team runs scored and team runs allowed based on team hitting, pitching, and fielding statistics. I then used the Pythagorean formula to estimate a team winning percentage, after a league adjustment. Therefore, actual team wins, losses, runs scored, and runs allowed are not used in these rankings. You can interpret the estimated winning percentage reported below as the winning percentage we'd expect teams with these hitting, pitching, and fielding statistics to produce if we threw them all into one big league and let them battle it out. They are NOT forecasts, but rather an alternative look at what teams have done thus far.
With that preamble, here are the team rankings (table is sortable by clicking on the column headers; asterisks indicate park-adjusted data; "e" stands for "expected"):
Top AL offenses (wOBA*): Yankees, Rays, Rangers, Red Sox, Indians
Top AL pitching (FIP*): Royals(!), White Sox, Tigers, Blue Jays, Mariners
Top AL fielding (bUZR and THT): Rangers, Tigers, Blue Jays, Rays, Mariners
Top NL offenses (wOBA*): Dodgers, Phillies, Nationals, Mets, Cardinals
Top NL pitching (FIP*): Braves, Dodgers, Mets, Cardinals, Rockies
Top NL fielding (bUZR and THT): Brewers, Pirates, Reds, Phillies, Diamondbacks
"On Paper" Division Leaders
AL East: Blue Jays
AL Central: Tigers
AL West: Rangers
AL Wild Card: Rays
NL East: Mets
NL Central: Cardinals
NL West: Dodgers
NL Wild Card: Braves
Comments and methods below the fold:
Five teams of note:
Los Angeles Dodgers
The Dodgers haven't missed a beat since the loss of Manny, though Juan Pierre's .435 wOBA has helped them absorb the loss of offensive production. They may be a tad "lucky" in terms of runs allowed (186 actual vs. 197 expected), but this has been a team with excellent offense, excellent pitching, and even roughly average fielding. None of the other teams in the NL West have a .500+ record...
Tampa Bay Rays
The Rays come up #3, despite a sub-.500 record. But look at their component statistics! Best park-adjusted expected runs scored in baseball. Seventh-best park-adjusted expected runs allowed, largely due to their seventh-ranked fielding. They have played outstanding baseball thus far, and the Blue Jays--who have also been superb--will have to worry about this team as the season wears on.
Cleveland Indians
While not contenders, the Indians probably aren't as bad as they look. Their defense (pitching + fielding) has been dreadful, but their offense has produced the fifth most expected runs in baseball. These data put their expected runs scored almost exactly the same as their expected runs allowed, which would make them a .500 team (and a .500+ team after accounting for the fact that they play in the AL).
Washington Nationals
Yet another underperforming team--though you'd almost have to underperform to get a 13-32 record. The Nationals' offense has been very good thus far thanks to Ryan "Mr. Streak" Zimmerman and the newly acquired Adam Dunn. But they've allowed the most runs in baseball, after park adjustments. In a lot of ways, they remind me of some of Dunn's previous Reds teams--great offense, terrible defense.
San Francisco Giants
We can't focus exclusively on the underperformers, so here's a team that looks like an overperformer. The Giants come in dead last in our ranking, despite an actual record just shy of .500. This team has the worst offense in baseball, which scores so few runs that even their very good pitching staff can't keep them from being outscored. What is happening here, though, is that their expected runs scored is well below their actual total (an 18-run difference), and their expected runs allowed is 21 runs higher than their actual runs allowed. This team may be lucky to stay out of the dungeon by season's end.
Here is a run-down of the methods behind the table above.
Estimated team runs scored (eRS) is calculated using wRC, which is a linear-weights estimate of absolute runs scored that I pull from FanGraphs (I heart FanGraphs). I adjust this number for park effects using Patriot's 5-year regressed park factors.
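As a rough sketch of that park adjustment: the snippet below assumes the park factor is a multiplier already scaled for a full season of home and road games, so the raw wRC total is simply divided by it. That scaling is an assumption on my part, and the function name and sample numbers are hypothetical rather than taken from the table.

```python
def park_adjusted_runs(raw_runs: float, park_factor: float) -> float:
    """Divide a raw run estimate by a park factor (1.00 = neutral park).

    Assumption: the factor is already scaled for a full home/road schedule.
    """
    if park_factor <= 0:
        raise ValueError("park factor must be positive")
    return raw_runs / park_factor


# Hypothetical team: 230 wRC in a park that inflates scoring by 3%.
ers = park_adjusted_runs(230.0, 1.03)
print(round(ers, 1))
```

A factor above 1.00 pulls the estimate down and a factor below 1.00 pushes it up, which is the direction you want when comparing teams on a neutral field.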
Team defense consists of pitching and fielding, each of which is estimated independently. Pitching is estimated as FIPRuns, which is a simple modification of FIP to yield an estimated runs allowed total (details here; tRA* would be better, but Excel can't pull it from statcorner). I park adjust home run rates using Patriot's HR park factors, and do not include intentional walks in the walk totals. Fielding is the average of two runs-saved estimates: bUZR (from FanGraphs) and the batted-ball team fielding statistic from Hardball Times (converted into runs, assuming 0.8 runs per play saved). So, estimated runs allowed (eRA) = FIPRuns - FieldingRuns.
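The fielding and runs-allowed arithmetic above can be sketched in a few lines. This takes FIPRuns as a given input (the conversion from FIP is not reproduced here), and the function names and sample inputs are hypothetical.

```python
RUNS_PER_PLAY = 0.8  # runs saved per play, as stated in the text


def fielding_runs(buzr_runs: float, tht_plays_saved: float) -> float:
    """Average bUZR runs with the THT plays-saved figure converted to runs."""
    tht_runs = tht_plays_saved * RUNS_PER_PLAY
    return (buzr_runs + tht_runs) / 2


def estimated_runs_allowed(fip_runs: float, buzr_runs: float,
                           tht_plays_saved: float) -> float:
    """eRA = FIPRuns - FieldingRuns (good fielding lowers runs allowed)."""
    return fip_runs - fielding_runs(buzr_runs, tht_plays_saved)


# Hypothetical team: 200 FIPRuns, +10 bUZR runs, +5 THT plays saved.
print(estimated_runs_allowed(200.0, 10.0, 5.0))
```

Because fielding runs are runs saved, a positive fielding number reduces eRA and a negative one increases it.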
Update (5/29/09): Thanks to the discussion below, I decided to switch from using FIP to Graham MacAree's tRA to evaluate pitching. I think this is an improvement, in that it allows us to better recognize everything that a pitcher can potentially control while still separating pitcher performance from fielding performance. Future rankings will use tRA, but the above numbers have not been updated. In all honesty, it ends up not making a huge difference in most cases.
One final adjustment is needed, though--the American League features a higher level of competition than the National League, thus we need to give a bonus to AL teams and a penalty to NL teams if we are to rank them against each other. How? If you look over the past 5 years (2004-2008), AL teams have dominated interleague play (702-557, 0.558 winning percentage). I'll spare you the math (I used the Odds Ratio and PythagoPat), but it turns out that if you give a 22.5 run per season bonus to an average AL team offense *and* defense, and a 22.5 run per season penalty to an average NL team offense *and* defense, you can very accurately predict the AL's winning percentage in interleague play. So, I am applying these adjustments (pro-rated by games played--the adjustment is 6-7 runs at this point) to teams prior to calculating expected winning percentage.
(Note: as a check, we usually assume that replacement players hit five runs better, relative to league average, in the NL than in the AL per 700 PA. The average team in 2008 had 6254 PA, so 6254/700*5 runs = about 45 runs. I'm essentially splitting this difference evenly between the AL and NL offenses. We assume a similar adjustment for pitchers, and thus defense as well as offense receives this adjustment).
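For anyone who wants to retrace the league adjustment, here is a minimal sketch of the three pieces of arithmetic: pro-rating the 22.5 runs by games played, checking it against the interleague record, and the replacement-level cross-check. Two things are my assumptions rather than part of the method as described: the PythagoPat exponent is taken as (runs per game)^0.287, and an average team is taken to score and allow 750 runs over 162 games. The function names are hypothetical.

```python
SEASON_GAMES = 162
FULL_SEASON_ADJ = 22.5  # runs per season, from the text


def prorated_league_adjustment(games_played: int) -> float:
    """Scale the full-season adjustment to the games played so far."""
    return FULL_SEASON_ADJ * games_played / SEASON_GAMES


def pythagopat_win_pct(rs: float, ra: float, games: int, z: float = 0.287) -> float:
    """PythagoPat winning percentage; z = 0.287 is an assumed constant."""
    k = ((rs + ra) / games) ** z
    return rs ** k / (rs ** k + ra ** k)


def odds_ratio(a: float, b: float) -> float:
    """Chance that a team with win pct a beats a team with win pct b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


AVG_RUNS = 750.0  # assumed runs scored and allowed by an average team

# Average AL team gets the bonus on both sides; average NL team the penalty.
al = pythagopat_win_pct(AVG_RUNS + FULL_SEASON_ADJ, AVG_RUNS - FULL_SEASON_ADJ, SEASON_GAMES)
nl = pythagopat_win_pct(AVG_RUNS - FULL_SEASON_ADJ, AVG_RUNS + FULL_SEASON_ADJ, SEASON_GAMES)

predicted = odds_ratio(al, nl)        # modeled AL win pct in interleague play
observed = 702 / (702 + 557)          # 2004-2008 interleague record
replacement_gap = 6254 / 700 * 5      # cross-check from the note above

print(prorated_league_adjustment(45), round(predicted, 3), round(observed, 3))
```

With these assumed inputs the predicted figure lands close to the observed interleague record; the exact value shifts a little with the run environment and exponent you choose.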
So, expected winning percentages for each team are calculated according to PythagoPat:
eW% = (eRS+lgadj)^K / [(eRS+lgadj)^K + (eRA-lgadj)^K]
eRS = estimated runs scored
eRA = estimated runs allowed
lgadj = pro-rated league adjustment (positive for AL, negative for NL)
K = PythagoPat exponent
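Putting the pieces together, a sketch of the final calculation might look like the following. It treats the league adjustment as a bonus that raises runs scored and lowers runs allowed for AL teams (and the reverse for NL teams), following the description of the adjustment above. The exponent form K = (runs per game)^0.287 is my assumption, since the value of K is not spelled out here, and the function names and sample inputs are hypothetical.

```python
def pythagopat_exponent(rs: float, ra: float, games: int, z: float = 0.287) -> float:
    """K = (total runs per game) ** z; z = 0.287 is an assumed constant."""
    return ((rs + ra) / games) ** z


def expected_win_pct(ers: float, era: float, games: int, league: str,
                     full_season_adj: float = 22.5,
                     season_games: int = 162) -> float:
    """Expected winning percentage from eRS and eRA with a league adjustment."""
    sign = {"AL": 1.0, "NL": -1.0}[league]  # positive for AL, negative for NL
    lgadj = sign * full_season_adj * games / season_games
    rs = ers + lgadj   # AL bonus adds runs scored
    ra = era - lgadj   # AL bonus removes runs allowed
    k = pythagopat_exponent(rs, ra, games)
    return rs ** k / (rs ** k + ra ** k)


# Hypothetical team with eRS equal to eRA after 45 games.
print(round(expected_win_pct(200.0, 200.0, 45, "AL"), 3))
print(round(expected_win_pct(200.0, 200.0, 45, "NL"), 3))
```

A team whose expected runs scored and allowed are equal comes out a bit above .500 in the AL and the same distance below .500 in the NL, which is the effect described for the Indians above.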