Beyond the Box Score: An SB Nation Community


MLB Strength of Schedule Estimates through 27 July

Sorry Lou--your team has had the weakest schedule in baseball. And you're still 46-56.

One of the things I've long planned to include in the power rankings is a strength of schedule adjustment.  There's a big difference between playing in the AL East and...well...any other division.  Baltimore may not be a good team, but they probably look worse than they are because they play so many games against elite teams like the Yankees, Rays, and Red Sox.

Well, I finally have it going, and thought I'd give a preview of it here before posting the power rankings tomorrow.

First, the methods.  You can skip this and click "more" below if you just want to see the results!

The approach is pretty straightforward.  First, I calculate the weighted average component winning percentage of each team's opponents, weighted by games played against each one.  This is basically the strength of schedule adjustment.  Face more tough teams, and you'll have a higher opponent component winning percentage.  We can then use the log5 method (solving for W%(A)) to apply this adjustment to a team's raw component winning percentage and calculate an adjusted component winning percentage.  This adjusted component winning percentage should be a better estimate of a team's true performance, because it accounts for the fact that some teams have faced tougher competition than others.
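
For readers who want to follow along, here is a minimal Python sketch of that first pass. The function names, game counts, and winning percentages are all invented for illustration (this is not the spreadsheet behind the numbers in this post), and the second function is just the log5 equation rearranged to solve for W%(A).

```python
def opponent_strength(games_vs, win_pct):
    """Games-weighted average winning percentage of a team's opponents."""
    total_games = sum(games_vs.values())
    return sum(n * win_pct[opp] for opp, n in games_vs.items()) / total_games


def log5_solve_for_a(observed, sos):
    """Invert log5: given the W% a team actually posted (observed) against
    opponents of average strength sos, return its implied W% versus a
    .500 schedule."""
    return sos * observed / (1 - sos - observed + 2 * sos * observed)


# Hypothetical numbers, not real 2010 data.
win_pct = {"A": 0.600, "B": 0.550, "C": 0.450}   # opponents' component W%
games_vs = {"A": 12, "B": 6, "C": 6}             # games played against each

sos = opponent_strength(games_vs, win_pct)
adjusted = log5_solve_for_a(0.480, sos)          # team with a raw W% of .480
print(round(sos, 3), round(adjusted, 3))
```

In this toy case the team faced a tough slate, so its adjusted winning percentage comes out higher than its raw one.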

There's one additional wrinkle.  As @cwyers pointed out to me on twitter, it is then possible--and desirable--to use this adjusted component winning percentage to re-calculate strength of schedule adjustments.  That way, your strength of schedule measures are based on a better measure of team performance than raw component winning percentages.  And, of course, once you get this new strength of schedule adjustment, you would want to generate new adjusted component winning percentages for teams...and you can repeat this cycle indefinitely.  I'm finding that after three iterations, you don't get much change, so that's what I'm doing.
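
A rough Python sketch of that loop follows. Everything in it is an assumption for illustration: the three-team league, the game counts, and the raw winning percentages are invented, and for brevity it leaves out the 10% regression toward 0.500 described in the next paragraph. Each pass recomputes strength of schedule from the latest adjusted percentages, then re-adjusts the raw percentages with log5 solved for W%(A).

```python
def iterate_sos(raw, schedule, n_iter=3):
    """Alternate between SoS and adjusted W% for n_iter passes.

    raw:      {team: raw (component) winning percentage}
    schedule: {team: {opponent: games played}}
    """
    adjusted = dict(raw)          # first pass judges opponents by raw W%
    sos = {}
    for _ in range(n_iter):
        # Strength of schedule from the current best estimate of each opponent.
        sos = {
            team: sum(n * adjusted[opp] for opp, n in opps.items()) / sum(opps.values())
            for team, opps in schedule.items()
        }
        # Re-adjust the RAW percentage with the new SoS (log5 solved for W%(A)).
        adjusted = {
            team: sos[team] * raw[team]
            / (1 - sos[team] - raw[team] + 2 * sos[team] * raw[team])
            for team in raw
        }
    return sos, adjusted


# Toy league, invented for illustration only.
raw = {"X": 0.600, "Y": 0.500, "Z": 0.400}
schedule = {
    "X": {"Y": 10, "Z": 10},
    "Y": {"X": 10, "Z": 10},
    "Z": {"X": 10, "Y": 10},
}
sos, adjusted = iterate_sos(raw, schedule)
for team in raw:
    print(team, round(sos[team], 3), round(adjusted[team], 3))
```

Printing the values after each pass, rather than only at the end, is an easy way to see how quickly the estimates settle down for a given data set.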

...Ok, one last thing.  It is the case that a given team has a say in the performance of its opponents, though this effect on any one team should be small in most cases.  Nevertheless, because I'm pulling data from baseball-reference team schedule tables, I don't have the ability to account for this game by game.  So I opted to "regress" the strength of schedule estimates 10% back toward 0.500, reasoning that few teams have accounted for more than 10% of another's games played, and thus shouldn't drive more than 10% of the strength of schedule adjustment.  It's an imperfect solution to this problem, but it's the best I can do.
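
In code, that pull toward 0.500 is a one-liner. This is a minimal Python sketch: the function name and the `weight` parameter are illustrative assumptions, with 0.10 matching the 10% described above, and the input value is made up.

```python
def regress_to_half(sos, weight=0.10):
    """Pull a strength-of-schedule estimate `weight` of the way back to 0.500."""
    return sos + weight * (0.500 - sos)


# Hypothetical unregressed SoS of .540 (not a value from this post).
print(round(regress_to_half(0.540), 3))  # 0.536
```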

Make sense?  That's the methodology.  And now, at long last, here are strength of schedule (SoS) adjustments through 27 July--these are essentially measures of opponent winning percentage, as measured by the methods used in the power rankings:

Team SoS
Orioles 0.529
Diamondbacks 0.523
Mets 0.517
Indians 0.515
Marlins 0.514
Phillies 0.513
Royals 0.511
Mariners 0.506
Rockies 0.506
Blue Jays 0.504
Red Sox 0.504
Astros 0.503
Nationals 0.503
Angels 0.502
Rays 0.501
Braves 0.500
Dodgers 0.498
Pirates 0.498
Giants 0.495
Padres 0.495
White Sox 0.493
Twins 0.493
Tigers 0.490
Yankees 0.490
Cardinals 0.489
Brewers 0.487
Athletics 0.487
Reds 0.481
Rangers 0.478
Cubs 0.477

So the Orioles take the cake as having the toughest schedule in baseball (big surprise!).  Other teams with tough schedules, at least thus far, include the Diamondbacks, Mets, Indians, and Marlins--all teams that have arguably underperformed at times this season.

On the other side of the coin are teams with particularly weak schedules.  These include the Cubs (no excuses!), Rangers, Reds, A's, Brewers, Cardinals, and Yankees.  As you can see, while the pattern is not absolute, a number of the "surprise" teams (Rangers and Reds first and foremost) have had fairly easy schedules thus far.  You can also see that the NL Central seems to be a good place to play--four of the six easiest schedules belong to teams from that division...because there are a lot of bad teams in that division, and no really outstanding ones!  I haven't looked closely, but I doubt the Reds' and Cardinals' schedules will get much tougher moving forward.  The Yankees were a surprise here, but while they do play the Red Sox and Rays a lot, they have otherwise had a fairly light schedule...including 12 games vs. Baltimore, their most common foe thus far.

Finally, if you look closely, there's an interesting pattern here: many of the best teams in the standings have tended to have weaker schedules.  An obvious reason for this is that they don't have to face themselves!  The correlation isn't huge (r = -0.32), but it's there.  This is one reason the iterations are an important addition--without the iterations, the correlation was closer to -0.6.  But, of course, another possibility remains--that part of their success is just the good fortune to have an easy schedule.  We'll see what happens over the rest of the season.

Anyway, hope you like this!  I'll show how these values are incorporated into the power rankings tomorrow.

Comments

Good post

Somehow the fact that the Yankees have one of the easier schedules makes me hate them even more.

Have you thought about using preseason projected winning percentages, or updated projections to do this analysis?

by vivaelpujols on Jul 29, 2010 2:31 AM EDT

Ditto on the Yankees

It also might mean that they stand to lose a little ground over the remainder of the season, as I have to expect that their proportion of games against the other non-BAL AL East teams will increase as the season rolls on. MLB always saves a bunch of Yankee/Red Sox games for September.

Re: projections—I’ve thought about incorporating preseason projections into the power rankings to get a true talent estimator number to go with TPI. Hadn’t thought about using those data for the strength of schedule stuff, but there’s no reason I couldn’t….if I ever get around to using preseason projections. ;)
-j

by JinAZ on Jul 29, 2010 8:25 AM EDT

How much it matters...

Using the log5 method (which I’ve heard about, but hadn’t checked out until now), I wanted to know how much these differences in SoS mattered. If you put a .500 team against a .470 schedule over a full season, they’ll win 9.7 more games than playing against a .530 schedule (which is just slightly wider than the range in Justin’s table). At this point in the season, it’s about a six game difference.

W%(A v. B) = W%(A)*(1 – W%(B)) / (W%(A)*(1 – W%(B)) + (1 – W%(A))*W%(B))

http://www.tangotiger.net/wiki/index.php?title=Log5

by Sky Kalkman on Jul 29, 2010 10:48 AM EDT
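
As a quick check on the numbers in the comment above, here is a small Python sketch of the forward log5 calculation. The function name is illustrative, and the 162-game season and the roughly 100 games played by late July are assumptions used only to reproduce those figures.

```python
def log5(a, b):
    """Expected W% of a team with true W% a against opposition of W% b."""
    return a * (1 - b) / (a * (1 - b) + (1 - a) * b)


# A .500 team facing a .470 schedule versus a .530 schedule.
gap = log5(0.500, 0.470) - log5(0.500, 0.530)
print(round(gap * 162, 1))   # full season: 9.7
print(round(gap * 100, 1))   # roughly 100 games in: 6.0
```

For a true .500 team, log5 reduces to 1 minus the opponents' winning percentage, so the gap is simply .530 - .470 = .060 per game.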

Wow, that's pretty huge.

I’d guess that SoS will improve as time goes on. Baltimore can’t have many more games vs. the Yankees, for example, so their schedule will get easier. The Cubs…well…the Cardinals are the only genuinely formidable opponent in the NL.

Still, +/- 3 games is a pretty big deal.
-j

by JinAZ on Jul 29, 2010 10:53 AM EDT

Also, in case someone wants to do log5 backwards like I did above

…and doesn’t want to test their algebra skills (mine were pretty rusty), here’s the log5 equation solved for W%(A):

A = Bx / (1-B-x+2Bx)

Where:
A = W%(A)
B = W%(B) <-- Strength of Schedule in the calculations I did above
x = W%(A v. B)
-j

by JinAZ on Jul 29, 2010 11:06 AM EDT

Greed mode:

Any chance of computing strength of schedule in a team’s remaining games?

Any chance of incorporating home/away differentials?

by Sky Kalkman on Jul 29, 2010 10:53 AM EDT

Probably not anytime soon

I can see pulling #home vs. #away games, and using that to modify the SoS due to home team advantage. I’m not feeling real motivated to do that right now, though, as I think it would matter very little at this point in the season. Maybe next year.

SoS over remaining games is possible to do thanks to the remaining games table at B-Ref. But it’s hard to automate. The tables on that page are pure text, and it’s hard to get them to format cleanly with text-to-columns. Plus, the Vlookuping and Hlookuping it would require to make it work makes my head hurt…
-j

by JinAZ on Jul 29, 2010 11:03 AM EDT

Yep

They are pretty hard to work with in Excel. I learned that when I tried to convert them so I could run sims on the rest of the season. That’s the primary reason I haven’t done more iterations.

by stevesommer05 on Jul 29, 2010 11:31 AM EDT

Very nice, thanks!

I would also add another reason why the good teams have weaker schedules and vice versa, something most people miss when doing this type of analysis: the good teams’ records against the teams in their division.

I noticed this with the 49ers: the analysis would talk about how poor the records of the teams they had played were, but when you are 12-1 or whatever, that means the teams you played are 1-12 against you, and they take that immediate hit.

Not sure what the best way to adjust this is. For my football example, I would take out the 49ers record against those teams. For baseball, since this is different in that you are assessing remaining games on the schedule, that might not work, but is one option.

by obsessivegiantscompulsive on Jul 29, 2010 1:42 PM EDT

Yeah, this is a problem

This is why I pulled back the SoS estimates by 10% toward 0.500 (it’s mentioned up there in that long post). Unless I were to do it with gameday or something and pull out statistics from games involving a team when calculating that team’s opponent w% (which would also require that I come up with my own fielding metric!), I can’t really fix this in a meaningful way. But at least we know that 10% is roughly the upper bound (it’s actually ~13%) of how many games one team can play against any other one team, so this will hopefully help correct for this. A little bit.

Fortunately, after doing a few iterations, the correlation is fairly weak at 0.3. It had been 0.6 or so, which struck me as scary-high. But as it is, I think we’re ok.
-j

by JinAZ on Jul 29, 2010 3:29 PM EDT
