A Quick Word on Home Field Advantage and the Strike Zone

A new book called Scorecasting claims that the primary source of home field advantage is a bias in the behavior of officials. To quote this interview at Wired (h/t Rob Neyer):

So we looked at how games are called and that’s where the data went berserk.... called balls and strikes in baseball — that’s where we saw games are called totally differently based on where they’re played. And the more attended the games are, the more striking the bias.

How much sense does this make? It turns out it's rather plausible, if not completely accurate.

Now, there's a reason my Benefit of the Doubt series has yet to discuss home field advantage: I haven't found the home/away difference particularly striking, whether in terms of Zone Advantage, differences in the expected strike zone, or the coefficients in my ever-growing ordinal logistic model.

To be clear, yes, home field advantage is statistically significant when it comes to called pitch bias, but I haven't yet found it to be as practically significant as other factors. Relative to other non-control variables in my model, I've found it to have a medium-to-low impact on umpire bias.

According to my findings, the effect of home field is far smaller than that of other factors, including the distance between the boundaries of the legal zone and the location of the pitch, the differential between in-zone and out-zone pitches over the course of an at-bat, pitcher/batter handedness, and the run expectancy according to the ball-strike count. (Be careful drawing any conclusions directly from the link to my model above, as those are unit coefficients, not standardized odds ratios.) I have not yet controlled for home team attendance.
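To show why unit coefficients can mislead, here is a minimal sketch of the conversion from a logistic coefficient to an odds ratio, with and without standardizing by the predictor's spread. Every number below is made up for illustration and is not taken from my model; the function and variable names are likewise hypothetical.

```python
import math

def odds_ratio(coef, scale=1.0):
    """Odds ratio implied by a logistic coefficient for a change of
    `scale` units in the predictor (scale=1.0 gives the unit odds ratio)."""
    return math.exp(coef * scale)

# HYPOTHETICAL values, for illustration only (not from the actual model).
home_coef = 0.10   # binary indicator: one unit is simply home vs. away
dist_coef = 0.90   # per foot of distance from the edge of the zone
dist_sd = 0.50     # assumed standard deviation of that distance, in feet

# A binary predictor has a natural unit, so its unit odds ratio is meaningful.
print(f"home field, unit OR:       {odds_ratio(home_coef):.3f}")

# A continuous predictor's unit odds ratio depends on the measurement unit...
print(f"distance, unit OR:         {odds_ratio(dist_coef):.3f}")

# ...so scale by one standard deviation before comparing across predictors.
print(f"distance, standardized OR: {odds_ratio(dist_coef, dist_sd):.3f}")
```

The point is only that a coefficient's raw size depends on the units its predictor is measured in, so comparing raw coefficients across predictors says little about which factor matters more.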

Mr. Neyer adds this in his coverage of the Wired interview:


I'm looking forward to seeing the book (soon, I think) and the data. But this does reinforce something that's been floating around in my head for a long time (because I read it somewhere): Most (if not all) of the home-field advantage is the result of the officials subconsciously wanting to avoid getting yelled at by the fans. And it wouldn't take much to cause a measurable effect. There are how many pitches in the average baseball game? Something like 300? And how many of those could reasonably be called a ball or a strike?

Well, first of all, of the average 300 pitches per game, only ~150 are called pitches, meaning the batter doesn't swing and the umpire has to rule ball or strike. Of these, I find that ~85% are called correctly according to PITCHf/x. That leaves ~23 pitches in a typical 9-inning game that might have an effect on home field advantage. While I don't have the numbers available at just this moment, I do recall that the split between home- and away-team bias in these numbers is notable but small.
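For anyone who wants to check the arithmetic, here it is as a quick sketch. The inputs are just the rough figures quoted in this post (about 300 pitches, about 150 of them called, about 85% called correctly); the names are mine and purely illustrative.

```python
# Back-of-the-envelope count of miscalled pitches per game.
PITCHES_PER_GAME = 300   # rough average cited in the post
CALLED_PITCHES = 150     # rough number of pitches the umpire has to call
CORRECT_RATE = 0.85      # rough share of calls that agree with the tracking data

def expected_miscalls(called, correct_rate):
    """Number of called pitches expected to be ruled incorrectly."""
    return called * (1.0 - correct_rate)

miscalls = expected_miscalls(CALLED_PITCHES, CORRECT_RATE)
share_of_all = miscalls / PITCHES_PER_GAME

print(f"{miscalls:.1f} miscalled pitches per game")
print(f"{share_of_all:.1%} of all pitches thrown")
```

That works out to 22.5 miscalls, which rounds to the ~23 above, or about one pitch in thirteen. Only some fraction of those will lean toward the home team.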

However, even if it is small, it turns out that blown calls have a significant effect on run expectancy over the course of the game, especially on pitches late in the count. In fact, a single blown call with a full count can swing the run expectancy by a full 6/10 of a run! But how much of the +/- 4% MLB field effect is this responsible for? Again, I don't have the data immediately at hand, but I should have it available soon.

Stay tuned, sports fans (and Mr. Neyer, if you're reading this).