Strike Zone a Marginal Component of Home Field Advantage

[Image: home vs. away heat maps]

Author's Note: Special thanks to the commenters (Dan, Mike and Xeifrank) in my previous post who helped me get the answers I needed.

As promised in my previous post, I ran some numbers to double-check the claim by the authors of Scorecasting that a home field bias in the strike zone is a significant contributor to MLB home field advantage. A look at the heat maps above is inconclusive: the away team pitcher's heat map is actually the "hottest" (indicating an away-team bias for pitches thrown down the chute), but banding is wider for the home team, indicating a home team advantage on pitches.

Indeed, the home team undoubtedly has an advantage when it comes to the strike zone, but this advantage just isn't very big. This is true when we look at the home/away splits on their own as well as when we control for several other factors.
Home Team    Runs/150    Zone Adv.
Pitching     -0.400      2.43%
Batting      -0.281      1.97%
Spread       -0.119      0.45%


I assigned a run value to each "blown" call made during the regular season from 2008-2010, using Tom Tango's linear weights per ball-strike count. As noted in the table above, I find the home vs. away spread to be -0.119 runs, or about +/- 0.06 runs per team, per 150 called pitches (equivalent to the average nine-inning game). This is exactly the figure Dan Turkenkopf offers in the comments of the previous post.
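
For readers who want to check the arithmetic, here is a minimal sketch of how the spread and the per-team swing follow from the two table rows. The function and variable names are my own, and reading the +/- 0.06 figure as half of the spread is an assumption about how the numbers relate.

```python
# Figures taken from the table above (runs per 150 called pitches).
HOME_PITCHING_RUNS_PER_150 = -0.400
HOME_BATTING_RUNS_PER_150 = -0.281


def spread_and_swing(pitching, batting):
    """Return the home vs. away spread and the per-team swing (half the spread)."""
    spread = pitching - batting
    return spread, abs(spread) / 2


spread, per_team = spread_and_swing(
    HOME_PITCHING_RUNS_PER_150, HOME_BATTING_RUNS_PER_150
)
print(f"spread: {spread:.3f} runs per 150 called pitches")
print(f"per-team swing: +/- {per_team:.2f} runs")
```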

Considering that the run environment over this period is about 9.1 runs per game (according to Baseball-Reference.com), and considering that the typical MLB home field effect is +/- 4%, home team bias in the strike zone accounts for only ~16% of the observed effect. This is pretty darn close to the findings of Mike Fast and Phil Birnbaum, also noted by Dan in the comments of the previous post (Phil takes on Scorecasting from a slightly different angle on his blog).
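
One way to reproduce the ~16% figure is sketched below. It assumes the observed home field effect in runs is the run environment times 4%, and that the strike zone swing is half of the 0.119 spread; both readings are my interpretation of the paragraph, and the names are illustrative.

```python
RUNS_PER_GAME = 9.1        # run environment cited in the article
HOME_FIELD_EFFECT = 0.04   # the +/- 4% home field effect cited in the article
ZONE_SWING = 0.119 / 2     # per-team strike zone swing, runs per game


def zone_share(runs_per_game, effect, zone_swing):
    """Fraction of the home field effect (in runs) explained by the zone swing."""
    observed_runs = runs_per_game * effect
    return zone_swing / observed_runs


share = zone_share(RUNS_PER_GAME, HOME_FIELD_EFFECT, ZONE_SWING)
print(f"home field effect: {RUNS_PER_GAME * HOME_FIELD_EFFECT:.3f} runs per game")
print(f"share explained by the strike zone: {share:.1%}")
```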

For those of you scoring at home, that advantage is equivalent to playing an entire season at home and barely winning one extra game.
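
A rough sketch of that season-long conversion follows. The 162-game schedule is standard, but the 10-runs-per-win conversion is a common sabermetric rule of thumb that I am assuming here; it is not stated in the post.

```python
ZONE_SWING = 0.119 / 2     # per-team strike zone swing, runs per game
GAMES_PER_SEASON = 162     # full MLB regular season, all played at home
RUNS_PER_WIN = 10.0        # ASSUMPTION: rule-of-thumb conversion, not from the post


def extra_wins(swing_per_game, games, runs_per_win):
    """Convert a per-game run advantage into wins over a number of games."""
    return swing_per_game * games / runs_per_win


wins = extra_wins(ZONE_SWING, GAMES_PER_SEASON, RUNS_PER_WIN)
print(f"extra runs over a season: {ZONE_SWING * GAMES_PER_SEASON:.1f}")
print(f"extra wins over a season: {wins:.2f}")
```

Under these assumptions the advantage comes out just under one win.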

Subjectively, this is not my idea of a big deal. It's certainly not enough for the data to go "berserk," as the author claims in this interview with Wired. But there's another objective measurement that indicates how unimportant home field advantage is regarding the strike zone.

As I've noted several times in this series, I base much of my work on a statistical model predicting which variables have an effect on called pitch bias. This model not only tells us which variables make an impact, and in what direction, but how strong that impact is relative to the impact of the other variables in the model. So how strong a predictor is home field advantage on Zone Advantage?

Among 29 non-control, statistically significant variables, home field ranks dead last.

That's right, 29th out of 29. Relative to other variables, it's about 1/4 as important as pitcher-handedness, 1/5 as important as velocity, 1/8 as important as pitch type, 1/10 as important as the run expectancy of the base-out state, 1/57 as important as the run expectancy of the ball-strike state, and nearly 1/100th as important as the ratio of pitches thrown in or out of the legal zone over the course of an at bat.

Of course, at this point, we're beyond the scope of the book; this has nothing to do with home field advantage at the level of win-loss record. It does show, however, just how unimportant home field advantage is in terms of strike zone bias.

Rob Neyer wrote:

I still want to see the numbers. And it's not like nobody's ever studied this stuff before.

Mr. Neyer's right: it's not like nobody's done this before. Dan, Mike and Phil have crunched the numbers, and my findings here replicate and corroborate them. So there you have it.

PITCHf/x data originate from Darrell Zimmerman's SQL-based PITCHf/x database; run expectancy data are by Tom Tango at Inside the Book.

 

Previous episodes in the Benefit of the Doubt series:

Comments

Love it

Great example of the practical application of the scientific method:

-Person A declares a hypothesis, runs a study, and publishes the results
-Persons B, C, and D replicate the study and determine that the findings need to be qualified
-Hypothesis refined, knowledge advanced

by Bill Petti on Jan 30, 2011 10:22 AM EST

J-Doug

Here to play devil’s advocate, because I think this is great.

It seems as though there could be some bias in the data. If the home field advantage bias is apparent and known by the players, this could affect their swing rate or the home team pitcher’s approach to pitching.

For example, if a Road Team Pitcher knows he won’t get that call on the corners, perhaps he throws it down the middle more often. If that were true, whatever offensive production statistic you want to use would increase, but the increase would not be attributed to the bias in your model. And you would see far fewer opportunities for the umpire to blow any calls.

Similarly, a Road hitter may know of the bias and swing more at pitches outside the zone, again reducing the possibility of a blown call against the Road Team.

Similarly, the opposite would happen for the Home Team players, ultimately reaching some equilibrium (or near it). In that case, the model would fail to show the effect, no?

by BMMillsy on Jan 30, 2011 12:44 PM EST

I agree

I would not be surprised to see this happening. We know that BABIPs are higher for the home team.

The data aren’t so cooperative, however. There’s very little split in the number of pitches thrown inside and outside the legal zone—the home team sees 45.4% and the away team 45.2%. I also find that the home team gets pounded inside more but doesn’t see more pitches off the outside corner.

Either way, the problem is this isn’t the finding that’s reported in Scorecasting. I have to investigate further, but it seems they claim specifically that there is a significant bias in home vs. away plate calls and that this is a significant component of the home field advantage. I could be interpreting this wrong, but that seems to be what they’re saying.

Personally, I think the bias may have a lot more to do with calls on balls in play, pickoffs and stolen bases. We know that foul calls are significantly biased in favor of the home team in college basketball. It would not surprise me if there were a significant effect going on here.

And, of course, on most balls in play, a bias in the call would have a far stronger effect than the bias on called pitches, and with a smaller sample size would be less likely to even out over the same period of time.

Blogger and Editor, Rational Pastime Blog. Twitter: @RationalPastime.

by J-Doug on Jan 30, 2011 1:19 PM EST

I like the way you are thinking here...

…that there may be a feedback at work here where players adjust their behavior because they are aware of the known bias, and that leads to worse performance than we might naturally expect. But I think J-Doug handles the issue below.

This made me think of the larger issue of causal mechanisms: we know there is a home field advantage, but what we don’t know are the mechanisms through which that advantage comes to be. Even if the authors had been right that umpires exhibited a significant bias against the away team, the question would still remain: why? They assume it is because umpires don’t want to draw the ire of fans, but the only way to prove that would be through qualitative research, not quantitative. Right now, it’s just a hypothesis like any other.

by Bill Petti on Jan 30, 2011 1:44 PM EST

Agreed

Although we could get closer. If there’s an attendance relationship, then that’s corroborating evidence. Perhaps we can get on-field decibel measurements for each pitch? I’d love to play with that data.

by J-Doug on Jan 30, 2011 2:46 PM EST

Right, except that...

a correlation with attendance could just as easily be evidence that players perform worse under hostile conditions. But if we’ve already debunked that and the only thing left is ump calls, I can see where this would help.

Motivation, in any case, is really hard to tease out—for me, all quant and no qual doesn’t really get you where you want to go.

by Bill Petti on Jan 30, 2011 2:54 PM EST

THT Annual 2011

I know it’s a book and all, but this was covered very well by John Walsh in the 2011 THT Annual. John found that about one-third of the home field bias is due to the home plate umpire.

by studes on Jan 30, 2011 2:58 PM EST

Dan Turkenkopf actually published on this well before John did

And he found about half the effect that John did, at 0.06 runs per game. Since J-Doug now found the same as Dan by an independent method, I wonder if John made a mistake in his calculations.

Winner, Beyond the Box Score 32 Predictions Contest, 2009

by Mike Fast on Jan 30, 2011 3:21 PM EST

OK

Mike, I’m not claiming a “first mover” advantage here. Just pointing out that someone else, a pretty well-respected researcher, has covered the subject too. I’m surprised that you’d wonder if John “made a mistake,” just because two (actually three) researchers came to different conclusions. Might we not want to study the subject a bit more and compare approaches?

by studes on Jan 30, 2011 3:57 PM EST

Might we want to study a bit more? Absolutely.

That’s why I said I wonder, not that I know. But it’s not as if three people studied this and came to three different conclusions. Dan and J-Doug both found 0.06 runs/game and John found 0.14 runs/game. As best I can tell, Dan and John used similar methods and similar (same?) definitions of the zone, and J-Doug used a different method.

Is it worthy of further investigation? Definitely.

by Mike Fast on Jan 30, 2011 4:11 PM EST

Now that I read John's article a bit more closely

I see that he restricted himself to 0-0 pitches and extrapolated that effect to the full game. I’m extremely skeptical of the accuracy of that approach.

by Mike Fast on Jan 30, 2011 4:24 PM EST

reduces other factors

As John says, we know that strike zones vary based on the count, which is why he restricted it to 0-0 (the largest sample). It’s fine for you to be “extremely skeptical”, but it sure does help if you say why. I would assume your skepticism implies that you feel that:

1. The sample size is too small
2. The home field variance differs depending on the count

The inverse skepticism would be to ask whether J-Doug or Dan normalized for count.

by studes on Jan 30, 2011 5:18 PM EST

Not sure what you mean by normalized in this context

But I have run two ordinal logit models. One is for all pitches and the other is for 0-0 counts only.

The home at bat variable shows almost exactly the same coefficient in both models: http://www.rationalpastime.com/2010/11/tech-notes-pitch-characteristics-and.html

by J-Doug on Jan 30, 2011 5:25 PM EST

Thanks

Thanks, J-Doug, for including that link. I have asked John to stop by if he can, but he lives on the other side of the world (almost), so I’m not sure when he’ll show up.

by studes on Jan 30, 2011 5:32 PM EST

Tuesday

John just told me that he won’t be able to stop by until Tuesday.

by studes on Jan 30, 2011 6:41 PM EST

Well, if you want my honest opinion

I think that very little of what has been published on the strike zone reflects to the reader just how little we actually know about it.

So I’m skeptical of everything that Dan, John, and J-Doug have published on the topic, and I don’t think any one of them has hit on the “truth”. I’ve studied the strike zone a lot, and I know I haven’t.

But given that proviso, if I had to choose between a study of 0-0 counts extrapolated to all counts and two studies of all counts, the studies of all counts would seem to have more validity. And yes, that implies that home field variance differs depending on the count. I don’t know how comfortable I am with that, but in the absence of evidence to defend John’s assumption, that’s what I’m left with at the moment.

I don’t think we understand enough about the strike zone by count to go normalizing anything by it yet.

by Mike Fast on Jan 30, 2011 5:27 PM EST

thanks

Yes, Mike, I really do want your honest opinion. Does anything I say imply otherwise?

I agree with your overall point, which is that we don’t really know what we don’t know here, which is why I think we should be more open and highlight differences in outcome, rather than expressing skepticism that one study doesn’t match two others. It’s too early to choose sides. I’d rather have more approaches and different conclusions.

by studes on Jan 30, 2011 5:30 PM EST

That comment wasn't directed at you

But at the broader sabermetric/baseball community. It is VERY popular to highlight anything that shows that the umpire doesn’t know what he is doing or is unfair and ought to be replaced by a machine.

And for me to say, “Well, I’ve looked at all this, and more than anything I am just befuddled and pessimistic about my/our ability to understand what is really going on here” doesn’t get much airplay. Understandably so. I’ve pointed out some questions I have in the comments to John’s and J-Doug’s articles on the topic, but that’s not nearly as convincing as coming to my own conclusions and publishing on them. I wish I could get far enough with the data that I had some conclusions I felt comfortable with.

I don’t particularly agree with your characterization of the studies so far as all being on equal footing. Given the data we have, I think it’s quite fair to characterize Dan and J-Doug’s finding as the default position of the moment. In no way, however, do I think that this is the final word on the matter. Could John be proved right and Dan and J-Doug wrong? I suppose. More likely, I think they’ll all be proved wrong. But if the current view of how the zone is thought of turns out to be true, then I think Dan and J-Doug’s work here will probably prove out as well.

by Mike Fast on Jan 30, 2011 5:39 PM EST

fair enough

I’m not as quick as you to rush to conclusions about which studies may be more valid. I’d rather wait for more analysis, having identified key differences between studies. I certainly wouldn’t wonder if someone had made a mistake. It doesn’t seem to me that the differences here are THAT stark (unlike the differences with the Scorecasting writers).

Stating the obvious, if umpire bias differs over count (which we know), and umpire bias also differs home/away (which we also know), then accounting for the differences when counts differ by home or away is not a straightforward task.

Also, you home in on this issue as the key difference between the studies. I’m not sure that’s true. The strike zone determination, and the run value calculations, may also be significantly different. I’m not smart enough to figure that out.

by studes on Jan 30, 2011 6:47 PM EST

I'm satisfied with this characterization

I know that as I continue to expand my model, I find new things I had not anticipated, and some earlier findings get rejected. Nature of the business, I suppose.

The problem I have in particular with Scorecasting is that they seemed to use the same data that all the rest of us have come up with and (mostly subjectively) jumped to a far more significant conclusion. Even if John’s numbers are more accurate than mine, it seems that the publicity tour for this book has implied that the strike zone effect is far more than even 1/3 of total HFA.

That’s what I’m taking issue with, primarily. It’s Freakonomics all over again.

by J-Doug on Jan 30, 2011 7:03 PM EST

I didn't explicitly normalize for count when I ran my analysis

but count was implicitly included in the run value of switching a ball to a strike.

by Dan Turkenkopf on Jan 31, 2011 8:40 AM EST

Catchers' hot zones

So…we’ve seen umpires’ hot zones, right- and left-handed hitters’ hot zones vs. right- and left-handed pitchers, pitch type and velocity hot zones, etc. Great data. I still want to see catchers’ hot zones: strikes vs. balls, for all locations, for all pitch types, based on catchers’ techniques, particularly catching palm up vs. palm down for low pitches, and glove inward vs. glove outward for pitches to the catcher’s left. I still think techniques that allow the umpire to see the ball as it hits the glove (palm up, palm in) would prove to get more called strikes than sloppy palm down, palm out techniques. However, I would bow to scientific evidence to the contrary.

by Strike Three! on Jan 31, 2011 8:43 PM EST

The hard part with that analysis is that no available data source tracks catcher location or glove position

Sportvision has talked about tracking the catcher’s position, but it hasn’t been added to the dataset yet (if it ever is).

by Dan Turkenkopf on Jan 31, 2011 9:13 PM EST

I can promise you 100%

That if I had the data, I’d have written about it already. What I can check in the long run is if there’s any yty correlation in zone advantage and/or linear weights for different catchers behind the plate. I don’t have that data integrated in my data set yet, however.

Honestly, the yty correlation for pitchers in terms of zone advantage and linear weights is rather small (R=.40), and I’d be surprised if it was any larger for catchers.

by J-Doug on Feb 1, 2011 1:49 AM EST

Also in the THT Annual

Sean Smith wrote about this in the THT Annual. He didn’t have the PITCHf/x data, but he used a “WOWY” approach to identify differences between catchers. The general sabermetric principle is that catchers have little effect on ERA, but Sean found that this isn’t true.

by studes on Feb 1, 2011 10:25 AM EST

Catcher Hot Zone Cont'd

Dan, thanks for the references! There’s framing, and then there’s framing. Consistent technique for a catcher is just like consistently being around the strike zone for the pitcher: both elicit umpires’ calls to the pitcher’s advantage. When combined, the effect may be more than additive (an opinion). But I’m not surprised you found an effect.

by Strike Three! on Feb 1, 2011 12:56 PM EST
