Benefit of the Doubt: Odd Patterns in Umpire Compensation
In this installment of Benefit of the Doubt we take a look at the effect of the ball-strike count on the strike zone. This is not a new area of analysis; several sabermetricians before me have done top-notch work on the subject, including Dave Allen and John Walsh. Moreover, both Mr. Allen and I have successfully tested this effect statistically, albeit using slightly different models and methodologies.
Nonetheless, I've uncovered something that I don't believe has been discussed in the past. First, let's review: we know--both intuitively and empirically--that the strike zone changes in response to the count. Take a look at the following heat maps for empty and full counts.
For the unfamiliar, the charts above indicate how likely a pitch is to be called a strike if thrown at a given location. The warmer the color, the more likely a called strike. The strike zone--theoretically the break-even boundary between called strikes and balls--is roughly outlined in yellow.
As you can see, both the full count and empty count zones (left) differ from the generic zone (right). More specifically, we know that the zone tends to get smaller in pitcher's counts and larger in batter's counts. Observe:
In the ultimate batter's (3-0) count, the zone is white-hot down the middle and significantly wider on the edges. Likewise, the zone for a pitcher's (0-2) count is noticeably smaller and generally cooler. This corresponds with the theory of the compassionate umpire, if I may steal Mr. Walsh's phrase.
So, we should see these results reflected in the Zone Advantage ratings, yes? Well, as it turns out, the answer is more like "yes, but..."
Over the last two seasons, the 0-2 and 3-0 counts were not the most extreme examples of bias. In fact, both the 0-1 and full counts have favored the batter more than the 0-2 count, and pitchers have done somewhat better with the 2-0 count than the 3-0 count.
Zone Advantage by count (rows: balls; columns: strikes)
| Balls | 0 strikes | 1 strike | 2 strikes |
| --- | --- | --- | --- |
| 0 | 0.34% | -2.51% | -2.03% |
| 1 | 2.06% | -1.06% | -1.30% |
| 2 | 3.80% | 0.49% | -1.13% |
| 3 | 3.64% | 0.85% | -2.45% |
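For readers who want to check the ranking themselves, here is a minimal Python sketch that sorts the twelve counts by Zone Advantage. The numbers are copied from the table; the sign convention (negative favors the batter, positive favors the pitcher) is my reading of the surrounding text, and the names `ZONE_ADVANTAGE` and `rank_counts` are purely illustrative.

```python
# Zone Advantage (%) keyed by (balls, strikes), copied from the table.
# Assumed sign convention: negative favors the batter, positive the pitcher.
ZONE_ADVANTAGE = {
    (0, 0): 0.34, (0, 1): -2.51, (0, 2): -2.03,
    (1, 0): 2.06, (1, 1): -1.06, (1, 2): -1.30,
    (2, 0): 3.80, (2, 1): 0.49, (2, 2): -1.13,
    (3, 0): 3.64, (3, 1): 0.85, (3, 2): -2.45,
}


def rank_counts(table):
    """Order counts from most batter-friendly to most pitcher-friendly."""
    return sorted(table, key=table.get)


ranked = rank_counts(ZONE_ADVANTAGE)
most_batter_friendly = ranked[0]     # lowest Zone Advantage
most_pitcher_friendly = ranked[-1]   # highest Zone Advantage

for balls, strikes in ranked:
    print(f"{balls}-{strikes}: {ZONE_ADVANTAGE[(balls, strikes)]:+.2f}%")
```

Sorted this way, the extremes land on 0-1 and 2-0 rather than on 0-2 and 3-0, which is the oddity this post is about.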
This does not contradict Mr. Allen's finding that the strike zone is largest at 3-0 and smallest at 0-2. When it comes to the topology of the zones, our findings are consistent: the expected zone is indeed smaller at 0-2 than at 0-1 and larger at 3-0 than at 2-0.
So what gives? Keep in mind that, as the count changes, the zone is not the only thing that changes with it. Perhaps more importantly, pitcher and batter behavior--and thus the distribution of called pitches--changes. Moreover, the distribution of pitches is at all times uneven and, at the micro level, random.
The takeaway: due to the intricacies of player behavior and random variance, the size of the expected strike zone does not always correspond to real-world bias. Both, however, are important in measuring how the umpire's call changes depending on the context of the pitch.
Previous episodes in the Benefit of the Doubt series
- Relievers Who Get the Wide Zone
- Starters Who Get the Wide Zone
- Pitchers Who Were Squeezed
- Batters in the Zone
- Lefty Bats Get the Shaft
- How Pitch Speed and Movement Affect the Zone
Data originate from Darrell Zimmerman's SQL-based PitchFX database.
Comments
Not only does the distribution of pitches change by count
But so does the likelihood of the batter to swing, and when the batter swings he changes the sample which the umpire is presented with by removing pitches from the sample. As I indicated to John Walsh back before he first published The Compassionate Umpire, the swing selection is enough to explain the whole “compassion” effect. The umpire may be doing nothing differently.
Winner, Beyond the Box Score 32 Predictions Contest, 2009
by Mike Fast on Dec 24, 2010 3:33 PM EST
When I say "may"
I don’t necessarily mean “is”, but the magnitude is the right size and it’s very well correlated with the “compassion” effect. Any study investigating this effect must take the swing selectivity of the batters into account if we’re going to figure this out properly.
Winner, Beyond the Box Score 32 Predictions Contest, 2009
One thing I should probably mention specifically
As a result of the batter swing selectivity changing from count to count, PITCHf/x location error is going to appear as umpire bias and will need to be removed if you really want to evaluate the true strike zone the umpire is calling. My rough estimate of PITCHf/x plate location error is one standard deviation of one inch horizontally and a half inch vertically. Thus, if the 3-0 strike zone is two inches wider than the 0-2 strike zone, almost half of that is not real. (I say “almost” rather than exactly half because batters don’t swing at quite 100% of borderline 0-2 pitches.)
Winner, Beyond the Box Score 32 Predictions Contest, 2009
Thanks Mike
The batter swinging is what I was trying to get at when I discussed the distribution of “called” pitches specifically, and the effect of changes in batter/pitcher behavior.
Interesting stuff.
Blogger and Editor, Rational Pastime Blog. Twitter: @RationalPastime.
Right
There are a number of things going on, several of which you mentioned.
One, the pitchers throw to a different distribution of locations based on the count.
Two, the batters swing at a different distribution of locations based on the count, even if the pitch location from the pitchers were unchanged.
Three, the umpire’s decision may be (very likely is) affected by the batter’s choice to swing or not to swing at a pitch. He’s likelier to call a pitch a ball if the batter lays off. If we could look into the mind of the umpires and learn what they would have called all the pitches that the batter chose to swing at, our strike zone boundaries might move.
Four, given two pitches that the PITCHf/x system determined were at identical locations (and we could go further and specify same velocity, break, etc.), if the batter swung at one and not at the other, the one the batter swung at was likely located closer to the strike zone in actuality. In other words, the batter swing and umpire call give us some information about likely PITCHf/x measurement errors.
Five, the population of batters in the different counts is not the same. I’m not sure if this has any effect, so these ideas are simply speculative. Free swingers are more likely to get into pitcher counts. I don’t know how that would affect anything, if at all. Similarly, I’ve wondered if the per-count batter populations are biased by height. I have no reason to believe they are, but it could be checked.
Winner, Beyond the Box Score 32 Predictions Contest, 2009
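Point four in the list above can be illustrated with a toy simulation. This is only a sketch under assumptions that are not in the thread: a one-dimensional zone with a fixed half-width of 0.83 feet, a made-up swing model, and uniformly scattered pitches. The 1.2-inch measurement error is the figure quoted elsewhere in these comments. The umpire applies the same rule in both counts; only the batter's swing behavior changes.

```python
import math
import random

ZONE_HALF_WIDTH_FT = 0.83        # assumed fixed "true" zone edge
MEASUREMENT_SD_FT = 1.2 / 12.0   # ~1.2 inches of error, as quoted in the thread


def swing_probability(true_x, count):
    """Hypothetical swing model: protect the plate on 0-2, mostly take on 3-0."""
    if count == "3-0":
        return 0.05  # almost always taking, wherever the pitch is
    # On 0-2 the chance of a swing falls off smoothly with distance from center.
    return 1.0 / (1.0 + math.exp((abs(true_x) - 1.0) / 0.1))


def edge_strike_rate(count, n_pitches=200_000, seed=0):
    """Called-strike rate on takes *measured* within 0.1 ft inside the edge."""
    rng = random.Random(seed)
    strikes = 0
    takes = 0
    for _ in range(n_pitches):
        true_x = rng.uniform(-2.0, 2.0)
        measured_x = true_x + rng.gauss(0.0, MEASUREMENT_SD_FT)
        if rng.random() < swing_probability(true_x, count):
            continue  # a swing removes the pitch from the umpire's sample
        if ZONE_HALF_WIDTH_FT - 0.1 <= abs(measured_x) <= ZONE_HALF_WIDTH_FT:
            takes += 1
            if abs(true_x) <= ZONE_HALF_WIDTH_FT:  # identical rule in every count
                strikes += 1
    return strikes / takes


rate_0_2 = edge_strike_rate("0-2")
rate_3_0 = edge_strike_rate("3-0")
print(f"0-2 edge strike rate: {rate_0_2:.3f}")
print(f"3-0 edge strike rate: {rate_3_0:.3f}")
```

In this toy setup the measured edge looks colder on 0-2 than on 3-0 even though the umpire never changes, because the takes that survive on 0-2 are disproportionately pitches that were truly farther outside than the measurement suggests. How much of the real effect this accounts for depends entirely on the true swing behavior, which the sketch does not attempt to estimate.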
Hmm, a correction:
The systematic plate location errors due to PITCHf/x calibration problems have a standard deviation of one inch horizontal and a half inch vertical. There is also the random measurement error which Sportvision estimated at a half inch. Add them all up (quadratically) and you get around 1.2 inches of measurement error.
Winner, Beyond the Box Score 32 Predictions Contest, 2009
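The arithmetic in the correction above can be reproduced in a few lines. This sketch assumes that "quadratically" means a root-sum-of-squares of independent error terms, and that all three quoted figures (one inch, a half inch, and a half inch) go into a single sum; the function name is illustrative.

```python
import math


def combine_in_quadrature(*errors_in_inches):
    """Root-sum-of-squares of independent error terms (assumed independent)."""
    return math.sqrt(sum(e * e for e in errors_in_inches))


# 1.0 in (systematic, horizontal), 0.5 in (systematic, vertical),
# 0.5 in (random measurement error), as quoted in the comment.
total_error_in = combine_in_quadrature(1.0, 0.5, 0.5)
print(f"combined error: {total_error_in:.2f} inches")
```

The result is the square root of 1.5, consistent with the "around 1.2 inches" quoted in the comment.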
A question on the systematic bias
Hey Mike,
I’ve been fiddling with the umpire strike zone modeling as well, and I’m curious as to why the Pitch F/X calibration would ‘systematically’ have incorrect measurements in different counts.
Are you saying that the strike zone size measured by the Pitch FX system systematically increases with the number of balls in the count? If so, why would this be the case?
So I guess my question is: if there are systematic errors, then systematic with respect to what? Stadium/park? Count? Just whether or not the pitch is being swung at?
I’m not sure that the systematic bias matters (other than the umpire perception based on whether or not the batter swings) when we only compare pitches that are not swung at. Unless… Is the general implication that, systematically, the ones not being swung at are, as a majority, those that were measured incorrectly because they actually were out of the strike zone? (I think I’m answering my own question here).
Thanks!
Another thought
While counterfactuals are always interesting to think about, I think the counterfactual strike zone (your bias #3) is irrelevant in the discussion of what actually happens (outside of check swings). We’re interested in what actually happens when the batter does not swing, and I think that in and of itself is the interesting thing going on here with the ‘compassionate umpire’. If we’re comparing “not swing on 0-2” vs. “not swing on 3-0”, we’re only comparing the differences. Since the bias is a ‘swing/not swing’ effect, it would simply be a shift parameter washed out by the fact that it’s present in both counts, no?
Tango has this on the book blog too
Here:
http://www.tangotiger.net/halejon/allcounts.html
My Work: Henkakyuu
Looking at the heat maps there
It seems like the target of the pitcher is very different at different counts which could have a major impact on calling balls and strikes. With 2 strikes, pitchers paint the corners, while they are a lot more down the pike on 3-0. You would expect more definite strikes on 3-0 if the target is down the middle than on 0-2 with all the balls on the periphery, wouldn’t you?
My Work: Henkakyuu
I absolutely would.
Good call, Josh.
Blogger and Editor, Rational Pastime Blog. Twitter: @RationalPastime.
