# How Much Do We Really Know About the Strike Zone?

I was originally going to title this article "Is Everything We Know About the Strike Zone a Lie?"  I figured that was a nice juicy headline to draw in traffic and generate controversy.  But then I re-looked at the numbers, got a whole lot more confused about what they mean and decided to scale back my Drudginess.

One of the major assumptions we make about the Pitch F/X dataset is that we have a pretty good idea of what the strike zone is.  It's been studied a bunch of times, starting with John Walsh, and continuing with our own Jeff Zimmerman, each time refining the answer a bit.  But we assume that the strike zone we use is consistent across the entire data set.  Yes, we realize that the top and bottom of the strike zone vary from at-bat to at-bat, so in general we use the median value for each batter.  And we know that the boundaries of the strike zone are somewhat flexible.  Walsh drew his strike zone at the point where 50% of the pitches are called strikes.  Jeff used 85% strikes as his boundary value.  And of course the zones differ according to batter handedness.  But we've assumed that the strike zone is relatively consistent from park to park.  That assumption appears to be pretty far from the truth.

My research into park strike zones came about while trying to repeat this study on catcher framing from a few years back.  My major issue at the time was the magnitude of the effect - the difference between the best and worst catchers was 25 wins.  When I tried again with more recent data, the initial effect was even bigger at 5 runs per game - or roughly 60 wins per season. That completely failed the smell test.  I'm pretty confident in my methodology, despite not yet adjusting for umpire.  And similar results were found by Bill Letson, whose approach was a lot more rigorous than mine.

So I started thinking about what else could cause such a discrepancy, and decided to look into park factors for strike calls.  Now you wouldn't expect there to be much in the way of park factors.  I believe the semi-official stance on the Pitch F/X system is that the recorded location of the pitch as it crosses the plate is correct to within half an inch.  Since the strike zone relative to home plate is the same in each park, the number of "missed" pitches should be roughly the same in all parks, right?  Of course you could argue that perhaps umpires make a difference, but they should be allocated fairly randomly across parks.  But perhaps I'm getting a little ahead of myself.  Let's look at the data for 2009.

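| Year | Team | LG | Called H (home park) | Called A (home park) | Called H (away park) | Called A (away park) | FStrikes H (home park) | FStrikes A (home park) | FBalls H (home park) | FBalls A (home park) | FStrikes H (away park) | FStrikes A (away park) | FBalls H (away park) | FBalls A (away park) | PF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2009 | ANA | AL | 6699 | 6637 | 6992 | 6513 | 603 | 691 | 225 | 246 | 693 | 525 | 255 | 228 | 113 |
| 2009 | ARI | NL | 6193 | 6063 | 6275 | 5866 | 617 | 498 | 197 | 195 | 577 | 550 | 218 | 204 | 101 |
| 2009 | ATL | NL | 5959 | 5781 | 6113 | 6085 | 684 | 508 | 132 | 186 | 558 | 574 | 243 | 153 | 121 |
| 2009 | BAL | AL | 6213 | 5967 | 6161 | 6293 | 655 | 517 | 199 | 199 | 477 | 525 | 237 | 226 | 142 |
| 2009 | BOS | AL | 6113 | 6604 | 7332 | 6756 | 538 | 554 | 261 | 239 | 695 | 534 | 279 | 321 | 105 |
| 2009 | CHA | AL | 6178 | 5806 | 6073 | 5781 | 514 | 413 | 220 | 183 | 570 | 502 | 230 | 195 | 81 |
| 2009 | CHN | NL | 6287 | 5855 | 6034 | 5952 | 478 | 473 | 216 | 203 | 550 | 486 | 216 | 200 | 86 |
| 2009 | CIN | NL | 6104 | 5719 | 5920 | 6230 | 608 | 436 | 139 | 167 | 495 | 554 | 207 | 182 | 113 |
| 2009 | CLE | AL | 6347 | 6258 | 6695 | 6090 | 465 | 521 | 275 | 223 | 663 | 437 | 259 | 268 | 89 |
| 2009 | COL | NL | 6025 | 6528 | 6689 | 5931 | 497 | 574 | 213 | 233 | 638 | 492 | 214 | 201 | 89 |
| 2009 | DET | AL | 6264 | 5871 | 6025 | 6272 | 436 | 479 | 259 | 194 | 483 | 435 | 241 | 262 | 112 |
| 2009 | FLO | NL | 6685 | 6532 | 6496 | 6044 | 581 | 544 | 261 | 218 | 647 | 508 | 181 | 219 | 83 |
| 2009 | HOU | NL | 6165 | 5530 | 5799 | 5987 | 458 | 415 | 246 | 205 | 513 | 466 | 195 | 218 | 76 |
| 2009 | KCA | AL | 6442 | 5882 | 5739 | 6233 | 593 | 506 | 210 | 170 | 467 | 460 | 207 | 220 | 135 |
| 2009 | LAN | NL | 6291 | 6428 | 7318 | 6473 | 581 | 593 | 239 | 288 | 653 | 572 | 242 | 196 | 90 |
| 2009 | MIL | NL | 6482 | 6431 | 6472 | 6177 | 616 | 628 | 244 | 253 | 622 | 472 | 232 | 227 | 115 |
| 2009 | MIN | AL | 6157 | 6423 | 6332 | 5702 | 598 | 446 | 248 | 325 | 505 | 483 | 280 | 196 | 89 |
| 2009 | NYA | AL | 7303 | 7396 | 7097 | 6487 | 599 | 698 | 241 | 271 | 641 | 510 | 244 | 219 | 105 |
| 2009 | NYN | NL | 6188 | 5995 | 6181 | 5909 | 494 | 555 | 211 | 166 | 592 | 425 | 188 | 235 | 112 |
| 2009 | OAK | AL | 6096 | 5881 | 6153 | 6278 | 541 | 472 | 253 | 241 | 471 | 461 | 239 | 237 | 116 |
| 2009 | PHI | NL | 7034 | 7024 | 7197 | 6646 | 695 | 505 | 185 | 233 | 529 | 586 | 292 | 227 | 126 |
| 2009 | PIT | NL | 5763 | 5764 | 5906 | 5600 | 416 | 488 | 257 | 172 | 512 | 369 | 183 | 292 | 117 |
| 2009 | SDN | NL | 6285 | 5913 | 5963 | 5947 | 476 | 377 | 281 | 287 | 503 | 500 | 219 | 213 | 50 |
| 2009 | SEA | AL | 6047 | 5593 | 5574 | 5800 | 371 | 412 | 250 | 194 | 418 | 390 | 204 | 238 | 92 |
| 2009 | SFN | NL | 6149 | 4972 | 4870 | 5968 | 579 | 457 | 200 | 141 | 369 | 526 | 147 | 221 | 127 |
| 2009 | SLN | NL | 5967 | 5456 | 5592 | 5598 | 618 | 403 | 176 | 220 | 419 | 540 | 215 | 157 | 102 |
| 2009 | TBA | AL | 5992 | 6227 | 6525 | 5921 | 518 | 472 | 228 | 238 | 514 | 538 | 241 | 238 | 93 |
| 2009 | TEX | AL | 6404 | 5657 | 5649 | 5900 | 538 | 366 | 257 | 300 | 392 | 556 | 217 | 198 | 62 |
| 2009 | TOR | AL | 5967 | 6045 | 6356 | 6075 | 486 | 440 | 238 | 239 | 567 | 523 | 217 | 233 | 74 |
| 2009 | WAS | NL | 6108 | 6287 | 6379 | 6011 | 456 | 512 | 190 | 216 | 606 | 483 | 209 | 221 | 86 |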

FStrikes (for Fake Strikes) are called strikes that were outside the zone, while FBalls are called balls that were inside the zone.  H and A are home and away, and PF is park factor.  A low PF is more pitcher friendly (more strikes than expected), while a high PF is better for batters (more balls than expected).
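To make the FStrike and FBall definitions concrete, here is a minimal sketch of how a single called pitch might be labeled. It assumes a plain rectangular zone exactly as wide as the 17-inch plate, which is a stand-in and not the empirical zone used for the table; the `CalledPitch` class, its field names, and `HALF_WIDTH_FT` are hypothetical.

```python
from dataclasses import dataclass

# Assumption: the zone is a rectangle exactly as wide as the 17-inch plate.
# The empirical zones discussed in the article (50% or 85% called-strike
# contours) have different edges, so treat this constant as illustrative.
HALF_WIDTH_FT = 17.0 / 12.0 / 2.0


@dataclass
class CalledPitch:
    """One taken pitch. All field names here are hypothetical."""
    px: float        # horizontal location at the plate, feet from center
    pz: float        # height at the plate, feet above the ground
    zone_top: float  # top of this batter's zone, feet
    zone_bot: float  # bottom of this batter's zone, feet
    call: str        # "strike" or "ball"


def in_zone(pitch: CalledPitch) -> bool:
    """True if the recorded location falls inside the assumed rectangle."""
    inside_width = abs(pitch.px) <= HALF_WIDTH_FT
    inside_height = pitch.zone_bot <= pitch.pz <= pitch.zone_top
    return inside_width and inside_height


def classify(pitch: CalledPitch) -> str:
    """Label a called pitch as 'fstrike', 'fball', or 'correct'."""
    inside = in_zone(pitch)
    if pitch.call == "strike" and not inside:
        return "fstrike"  # called strike recorded outside the zone
    if pitch.call == "ball" and inside:
        return "fball"    # called ball recorded inside the zone
    return "correct"


# A called strike recorded more than a foot off the center of the plate.
print(classify(CalledPitch(px=1.2, pz=2.5, zone_top=3.5, zone_bot=1.5, call="strike")))
```

Counting these labels by park for the home and away teams would give columns shaped like the ones in the table, though the actual zone boundaries behind the table are not spelled out here.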

You can see a huge spread in how likely a pitch was to be mis-called based on park in 2009.  2007 and 2008 were no better.

Here are the unweighted three year park effects for each team's stadium:

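| Team | 2007 | 2008 | 2009 | Avg |
|---|---|---|---|---|
| ANA | 107 | 116 | 113 | 112 |
| ARI | 79 | 115 | 101 | 98 |
| ATL | 129 | 112 | 121 | 121 |
| BAL |  | 119 | 142 | 131 |
| BOS | 66 | 108 | 105 | 93 |
| CHA | 59 | 83 | 81 | 74 |
| CHN | 122 | 109 | 86 | 106 |
| CIN | 67 | 106 | 113 | 95 |
| CLE | 70 | 82 | 89 | 80 |
| COL | 77 | 111 | 89 | 92 |
| DET | 195 | 130 | 112 | 146 |
| FLO | 30 | 110 | 83 | 74 |
| HOU | 175 | 103 | 76 | 118 |
| KCA | 116 | 108 | 135 | 120 |
| LAN | 69 | 90 | 90 | 83 |
| MIL | 94 | 95 | 115 | 101 |
| MIN | 65 | 76 | 89 | 77 |
| NYA |  |  | 105 | 105 |
| NYN |  |  | 112 | 112 |
| OAK | 101 | 107 | 116 | 108 |
| PHI | 97 | 99 | 126 | 107 |
| PIT | 141 | 149 | 117 | 136 |
| SDN | 101 | 92 | 50 | 81 |
| SEA | 159 | 68 | 92 | 106 |
| SFN | 105 | 76 | 127 | 103 |
| SLN | 104 | 96 | 102 | 101 |
| TBA | 199 | 122 | 93 | 138 |
| TEX | 63 | 72 | 62 | 66 |
| TOR | 120 | 116 | 74 | 103 |
| WAS |  | 89 | 86 | 88 |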
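
The Avg column appears to be a plain mean over whichever seasons a park has, rounded to the nearest whole number. A minimal sketch of that calculation follows; the function name and the half-up rounding rule are my assumptions, chosen because they reproduce the printed values (Python's built-in `round` rounds halves to even, which would turn Baltimore's 130.5 into 130).

```python
import math
from typing import Optional, Sequence


def unweighted_average(values: Sequence[Optional[int]]) -> Optional[int]:
    """Mean of the seasons that are present, rounded half up.

    Missing seasons are passed as None and simply skipped, so each
    available season counts equally regardless of pitch volume.
    """
    present = [v for v in values if v is not None]
    if not present:
        return None
    mean = sum(present) / len(present)
    return math.floor(mean + 0.5)  # half-up rounding (assumed)


# Park factors for 2007, 2008, 2009 copied from the table above.
print(unweighted_average([63, 72, 62]))       # TEX -> 66
print(unweighted_average([None, 119, 142]))   # BAL -> 131
print(unweighted_average([None, None, 105]))  # NYA -> 105
```

Because the average is unweighted, a park with a single season of data (the two New York parks) carries the same apparent certainty as one with three, which is worth keeping in mind when reading the Avg column.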

The correlation between 2008 and 2009 is a fairly robust 0.36, which suggests there's at least some actual phenomenon here.
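
For anyone who wants to check the year-to-year figure, here is a rough sketch of a Pearson correlation over the 2008 and 2009 park factors from the table above. It assumes the two New York parks are left out because they have no 2008 value, and it works from the rounded factors as printed, so it should land close to the quoted 0.36 but may not match it to the last digit. The variable and function names are mine.

```python
import math

# (2008, 2009) park factors copied from the three-year table.
# NYA and NYN are omitted because they have no 2008 value.
PARK_FACTORS = {
    "ANA": (116, 113), "ARI": (115, 101), "ATL": (112, 121), "BAL": (119, 142),
    "BOS": (108, 105), "CHA": (83, 81),   "CHN": (109, 86),  "CIN": (106, 113),
    "CLE": (82, 89),   "COL": (111, 89),  "DET": (130, 112), "FLO": (110, 83),
    "HOU": (103, 76),  "KCA": (108, 135), "LAN": (90, 90),   "MIL": (95, 115),
    "MIN": (76, 89),   "OAK": (107, 116), "PHI": (99, 126),  "PIT": (149, 117),
    "SDN": (92, 50),   "SEA": (68, 92),   "SFN": (76, 127),  "SLN": (96, 102),
    "TBA": (122, 93),  "TEX": (72, 62),   "TOR": (116, 74),  "WAS": (89, 86),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


pf_2008 = [pair[0] for pair in PARK_FACTORS.values()]
pf_2009 = [pair[1] for pair in PARK_FACTORS.values()]
r = pearson(pf_2008, pf_2009)
print(f"parks: {len(pf_2008)}, r = {r:.2f}")
```

The same function can be pointed at the 2007 and 2008 columns to see whether the relationship holds across the earlier pair of seasons as well.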

What might cause such a discrepancy from park to park?  You wouldn't imagine the strike zone would vary based on the park.  It's a fairly static thing and not subject to the placement of the outfield fences, or the size of the outfield, or even the length of the infield grass.

So if the cause is unlikely to be one of the normal factors that influence park effects, what might be some less obvious reasons for the difference?

According to Alan Nathan, the camera position differs from park to park, which might not be completely corrected for by Sportvision's software.

Maybe there's something to the hitter's background at certain parks that affects the umpire's ability to call balls and strikes.  If that were the case, you'd think we'd have heard some complaints at some point.

It's also possible that the assignment of umpires doesn't wash out when looking at the results of a single season.  Perhaps Texas sees more than its fair share of pitcher-friendly umpires.  This option seems unlikely since we see similar results from season to season, and any unintentional umpire scheduling bias shouldn't carry across multiple years.

There's a slim chance that certain pitchers were more likely to pitch at home than on the road and they could throw the numbers off.  But that would be unlikely to occur for all teams, and, again, would likely be a single year effect.  The same goes for catchers.

Unfortunately, since I don't have a better explanation, I'm tending to believe the first one - something is different about camera placement from park to park, and it's affecting how pitches are recorded.

It's just conjecture at this point and I'd love someone from Sportvision to tell me I'm wrong, but I'm a little nervous about the correctness of the Pitch F/X coordinates.  If we can't count on those from park to park, then a lot of studies need to be questioned.

We've known for a while that Pitch F/X needed to be corrected for park.  It's one of the things Josh Kalk was working on before being hired by Tampa Bay.  But my understanding was that the correction was mainly for the pitcher's side of things (release point, etc.) and that the values at the plate were correct to within a fraction of an inch.

I'm less convinced that's true now.  Admittedly, I don't have a whole lot of evidence - more of a gut feel that something is not right with these results.

It's clearly of great importance to the sabermetric community to be able to trust the Pitch F/X numbers.  And right now, my faith is a little bit shaken.