I was originally going to title this article "Is Everything We Know About the Strike Zone a Lie?" I figured that was a nice juicy headline to draw in traffic and generate controversy. But then I took another look at the numbers, got a whole lot more confused about what they mean, and decided to scale back my Drudginess.
One of the major assumptions we make about the Pitch F/X dataset is that we have a pretty good idea of what the strike zone is. It's been studied a number of times, starting with John Walsh and continuing with our own Jeff Zimmerman, each study refining the answer a bit. But we assume that the strike zone we use is consistent across the entire dataset. Yes, we realize that the top and bottom of the strike zone vary from at-bat to at-bat, so in general we use the median value for each batter. And we know that the boundaries of the strike zone are somewhat flexible: Walsh set his strike zone boundary at the point where 50% of pitches are called strikes, while Jeff used 85% as his boundary value. And of course the zones differ according to batter handedness. But we've assumed that the strike zone is relatively consistent from park to park. That assumption appears to be pretty far from the truth.
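As a concrete illustration of that boundary method, here is a minimal sketch of locating one edge of the called strike zone at a chosen called-strike threshold, in the spirit of Walsh's 50% and Jeff's 85% cutoffs. The `(px, call)` tuple format and the one-dimensional treatment are my simplifying assumptions for illustration, not the actual methodology of either study.

```python
from collections import defaultdict

def zone_edge(pitches, threshold=0.50, bin_width=0.1):
    """Estimate the outside edge of the zone (feet from plate center):
    the first horizontal bin, moving away from the center, where the
    called-strike rate drops below `threshold`."""
    strikes = defaultdict(int)
    total = defaultdict(int)
    for px, call in pitches:  # px: horizontal location; call: 'S' or 'B' (assumed fields)
        b = round(px / bin_width) * bin_width
        total[b] += 1
        strikes[b] += (call == 'S')
    # walk outward from the plate center until the strike rate falls under threshold
    for b in sorted(k for k in total if k >= 0):
        if strikes[b] / total[b] < threshold:
            return b
    return None  # every bin met the threshold; no edge found in the data
```

With the 50% default this gives a Walsh-style edge; raising the threshold to 0.85 should pull the estimated edge inward, which is why an 85% zone is tighter than a 50% one.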
My research into park strike zones came about while trying to repeat this study on catcher framing from a few years back. My major issue at the time was the magnitude of the effect: the difference between the best and worst catchers was 25 wins. When I tried again with more recent data, the initial effect was even bigger, at 5 runs per game, or roughly 60 wins per season. That completely failed the smell test. I'm pretty confident in my methodology, despite not yet adjusting for umpire. And similar results were found by Bill Letson, whose approach was a lot more rigorous than mine.
So I started thinking about what else could cause such a discrepancy, and decided to look into park factors for strike calls. Now, you wouldn't expect there to be much in the way of park factors. I believe the semi-official stance on how the Pitch F/X system works is that the pitch location as it crosses the plate is accurate to within half an inch. Since the strike zone relative to home plate is the same in every park, the number of "missed" pitches should be roughly the same in all parks, right? Of course, you could argue that perhaps umpires make a difference, but they should be allocated fairly randomly across parks. But perhaps I'm getting a little ahead of myself. Let's look at the data for 2009.
[Table: 2009 called pitches, Fstrikes, and Fballs for the home (H) and away (A) teams, split between each team's home park and away parks, by Year, Team, and League (LG), with the resulting park factor (PF)]
FStrikes are called strikes that were outside the zone (for "Fake Strikes"), while FBalls are called balls that were inside the zone. H and A are home and away, and PF is park factor. A low number is more pitcher-friendly (more strikes than expected), while a high number is better for batters (more balls than expected).
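The article doesn't spell out the exact PF formula, so here is one plausible construction as a sketch: count fake strikes and fake balls over called pitches, then compare the home park's miscall tendency to the away parks'. The fixed rectangular zone, the field layout, and the PF ratio itself are my assumptions; the real study uses per-batter zone tops and bottoms.

```python
# Simplified fixed zone, an assumption for illustration only.
ZONE_HALF_WIDTH = 0.83          # feet: half the 17" plate plus roughly a ball radius
ZONE_BOTTOM, ZONE_TOP = 1.5, 3.5  # feet: league-ish median values

def in_zone(px, pz):
    return abs(px) <= ZONE_HALF_WIDTH and ZONE_BOTTOM <= pz <= ZONE_TOP

def miscall_rates(pitches):
    """Return (fstrike_rate, fball_rate) over called pitches.

    fstrike: called strike outside the zone ("fake strike")
    fball:   called ball inside the zone
    """
    fstrikes = fballs = called = 0
    for px, pz, call in pitches:  # call: 'S' (called strike) or 'B' (ball)
        called += 1
        if call == 'S' and not in_zone(px, pz):
            fstrikes += 1
        elif call == 'B' and in_zone(px, pz):
            fballs += 1
    return fstrikes / called, fballs / called

def park_factor(home_pitches, away_pitches):
    """One plausible PF: ratio of ball-vs-strike miscall tendencies,
    home park over away parks. Below 1 is pitcher-friendly (extra
    strikes at home), above 1 is batter-friendly (extra balls)."""
    fs_h, fb_h = miscall_rates(home_pitches)
    fs_a, fb_a = miscall_rates(away_pitches)
    return ((1 + fb_h) / (1 + fs_h)) / ((1 + fb_a) / (1 + fs_a))
```

A park whose home sample contains extra out-of-zone called strikes, relative to the same team's road sample, comes out below 1, matching the "low is pitcher-friendly" convention above.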
You can see a huge spread in how likely a pitch was to be mis-called based on park in 2009. 2007 and 2008 were no better.
Here are the unweighted three year park effects for each team's stadium:
The correlation between 2008 and 2009 is a fairly robust 0.36, which suggests there's at least some actual phenomenon here.
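That year-to-year check is just a Pearson correlation between each park's factor in consecutive seasons. A hand-rolled version, using made-up park factors rather than the article's actual values:

```python
def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical per-park factors for two seasons, NOT the article's data.
pf_2008 = [0.95, 1.02, 1.10, 0.98, 1.05]
pf_2009 = [0.97, 1.00, 1.08, 1.01, 1.04]
r = pearson(pf_2008, pf_2009)
```

A materially positive `r` across all 30 parks is what suggests a persistent park-level effect rather than season-to-season noise.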
What might cause such a discrepancy from park to park? You wouldn't imagine the strike zone would vary based on the park. It's a fairly static thing, not subject to the placement of the outfield fences, the size of the outfield, or even the length of the infield grass.
So if the cause is unlikely to be one of the normal factors that influence park effects, what might be some less obvious reasons for the difference?
According to Alan Nathan, the camera position differs from park to park, which might not be completely corrected for by Sportvision's software.
Maybe there's something to the hitter's background at certain parks that affects the umpire's ability to call balls and strikes. If that were the case, you'd think we'd have heard some complaints at some point.
It's also possible that the assignment of umpires doesn't wash out when looking at the results of a single season. Perhaps Texas sees more than its fair share of pitcher-friendly umpires. This option seems unlikely, though, since we see similar results from season to season, and any unintentional umpire scheduling bias wouldn't be likely to carry across multiple years.
There's a slim chance that certain pitchers were more likely to pitch at home than on the road, which could throw the numbers off. But that would be unlikely to occur for all teams, and, again, it would likely be a single-year effect. The same goes for catchers.
Unfortunately, since I don't have a better explanation, I'm tending to believe the first one - something is different about camera placement from park to park, and it's affecting how pitches are recorded.
It's just conjecture at this point, and I'd love for someone from Sportvision to tell me I'm wrong, but I'm a little nervous about the correctness of the Pitch F/X coordinates. If we can't count on those from park to park, then a lot of studies need to be questioned.
We've known for a while that Pitch F/X data needed to be corrected for park. It's one of the things Josh Kalk was working on before being hired by Tampa Bay. But my understanding was that this was mainly for the pitcher's side of things (release point, etc.) and that the values at the plate were correct to within a fraction of an inch.
I'm less convinced that's true now. Admittedly, I don't have a whole lot of evidence; it's more of a gut feeling that something is not right with these results.
It's clearly of great importance to the sabermetric community to be able to trust the Pitch F/X numbers. And right now, my faith is a little bit shaken.