After my last post about catcher framing, many people (at least two or three) suggested that the pitching staff had a fair amount to do with the results we were seeing. The poor performance of both of the Rangers' catchers (Gerald Laird and Jarrod Saltalamacchia) seemed to lend credence to that stance. I started wondering what factors might influence an umpire to call a pitch a certain way. So I retreated into my stats cave (no, it's not my mother's basement) for a while to see what I could find out. I'll warn you, this is a very long post, so if you want to skip to the summary at the end, feel free, but you'll miss the graphs.
Before I start examining the different factors, let me outline the methodology. It's the same basic approach as the catcher framing article, which in turn is based on Jonathan Hale's article at The Hardball Times. Each called pitch is compared to the strike zones identified by John Walsh to see whether it was "mistakenly" called. I put "mistakenly" in quotes because Walsh marked the boundaries of the strike zones where 50% of the pitches were called strikes, so it's expected that some calls will not match. Walsh provides values for the height and width of the strike zone for both left and right handed batters and compares them to the rulebook strike zone for the average batter. I used his values for the strike zone widths, but calculated my own for the strike zone height of each batter, based on his deltas from the rulebook strike zone. I used absolute differences (2.2 inches on the top for right handed batters, for example) rather than percentage differences.
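The classification step above can be sketched in a few lines of Python. This is a minimal illustration, not the actual code used for the study: the half-width values and the zone-top delta are placeholders standing in for Walsh's published numbers, and the function and parameter names are my own.

```python
# Sketch of the pitch-classification step. Zone half-widths here are
# hypothetical placeholders, not Walsh's actual published values.

def in_zone(px, pz, batter_hand, sz_top, sz_bot):
    """Return True if the pitch location falls inside the empirical zone.

    px, pz: horizontal/vertical pitch location at the plate (feet)
    sz_top, sz_bot: this batter's rulebook zone top/bottom, adjusted by
    a fixed delta (e.g. +2.2 inches on top for right-handed batters).
    """
    half_width = 0.95 if batter_hand == "R" else 1.00  # placeholder widths
    return abs(px) <= half_width and sz_bot <= pz <= sz_top

def mistaken_call(px, pz, batter_hand, sz_top, sz_bot, call):
    """A 'miss' is a called strike outside the zone or a called ball inside it."""
    strike_zone = in_zone(px, pz, batter_hand, sz_top, sz_bot)
    return (call == "S") != strike_zone
```

Because the zone boundaries are drawn where only 50% of pitches are called strikes, a fair number of `mistaken_call` hits are expected even from a perfectly ordinary umpire.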
Anyway, once I classified all the pitches, I credited the pitcher (or category of pitchers) for all additional strikes and debited him for all additional balls to find how many total "misses" happened. I then calculated the average percentage of misses across all pitches and determined how each pitcher differed from his expected value. Those differences were normalized to 150 opportunities, or roughly the number of called pitches in one game. Finally, that number was converted to runs per 150 called pitches by multiplying by .161 runs, which is the value of changing a ball to a strike. This allowed me to compare different factors that might influence how a pitch was called and see how important each one might be.
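The crediting and conversion step can be sketched as follows. This assumes `calls` is a list of `(call, in_zone)` pairs for one pitcher (a representation I've made up for illustration), and the league-average baseline passed in would come from the full sample; the `.161` run value is the one quoted above.

```python
RUN_VALUE = 0.161  # runs gained by turning a called ball into a called strike

def runs_per_150(calls, baseline=0.0):
    """Net extra strikes (credits) minus extra balls (debits), expressed as a
    rate relative to the league baseline, scaled to 150 called pitches and
    converted to runs. baseline=0.0 is a placeholder; the study uses the
    average net miss rate across all pitches."""
    credits = sum(1 for call, in_zone in calls if call == "S" and not in_zone)
    debits = sum(1 for call, in_zone in calls if call == "B" and in_zone)
    net_rate = (credits - debits) / len(calls)
    return (net_rate - baseline) * 150 * RUN_VALUE
```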
Game Specific Factors
Certain characteristics of the game might play a part in how an umpire calls pitches. Specifically, I looked at which inning the call occurred in and which team was at bat. Another possible area to explore would be day games versus night games, but I didn't have the data handy to look into that.
|Inning||Called Pitches||Runs / 150 Pitches|
The most interesting piece of information here is the (relatively) large assist given to batters in extra innings. It's almost like the umpires want to go home and unconsciously give batters the benefit of the doubt so as to increase the chances of scoring a run. Of course the effect is pretty small - a difference of one run every 45 innings or so, but it's almost twice as large as any other inning. The fact that a team is more likely to use the back of their bullpen in extra innings could play a role too.
Home or Visitor
|Batting Team||Called Pitches||Runs / 150 Pitches|
A mild advantage to the home pitchers (when the visiting team is batting) - to the tune of one run every 8 games. I expected the home team pitchers to get some favorable calls, but wasn't sure of the impact.
Pitcher Specific Factors
How could a pitcher influence whether close pitches are called strikes? I separated the possibilities into two main categories - demographics and performance. Under demographics, I looked at age, experience and handedness, and for performance I looked at runs allowed, walks allowed and early-game wildness. I know that age and experience are close analogs but I thought I'd see if there were major differences between the two. More on that later.
|Age||Called Pitches||Runs / 150 Pitches|
The numbers jump around a bit, and the sample size for some of the individual years leaves a lot to be desired, but in general, the older you are, the more love you get from the umpires. The effect is much clearer if we look just at age buckets:
|Age||Called Pitches||Runs / 150 Pitches|
Pitchers over 35 see almost half a run per game benefit compared to pitchers under 25. There are some concerns with this. There's likely a selection bias as the pitchers who make it to 35 tend to be the better ones, so that might influence how the umpires rule. It's also possible that the determining factor isn't age, but experience. Let's look at that one next.
|Experience||Called Pitches||Runs / 150 Pitches|
There are a few strange negative bumps at 13 and 15 years, but the samples are pretty small there. In general the trend is upwards, and it appears more strongly than with age. Let's look at the buckets of experience.
|Years of Experience||Called Pitches||Runs / 150 Pitches|
I think these understate the actual value of being a long-time pitcher - mostly because a large portion of the sample is in the set of pitchers with 8 years of experience. I'll admit these buckets don't match up well with the age buckets I introduced above, so the comparison is not as easy to make as I might hope.
|Hand||Called Pitches||Runs / 150 Pitches|
An advantage of 1 run for every 8 games or so to righties. I'm not sure why righties might get more favorable treatment. It could have something to do with the direction the pitches break, or maybe it's related to handedness of the batter - where righty pitchers face more left-handed batters than lefties do. Whatever the cause, I'm going to chalk it up to unexplained variation (Mike Emeigh recently suggested that term as opposed to random variation, and I think it's a good idea).
|RA/9||Called Pitches||Runs / 150 Pitches|
|Less Than 4.00||30491||-0.01|
|More Than 6.00||20086||-0.25|
First off, let me make it clear that those runs allowed numbers are for a pitcher's career before 2007, while the runs / 150 pitches are for the 2007 season. There isn't a whole lot of variation between the low RA and the medium RA buckets. However, the high RA bucket loses a quarter of a run per game, or -.03 points of ERA in 2007 to umpires' calls. That suggests that umpires might be buying into a pitcher's bad reputation and reinforcing it. Interestingly, there doesn't appear to be a corresponding positive bump on the other side.
|BB/9||Called Pitches||Runs / 150 Pitches|
|Less Than 2.50||21241||0.18|
|More Than 4.50||24540||-0.22|
Again, the walk rate numbers are pre-2007 while the runs / 150 pitches are from 2007. In this case we see that reputation appears to play a part in both directions. Pitchers who previously had low walk rates benefit from the umpires' calls, while those with high walk rates suffer. The difference between the two is .4 runs per game, or .04 points of ERA. It's important to realize the walk rate is probably not independent from the rate of runs allowed - in other words, a pitcher who has a high walk rate is likely to give up a lot of runs - so these differences can't just be added together to figure out what part reputation plays in skewing the results.
|Wildness||Called Pitches||Runs / 150 Pitches|
|Low (Less Than 33%)||14704||-0.01|
|Mid (33% - 42%)||23160||0.08|
|High (Greater Than 42%)||11431||-0.15|
This one needs some explanation. What I tried to do was identify how wild each starter was in the first two innings of each 2007 start by looking at the percentage of balls he threw. I then grouped them into low, medium and high buckets, and ran my standard analysis. This shows much the same result (although less pronounced) as the runs allowed case - in that the high bucket is hurt, but there's little effect on the low bucket. Basically, it appears that being wild early tarnishes you in the eyes of the umpire, while having good control early doesn't really do much. Again, there's going to be some causal overlap with this measure and the walk rate and runs allowed metrics, so keep that in mind.
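The bucketing just described can be sketched directly. The 33% and 42% cutoffs are the ones from the table above; the function and parameter names are my own.

```python
def wildness_bucket(balls_first_two, pitches_first_two):
    """Classify a starter's early-game wildness by the percentage of balls
    thrown in the first two innings, using the post's cutoffs."""
    ball_pct = 100.0 * balls_first_two / pitches_first_two
    if ball_pct < 33:
        return "low"
    elif ball_pct <= 42:
        return "mid"
    return "high"
```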
So what does this all mean? Very good question. Again, I'm not sure. Since we only have one (partial) season to go on, it's hard to be that confident in the data. I think the sample sizes are pretty good for many of the buckets used in this study, but that's a gut feel rather than anything confirmed mathematically. In a moment, I'll share the results of a regression I ran for individual pitchers based on some of these metrics, but first let's look at a summary chart of all the different ways of breaking this down. The ERA effect is the difference in ERA between the "good" group in the sample (low walk rate, low RA, etc.) and the "bad" group (high walk rate, high RA).
|Factor||"Good" Group||"Bad" Group||ERA Effect|
|Age (Buckets)||Over 35||Under 25||.05|
|Experience (Buckets)||Over 8||Under 2||.04|
|Runs Allowed||4.00-6.00||More Than 6.00||.03|
|Walk Rate||Less Than 2.5||More Than 4.5||.04|
Again, please keep in mind these aren't independent so cannot be added together to get a total effect. There's nothing that's all that big here, at least compared to what I was finding for catchers (a spread of over a run per game).
Now let's get to that regression. I took all pitchers from 2007 who had over 300 pitches called (174 pitchers) and regressed the runs / 150 pitches twice - once looking at handedness, age, runs allowed and walk rate, and the second time replacing age with experience. Neither case explained more than .10 of the total variation, so I'm not even going to bother sharing the actual values. If anyone really wants to know, ask and I'll share.
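For the curious, the shape of that regression looks something like the sketch below. The predictors match the ones named above (handedness, age, career RA/9, career BB/9), but the data here are random placeholders, not the actual 174-pitcher sample, so the resulting R² is meaningless except as a demonstration of the calculation.

```python
# Ordinary least squares via NumPy; placeholder data, not the real sample.
import numpy as np

rng = np.random.default_rng(0)
n = 174
X = np.column_stack([
    np.ones(n),                 # intercept
    rng.integers(0, 2, n),      # handedness (1 = right-handed)
    rng.uniform(21, 40, n),     # age
    rng.uniform(3.0, 7.0, n),   # career RA/9
    rng.uniform(1.5, 5.5, n),   # career BB/9
])
y = rng.normal(0, 0.3, n)       # runs / 150 called pitches

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta
r_squared = 1 - resid.var() / y.var()
```

With predictors this weakly related to the outcome, an R² under .10 - as reported above - means roughly 90% of the pitcher-to-pitcher variation is left unexplained by these factors.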
I'm sure there are plenty of other things that might influence an umpire to call pitches a certain way for a given pitcher. A few that I didn't get a chance to look at for various reasons were GB/FB ratio and the pitcher's arsenal. Plus it's going to be nearly impossible to untangle the effect of the catcher and umpire until we have more data. But I think there's still value in examining these characteristics and finding the limits of their effects, even if the information is not conclusive.