Pitcher influence on line drives, revisited; also clutchness
Ever since Voros came out with DIPS at the beginning of last decade, a pitcher's influence on BABIP has been a hotly debated topic. In 2005, David Gassko revisited DIPS and BABIP, breaking down BIP into component FBs, GBs, and LDs. He concluded that although pitchers could influence FB and GB rates, they had little to no influence over LD-rate. John Beamer brought evidence to bear that this wasn't entirely true and that pitchers had a limited, albeit difficult-to-detect, influence on line drive rate. Gassko countered that DIPS 3.0 explains the expected amount of variance in BABIP (I think Gassko misinterprets our original paper regarding BABIP variance, however).
Here, I present evidence that pitchers actually have a very significant amount of control over LD-rate, and that it is detectable in two ways: by comparing per-start LD rates from one season to the next, and situationally, using aggregate career data.
I wrote an article about Jonathan Sanchez's pitching progression across three seasons, 2008-2010. In it, I presented the following graph:
What jumps out immediately is the jump in FB rate from 2008 to 2009, accompanied by a corresponding decrease in LD rate. Indeed, Sanchez's LD rate drops from 21.4% to 16.2% while his FB rate jumps from 37.5% to 43.1%. According to DIPS 3.0 theory, the LD-rate should regress to league average, and the FB-rate change should be expressed entirely in terms of the FB/GB ratio. I argue that the drop in LD-rate is statistically significant and reflects a true change in talent level. A t-test on the per-start data indicates that the likelihood that the 2008 LD rates per start are drawn from the same distribution as the 2009 LD rates per start is minuscule (p<0.01). The data points for the t-test were created by manually measuring the height of the LD-rate point for each start on the graph. Ideally, the raw data would be used, but I don't know how to access that data on FanGraphs.
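For readers who want to try the per-start comparison themselves, here is a minimal sketch in Python using only the standard library. It computes Welch's t statistic for two seasons of per-start LD rates and, because the standard library has no t distribution, estimates the p-value with a label-shuffling permutation test rather than the textbook t-test. The function names and the per-start numbers are placeholders invented for illustration; they are not Sanchez's actual starts and not the values measured off the graph.

```python
import random
import statistics
from math import sqrt


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / sqrt(var_a / len(a) + var_b / len(b))


def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """Two-sided p-value: share of label shuffles whose |t| is at least the observed |t|."""
    rng = random.Random(seed)
    observed = abs(welch_t(a, b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        if abs(welch_t(pooled[:len(a)], pooled[len(a):])) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_resamples + 1)


# PLACEHOLDER per-start LD rates (%), made up for illustration only.
ld_2008 = [25.0, 18.2, 22.7, 30.0, 15.4, 21.1, 26.3, 19.0, 23.5, 17.6]
ld_2009 = [14.3, 20.0, 11.8, 16.7, 22.2, 9.5, 18.8, 13.3, 15.0, 17.4]

t_stat = welch_t(ld_2008, ld_2009)
p_value = permutation_p_value(ld_2008, ld_2009)
print(f"Welch t = {t_stat:.2f}, permutation p = {p_value:.4f}")
```

One caveat with any per-start version of this test: each start counts equally no matter how many balls in play it had, so a short outing with a handful of batted balls carries as much weight as a complete game.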
So why haven't we noticed this before? And aren't I just cherry-picking one pitcher? I posit that the overall noisiness of LD-rate data obscures the effect, and that we also lose a significant amount of chronological information when we discard what happens to LD-rate per start and aggregate it into one binomial sample representing the whole season. A t-test comparing 21.4% to 16.2% would never come out significant, but breaking the data down and binning by start does. Yes, I am cherry-picking one starter among many. If anyone can tell me how to more easily access the raw per-start GB/FB/LD rates at FanGraphs, I would love to know, and I'll be able to look at more aggregate data. Until then, let's look at some more hand-scraped data from FanGraphs.
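To make the season-level comparison concrete, here is a rough sketch of the aggregate version: each season is treated as a single binomial sample and the two LD rates are compared with a pooled two-proportion z-test. The 21.4% and 16.2% figures are from the post; the balls-in-play counts are hypothetical round numbers, since the post doesn't give them, and the function name is mine.

```python
from math import erf, sqrt


def two_proportion_z_test(p1, n1, p2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Standard normal CDF via the error function.
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)


# LD rates are from the post; the balls-in-play counts are HYPOTHETICAL.
for n_bip in (100, 200, 400):
    z, p = two_proportion_z_test(0.214, n_bip, 0.162, n_bip)
    print(f"BIP per season = {n_bip}: z = {z:.2f}, two-sided p = {p:.3f}")
```

The outcome depends heavily on how many balls in play go into each season's rate, so the real counts would be needed before drawing any conclusion from the aggregate test.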
If pitchers do indeed have control over LD-rates, wouldn't you expect them to exercise it within a game? Line drives lead to doubles and homers. Most dominant pitchers don't mind a line drive or two early in the game, but late in the game with the game on the line, they wouldn't want to give one up, especially with men on base. This would be a measure of a pitcher's clutchness, which has hitherto existed mostly as a mythological beast, hunted by sabermetricians everywhere.
I took the top 5 pitchers by WAR over the last three years and recorded their career LD% and K/9 in low, medium, and high leverage situations.
| | LD-rate, low LI | LD-rate, med LI | LD-rate, hi LI | K-rate, low LI | K-rate, med LI | K-rate, hi LI |
|---|---|---|---|---|---|---|
| | 19.2 | 19.0 | 17.9 | 7.29 | 6.50 | 5.73 |
| | 20.0 | 20.2 | 16.9 | 7.22 | 6.78 | 5.99 |
| | 19.1 | 19.4 | 16.7 | 10.56 | 9.85 | 9.05 |
| | 20.6 | 21.0 | 16.8 | 7.87 | 7.47 | 6.24 |
| | 19.3 | 20.7 | 18.3 | 8.09 | 7.13 | 6.41 |

Single-tailed ANOVA, p<0.001
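For anyone who wants to check the leverage comparison, here is a minimal sketch of a one-way ANOVA on the LD-rate columns, in Python with the standard library only. The post doesn't say exactly how the ANOVA was set up, so this assumes the simplest version: three groups (low, medium, and high leverage), with the five pitchers treated as independent observations in each group. The function name is mine, and the closed-form p-value only holds for exactly three groups (numerator degrees of freedom of 2).

```python
from statistics import mean


def one_way_anova_three_groups(groups):
    """One-way ANOVA for exactly three groups; returns (F, p-value)."""
    if len(groups) != 3:
        raise ValueError("closed-form p-value below assumes exactly three groups")
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = 2
    df_within = len(all_values) - 3
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # With 2 numerator degrees of freedom, the F distribution's upper tail
    # has the closed form (1 + 2F/d2) ** (-d2/2).
    p_value = (1 + 2 * f_stat / df_within) ** (-df_within / 2)
    return f_stat, p_value


# LD-rate (%) by leverage, one value per pitcher, copied from the table.
ld_low = [19.2, 20.0, 19.1, 20.6, 19.3]
ld_med = [19.0, 20.2, 19.4, 21.0, 20.7]
ld_high = [17.9, 16.9, 16.7, 16.8, 18.3]

f_ld, p_ld = one_way_anova_three_groups([ld_low, ld_med, ld_high])
print(f"LD-rate by leverage: F = {f_ld:.2f}, p = {p_ld:.5f}")
```

This version ignores the fact that the same five pitchers appear in every leverage bucket. A repeated-measures design that pairs each pitcher with himself would be the more natural fit, and it matters most for K-rate, where the spread between pitchers is much larger than the spread between leverage buckets.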
Seems like the old adage is true: when the going gets tough, the tough get clutchy. The top pitchers seem to trade K-rate in order to decrease LD-rate. There may be selection bias involved in the K-rate: pitchers get into more jams when their stuff isn't as sharp (reflected in their K-rate), so high-leverage situations will naturally show a lower K-rate. As a group, though, they do seem to purposely suppress LD-rate in high-leverage situations.
Again, I'd love to run a more comprehensive analysis, if I didn't have to manually copy the data off the FanGraphs web interface.
Comments
Interesting work.
On the last part though, I think that’s just as attributable to hitters trying to avoid the K and chasing pitches they might otherwise normally take. I’m not entirely sure why pitchers would want to trade K rate for LD rate anyway. I assume they just want to get the guy out, and avoiding the “productive out” via a K would be a plus.
Don't have time to really get into this, but...
"According to DIPS 3.0 theory, the LD-rate should regress to league average, and the FB-rate change should be expressed entirely in terms of the FB/GB ratio."
That’s an absolutely terrible way of looking at it. You can’t regress FB rate independent of LD%.
Agreed, but ...
I think Gassko meant that FB/GB could be treated as the predictive skill, and LD-rate would always go to league avg.
You’ve cherry-picked pitchers, like you said. For this to be true, there would need to be some repeatability to this skill. There is not. The sum of variances approach that John Beamer takes in his article neglects the obvious effect of scorer bias on line drives. In my work, I’ve found that pretty much any persistence in LD% for pitchers is attributable to scorer bias, and that matches up with what Gassko found too.
To expand on the cherry-picking issue, a t-test says “what are the odds that a randomly selected pitcher will be this far from the mean by chance?” P=.01 means that about 1 out of every 100 pitchers will be that far from the mean by chance alone, so statistically there have to be pitchers to cherry-pick.
