
Pitcher Influence on line drives, revisited, also clutchness.

Ever since Voros McCracken came out with DIPS at the beginning of last decade, a pitcher's influence on BABIP has been a hotly debated topic. In 2005, David Gassko revisited DIPS and BABIP, breaking balls in play (BIP) down into fly balls (FB), ground balls (GB), and line drives (LD). He concluded that although pitchers could influence FB and GB rates, they had little to no influence over LD-rate. John Beamer brought evidence to bear that this wasn't entirely true, and that pitchers had a limited, albeit difficult to detect, influence on line drive rate. Gassko countered that DIPS 3.0 explains the expected amount of variance in BABIP (I think Gassko misinterprets our original paper regarding BABIP variance, however).

Here, I present evidence that pitchers actually have a very significant amount of control over LD-rate, and that this control is detectable in two ways: in season-to-season comparisons of per-start data, and situationally, in aggregate career data.


I wrote an article about Jonathan Sanchez's pitching progression across three seasons, 2008-2010. In it, I presented the following graph:

[Graph: Sanchez's batted-ball rates by start, via www.fangraphs.com]

What jumps out immediately is the sharp rise in FB-rate from 2008 to 2009, accompanied by a matching drop in LD-rate. Sanchez's LD-rate falls from 21.4% to 16.2% while his FB-rate climbs from 37.5% to 43.1%. According to DIPS 3.0 theory, the LD-rate should regress to league average, and the FB-rate change should be expressed entirely in terms of the FB/GB ratio. I argue instead that the drop in LD-rate is statistically significant and reflects a true change in talent level. A t-test on the per-start data confirms that the likelihood that the 2008 and 2009 per-start LD-rates are drawn from the same distribution is minuscule (p<0.01). The data points for the t-test were created by manually measuring the height of the LD-rate point for each start on the graph. Ideally the raw data would be used, but I don't know how to access it on Fangraphs.
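As a rough sketch of the per-start comparison described above: the post does not say which t-test variant was used, so this assumes Welch's unequal-variance test. The per-start values below are hypothetical placeholders, not Sanchez's measured data, and the helper name `welch_t` is made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean
    va = variance(sample_a) / na
    vb = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


# HYPOTHETICAL per-start LD rates (%), one value per start.
# These are illustrative placeholders, not values read off the graph.
ld_2008 = [24.0, 18.5, 22.0, 25.5, 19.0, 21.0, 23.5, 17.5]
ld_2009 = [15.0, 18.0, 12.5, 17.0, 19.5, 14.0, 16.5, 17.0]

t_stat, dof = welch_t(ld_2008, ld_2009)
print(f"t = {t_stat:.2f} on about {dof:.1f} degrees of freedom")
```

The t statistic would then be compared against a t distribution with the reported degrees of freedom to get a p-value. Each start counts as one observation here, which is what lets the start-to-start spread enter the test.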

So why haven't we noticed this before? And am I not just cherry-picking one pitcher? I posit that the overall noisiness of LD-rate data obscures the effect, and that we lose a significant amount of chronological information when we discard what happens to LD-rate in each start and aggregate it into a single binomial proportion for the whole season. A t-test comparing just 21.4% to 16.2% would never come out significant, but breaking the data down and binning it by start does. Yes, I am cherry-picking one starter among many. If anyone can tell me how to more easily access the raw GB/FB/LD rates by start at Fangraphs, I would love to know, and I'll be able to look at more aggregate data. Until then, let's look at some more hand-scraped data from Fangraphs.

If pitchers do indeed have control over LD-rates, wouldn't you expect them to exercise it within a game? Line drives lead to doubles and homers. Most dominant pitchers don't mind a line drive or two early in the game, but late, with the game on the line, they wouldn't want to give one up, especially with men on base. This would be a measure of the pitcher's clutchness, which has hitherto existed mostly as a mythological beast, hunted by sabermetricians everywhere.

I took the top 5 pitchers by WAR over the last three years and recorded their career LD% and K/9 in low, medium, and high leverage situations.

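| Pitcher | LD-rate (%), low LI | LD-rate (%), med LI | LD-rate (%), hi LI | K-rate (K/9), low LI | K-rate (K/9), med LI | K-rate (K/9), hi LI |
|---|---|---|---|---|---|---|
| Roy Halladay | 19.2 | 19 | 17.9 | 7.29 | 6.5 | 5.73 |
| Cliff Lee | 20 | 20.2 | 16.9 | 7.22 | 6.78 | 5.99 |
| Tim Lincecum | 19.1 | 19.4 | 16.7 | 10.56 | 9.85 | 9.05 |
| Zack Greinke | 20.6 | 21 | 16.8 | 7.87 | 7.47 | 6.24 |
| CC Sabathia | 19.3 | 20.7 | 18.3 | 8.09 | 7.13 | 6.41 |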

Single-tailed ANOVA, p<0.001
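For readers who want to check a test like this against the table, here is a minimal sketch. The post does not state the ANOVA design, so this assumes a plain one-way layout with leverage as the only factor, applied to the LD-rate columns. It ignores the fact that the same five pitchers appear in every group, which a repeated-measures design would account for. The function name `one_way_anova_f` is made up for illustration.

```python
from statistics import mean

# LD-rate (%) by leverage index, copied from the table above
# (order: Halladay, Lee, Lincecum, Greinke, Sabathia)
ld_rate = {
    "low": [19.2, 20.0, 19.1, 20.6, 19.3],
    "med": [19.0, 20.2, 19.4, 21.0, 20.7],
    "hi": [17.9, 16.9, 16.7, 16.8, 18.3],
}


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_b, df_w = one_way_anova_f(list(ld_rate.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic is then looked up against an F distribution with those degrees of freedom to obtain the p-value.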



Seems like the old adage is true: when the going gets tough, the tough get clutchy. The top pitchers seem to trade K-rate for a lower LD-rate. There may be selection bias in the K-rate, though: pitchers get into more jams when their stuff isn't as sharp (which shows up in their K-rate), so high-leverage situations will naturally carry a lower K-rate. Collectively, as a group, they do seem to purposely suppress LD-rate in high-leverage situations.
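The trade-off can be made concrete by taking each pitcher's high-leverage figure minus his low-leverage figure. This is only a descriptive sketch using the table values. The helper name `hi_minus_low` is made up for illustration, and nothing here addresses the selection-bias caveat.

```python
from statistics import mean

# (low, med, hi) leverage splits copied from the table above
ld_rate = {
    "Roy Halladay": (19.2, 19.0, 17.9),
    "Cliff Lee": (20.0, 20.2, 16.9),
    "Tim Lincecum": (19.1, 19.4, 16.7),
    "Zack Greinke": (20.6, 21.0, 16.8),
    "CC Sabathia": (19.3, 20.7, 18.3),
}
k_rate = {
    "Roy Halladay": (7.29, 6.50, 5.73),
    "Cliff Lee": (7.22, 6.78, 5.99),
    "Tim Lincecum": (10.56, 9.85, 9.05),
    "Zack Greinke": (7.87, 7.47, 6.24),
    "CC Sabathia": (8.09, 7.13, 6.41),
}


def hi_minus_low(splits):
    """Per-pitcher change from low to high leverage."""
    return {name: hi - low for name, (low, _med, hi) in splits.items()}


ld_change = hi_minus_low(ld_rate)
k_change = hi_minus_low(k_rate)

for name in ld_rate:
    print(f"{name:13s} LD {ld_change[name]:+.1f}  K/9 {k_change[name]:+.2f}")

print(f"mean LD change:  {mean(list(ld_change.values())):+.2f}")
print(f"mean K/9 change: {mean(list(k_change.values())):+.2f}")
```

Looking at the sign of each pitcher's change, not only the group mean, shows whether the pattern holds for every pitcher or is driven by one or two of them.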

Again, I'd love to run a more comprehensive analysis if I didn't have to manually copy the data off the Fangraphs web interface.


Comments


Interesting work.

On the last part though, I think that’s just as attributable to hitters trying to avoid the K and chasing pitches they might otherwise normally take. I’m not entirely sure why pitchers would want to trade K rate for LD rate anyway. I assume they just want to get the guy out, and avoiding the “productive out” via a K would be a plus.

by tarlinian on Jan 15, 2011 4:22 AM EST

Don't have time to really get into this, but...
“According to DIPS 3.0 theory, the LD-rate should regress to league average, and the FB-rate change should be expressed entirely in terms of the FB/GB ratio.”

That’s an absolutely terrible way of looking at it. You can’t regress FB rate independent of LD%.

by cwyers on Jan 15, 2011 4:10 PM EST

Agreed, but ...

I think Gassko meant that FB/GB could be treated as the predictive skill, and LD-rate would always go to league avg.

by Nivra on Jan 16, 2011 2:47 AM EST

You’ve cherry-picked pitchers, like you said. For this to be true, there would need to be some repeatability to this skill. There is not. The sum of variances approach that John Beamer takes in his article neglects the obvious effect of scorer bias on line drives. In my work, I’ve found that pretty much any persistence in LD% for pitchers is attributable to scorer bias, and that matches up with what Gassko found too.

To expand on the cherry-picking issue, a t-test says “what are the odds that a randomly selected pitcher will be this far from the mean by chance?” P=.01 means that 1 out of every 100 pitchers will be that far from the mean, so statistically there have to be pitchers to cherry-pick.

by Matt Swartz on Jan 18, 2011 10:45 AM EST
