Batted balls and lady luck

First of all I'd like to thank Marc for extending an invitation to me to become a regular contributor at BtB. I have actually, and almost unbelievably, followed Marc's work since his days at Baseball Rants - don't worry, he couldn't believe it either! I'm hoping to post an article at least every two weeks or so. In fact, all my sabermetric articles will appear here in future. Enough of the pleasantries - let's kick off the inaugural post.

Sabermetricians love data. The beauty of baseball is that there is plenty (dare I say, too much?) of it to get our stat-obsessed claws into. Over the last few years a new type of data has come to the fore: batted balls. Simply put, during every play of every game, some poor soul decides whether a ball in play is a groundball, pop-up, line drive or flyball. Fortunately for you and me, this glut of information allows us to enhance our understanding of pitcher-batter confrontations.

Questions such as "do flyball pitchers have a higher K/9 rate than groundball pitchers?" or "are pitchers who induce pop-ups more susceptible to the longball?" are no longer the subject of idle speculation, but rather fall into the realm of quantitative analysis. A pet project of mine over the last few months has been to use batted ball data to better understand the ability of players - both batters and pitchers - to control balls in play. To avoid droning on for too long, and to give you a chance to read some of the other excellent posts on this site, I'll focus on pitching for the rest of this article.

To start with, here is an easy question: do groundball pitchers exist? In other words, if I take a bunch of pitchers with strong groundball tendencies in year 1, will they continue to make plenty of groundouts in year 2? Absolutely. Look at the chart below which plots 2004 groundball in play (GBIP) rate against 2005 GBIP rate for all pitchers with more than 40 BIP in both years.



The r^2 is a shade over 0.5, meaning that about 50% of the variance in 2005 GBIP is explained by 2004 performance. You don't need to be Copernicus to see that getting hitters to ground out is a skill.
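For anyone who wants to reproduce this kind of year-to-year check, here is a minimal sketch of the r^2 calculation. The function name and the idea of passing two equal-length lists of rates (one entry per pitcher, same order in both years) are my assumptions for illustration, not the exact spreadsheet behind the chart.

```python
def r_squared(year1_rates, year2_rates):
    """Squared Pearson correlation between two paired lists of rates."""
    n = len(year1_rates)
    if n != len(year2_rates) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 pitchers")
    mean1 = sum(year1_rates) / n
    mean2 = sum(year2_rates) / n
    # Sums of squares and cross-products around each year's mean.
    sxy = sum((a - mean1) * (b - mean2) for a, b in zip(year1_rates, year2_rates))
    sxx = sum((a - mean1) ** 2 for a in year1_rates)
    syy = sum((b - mean2) ** 2 for b in year2_rates)
    if sxx == 0 or syy == 0:
        raise ValueError("rates must vary within each year")
    return (sxy * sxy) / (sxx * syy)
```

Feed it each pitcher's 2004 and 2005 GBIP rates and it returns the share of 2005 variance that a straight-line fit on 2004 accounts for.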

Now let's switch our attention to line drives in play (LDIP). Again, as for groundballs, you might expect some pitchers to have an ability to prevent LDIP. Put me and Johan Santana on the mound for a couple of innings and you won't need a calculator to work out who'll have a higher LDIP. Enough speculation, let's take a look at the data:



Whoa - what's going on? There is no correlation whatsoever (r^2: 0.01). Does this mean that giving up line drives is just luck? Hmm, this is contrary to what we first thought, so it's worth probing a bit deeper.

The problem could well be with the correlation technique. Year-to-year correlations can be unreliable if either:

  1. the sample size is too low
  2. the data contains a number of sub-groups with different means, weakening the correlation
  3. variation in skill between different players is small and undetectable
  4. some combination of the above

Let's take a closer look. The sample size is 369 - definitely on the small side, but presidential elections have been decided on less. There is a fair chance that the random noise in two years of data masks any skill. I'd definitely want to see three to four years of data before I was comfortable with the LDIP conclusion. Now, we could use data from BIS (Baseball Information Solutions), which has tracked LDIP for the past four years, but there are problems with batted ball codings which, I am led to believe, make the data unreliable.

OK, time for a change of tack. Can we cut the data another way to discern the skill impact of a line drive? One option is to divide the 2004 data into, say, 6 buckets of increasing LDIP percentage, and do the same for 2005. This also helps to analyze the impact of sub-groups (note: although we haven't explicitly defined the sub-groups, the bucket approach should broadly capture them if they exist). If there were no skill involved we would expect the pitchers in a particular 2004 LDIP bucket to be evenly distributed across all the buckets in 2005. With a sample size of 369 and 36 possible 2004/2005 bucket combinations, the expected frequency for each combination is a fraction over 10. This is what we actually get:
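A rough sketch of how such a bucket table could be built is below. It assumes equal-count buckets (which is what an expected frequency of about 10 per cell implies), and the function names are made up for illustration rather than taken from the original workbook.

```python
from bisect import bisect_right


def bucket_edges(values, n_buckets):
    """Cut points that split `values` into roughly equal-count buckets."""
    ordered = sorted(values)
    return [ordered[len(ordered) * k // n_buckets] for k in range(1, n_buckets)]


def bucket_index(value, edges):
    """Index of the bucket a value falls into (0 = lowest rate)."""
    return bisect_right(edges, value)


def transition_table(year1_rates, year2_rates, n_buckets=6):
    """Count pitchers by (year-1 bucket, year-2 bucket).

    Both lists hold one rate per pitcher, in the same pitcher order.
    """
    edges1 = bucket_edges(year1_rates, n_buckets)
    edges2 = bucket_edges(year2_rates, n_buckets)
    table = [[0] * n_buckets for _ in range(n_buckets)]
    for rate1, rate2 in zip(year1_rates, year2_rates):
        table[bucket_index(rate1, edges1)][bucket_index(rate2, edges2)] += 1
    return table
```

Rows are the 2004 bucket and columns the 2005 bucket, so pitchers who stay in the same bucket land on the diagonal.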



This looks a little better. Again we are at the mercy of a small sample size, but we can see that among the elite group of LDIP pitchers (upper left part of the table, shaded blue) the frequencies are actually way greater than 10 - we could finally be detecting that elusive skill! Looking at the worst LDIP pitchers (bottom right corner, shaded gray), again the frequencies point to some sort of repeatable skill. Let's try to quantify this statistically using a chi-squared test. Plug the numbers in and we get a p-value of 0.035. Bingo! Despite the small sample size it appears that pitchers do have some control over their LDIP rate (the result is significant at the 95% confidence level).
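For readers who want to run the same kind of test on their own table, here is a minimal sketch of the chi-squared statistic for a table of counts. It uses the standard independence version (expected count = row total x column total / grand total), which may differ slightly from the flat "10 per cell" expectation described above. The function name is hypothetical, and the 37.652 cutoff is the usual tabulated 5% critical value for 25 degrees of freedom, i.e. a 6x6 table.

```python
# Tabulated 5% critical value of the chi-squared distribution, 25 degrees of freedom.
CRITICAL_5PCT_DF25 = 37.652


def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    if 0 in row_totals or 0 in col_totals:
        raise ValueError("every row and column needs at least one observation")
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count we would expect in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof
```

For a 6x6 table, a statistic above `CRITICAL_5PCT_DF25` corresponds to a p-value below 0.05. If SciPy is available, `scipy.stats.chi2_contingency` does the same job and returns the p-value directly.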

OK, I know what you are thinking. Sure, there might be a small difference in skill but so what, it is practically unmeasurable so is unimportant. Not so fast ... even a tiny variation in skill can result in a big impact if the value of the event is high enough. And line drives are valuable: an astonishing 75% fall in play and most of those go for extra bases. Because the skill element is so small we need to use a different technique to tease it out.

We know that allowing a line drive is a binomial event: every ball in play results in either a 1 (LDIP) or a 0 (no LDIP). If we add up these 1s and 0s and divide by BIP we get LDIP%. So if LDIP were 100% luck we'd expect the variance in the data to match that predicted by the binomial distribution. But we showed above that LDIP has a skill element, so actually we expect the variance to be greater than that predicted by the binomial distribution. Got it? No? Right; an example might help.

Consider two coin flippers, both of whom have the same skill level (i.e., both flip heads 50% of the time). If you work out the actual variance in their heads rates and compare it to that predicted by the binomial distribution, the two should match, give or take sampling noise. However, suppose our two coin flippers have different skill levels! If one flips heads 75% of the time and the other 25% of the time then the combined mean remains the same. But, and here it gets interesting, the variance in the flippers' heads rates is much larger than when both players had the same probability of flipping heads. Why? Because of the difference between the players' skill.
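To put numbers on the coin-flipper example, here is a small sketch that splits the expected variance of the flippers' heads rates into a luck part (binomial noise) and a skill part (spread in true probabilities). The 100 flips per player in the usage lines is an arbitrary figure chosen for illustration, and the function name is mine.

```python
def heads_rate_variance(true_probabilities, flips_each):
    """Expected variance of observed heads rates across players.

    Returns (luck, skill, total), where
      luck  = average binomial variance p * (1 - p) / n, and
      skill = variance of the true probabilities around their mean.
    """
    n_players = len(true_probabilities)
    mean_p = sum(true_probabilities) / n_players
    luck = sum(p * (1 - p) / flips_each for p in true_probabilities) / n_players
    skill = sum((p - mean_p) ** 2 for p in true_probabilities) / n_players
    return luck, skill, luck + skill


same_skill = heads_rate_variance([0.5, 0.5], flips_each=100)
different_skill = heads_rate_variance([0.75, 0.25], flips_each=100)
print(same_skill)       # luck 0.0025, skill 0.0
print(different_skill)  # luck 0.001875, skill 0.0625
```

With identical flippers all of the spread is luck; with the 75%/25% pair the skill term dwarfs the luck term even though the combined mean is still 50%.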

Right, back to baseball and LDIP. From the binomial distribution (everyone dust down their stat texts) we expect the standard deviation of LDIP to be .022 (mean: 0.193). What does the data show? .033. High fives all round, a difference of .025 (std. dev.), which is directly attributable to player skill. Hang on, don't get too carried away just yet. All we are saying is that given an average LDIP of 19.3%, approximately 67% of pitchers will have an LDIP skill between 18.2% and 20.4%. An average pitcher may have 250 BIPs a season, so we are talking a difference of 4 line drives! The skill exists, but you need to be superhuman to detect it by eye.
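A quick sketch of the arithmetic may help here, because the "difference" between .033 and .022 depends on how you take it. Variances add, standard deviations do not, so the skill spread is the square root of the difference in variances. The function names below are hypothetical, and the inputs are simply the rounded figures quoted in this post.

```python
from math import sqrt


def binomial_sd(mean_rate, balls_in_play):
    """Standard deviation of an observed rate from luck alone."""
    return sqrt(mean_rate * (1 - mean_rate) / balls_in_play)


def skill_sd(observed_sd, random_sd):
    """Spread left over after removing luck: subtract variances, then take the root."""
    skill_variance = observed_sd ** 2 - random_sd ** 2
    if skill_variance < 0:
        raise ValueError("observed spread is smaller than luck alone predicts")
    return sqrt(skill_variance)


print(round(skill_sd(0.033, 0.022), 3))  # 0.025 (difference in variances)
print(round(0.033 - 0.022, 3))           # 0.011 (naive difference in std. devs.)
```

The first number matches the .025 quoted above. The second, about one percentage point, is the half-width of an 18.2% to 20.4% band around 19.3%, so those two figures appear to come from different ways of taking the difference.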

OK, let's try and put a sense of perspective on this. If we want to predict a pitcher's LDIP for 2006 based on his 2005 data, by how much do we need to regress to the mean? A rough formula is to divide the expected population variance by the skill variance weighted by 1/BIP. This gives us 800. What we are saying is that a pitcher has to allow 800 BIP before we regress his LDIP rate only halfway to the mean. Despite the noise of the year-to-year correlation, it would appear that pitchers like Santana, Cordero and Gordon, who give up relatively few line drives, know what they are doing after all ... sort of!

[Update] I have updated the numbers in the last paragraph based on input from Andy Dolphin on my calculation of regression to the mean. This number is incredibly sensitive to BIP. The numbers above assume a weighted 1/BIP of 1/177 for 2004 and 2005 data combined.
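Here is one way the regression-to-the-mean arithmetic could be sketched. It uses a common shrinkage form (weight = BIP / (BIP + halfway point)); treating that as the calculation used here is my assumption, and the function names are made up. The inputs are the rounded figures from this post: observed standard deviation .033, mean LDIP .193 and an average 1/BIP of 1/177.

```python
def bip_for_half_regression(mean_rate, observed_sd, mean_inverse_bip):
    """BIP at which an observed rate is regressed exactly halfway to the mean."""
    per_bip_variance = mean_rate * (1 - mean_rate)
    # Luck variance expected in the sample, weighted by the average of 1/BIP.
    random_variance = per_bip_variance * mean_inverse_bip
    skill_variance = observed_sd ** 2 - random_variance
    if skill_variance <= 0:
        raise ValueError("no spread left over for skill")
    return per_bip_variance / skill_variance


def regress_to_mean(observed_rate, balls_in_play, league_mean, half_point):
    """Shrink an observed rate toward the league mean."""
    weight = balls_in_play / (balls_in_play + half_point)
    return league_mean + weight * (observed_rate - league_mean)


half_point = bip_for_half_regression(0.193, 0.033, 1 / 177)
print(round(half_point))
print(round(regress_to_mean(0.15, 250, 0.193, half_point), 3))
```

With these rounded inputs the halfway point comes out around 750 BIP, in the same neighborhood as the 800 quoted above. Swapping in 1/140 makes the skill variance vanish (the function raises an error), which is the caveat about the 2005-only sample.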

Caveat: if you just use a 2005 sample then 1/BIP is 1/140, and it bang on meets the binomial distribution.

The conclusion: there is probably a very small element of skill in line drives, but it is just that: very small.

Thank you, Andy, for your input.

Comments

OK practicalities brainstorm
How can this be used? For example, we know that groundball pitchers could be very valuable to teams that:

1) have excellent middle infielders (and to a lesser extent, corner infielders)
2) play in Colorado or Cincinnati

Now as you said, line drives are bad, they're usually hits (75%) - so preventing them is valuable. OK. But are there any places where preventing them is less valuable, or more valuable? I'm trying to think of reasons why a line drive preventer or enabler might be further enhanced by park factors. Or, if a staff is full of line drive pitchers, how to compensate with fielders? Thoughts?

by cephyn on Apr 4, 2006 2:21 PM EDT

Interesting approach
I enjoyed this article. How do you get the skill variance?

Would it take 5-6 years to find any meaningful difference in line drive rate? I think that is what you are saying but I just want to be sure.

Also, what might the run or win value be of the differences? There might only be a small difference across pitchers in line drive rate, but how many hits would that mean over the course of the year and how much run value might those hits have?

by Cyril Morong on Apr 4, 2006 2:25 PM EDT

Heteroskedastic
But doesn't this raise charges that your conclusions are based on heteroskedastic data? I'm not clear how many samples per bucket you used, but if there's 369 over 36, it doesn't seem to me like you've got enough to go around to make any kind of justifiable conclusion whatsoever.

by scareduck on Apr 4, 2006 3:48 PM EDT

Additional Comments
Can be found at Baseball Think Factory.

Thanks to Repoz for the link.

"I don't set the rosters, I just make fun of the guy who does" - Rob Neyer

by Marc Normandin on Apr 4, 2006 4:08 PM EDT

Responses to questions .....
Thanks for the comment Cyril, much appreciated.
On the skill variance I actually made an error, I think. All I do is subtract the variance predicted by the binomial distribution from the variance in the data, sum of (x - x(bar))^2 / (N-1). What I did was subtract the standard deviations, not the variances, which was stupid since standard deviations are not additive because of the square root (doh!). This actually makes a big difference, as the number of BIP to regress 50% to the mean is ~250 - which is one season's worth for the average pitcher.
I can sort of rationalize this. The contingency table detects the skill so we know it is there (using 1 year's worth of data). And although we regress only 50% of the way to the mean, the effect is still very small as the standard deviation is 3 percentage points, which is tiny.
Hmmm .... but you'd have thought it more likely that a year-to-year correlation would pick it up, but it doesn't. I'd be interested to see what we pick up with more data points.
Anyway I have amended the article to reflect the correction. Do you have any thoughts? I'd definitely appreciate input here. Something doesn't quite feel right.
So, what I am saying now is that it would take 290 BIP before you would regress to 50% of the mean of the pitcher's cumulative LDIP rate over that period.
In terms of run value: if you assume that the average LDIP goes for a double (i.e. 0.7 runs), then with only 25% resulting in an out of the player at the plate, that roughly translates to 0.5 runs per line drive. So a difference of 4 line drives per year from a particular pitcher translates to about 2 runs - which is about 1/5th of a win. (All the above numbers are very rough but feel in the right ballpark.)
On fielding ...
To be honest I haven't given it too much thought. Line drives are normally hit hard, so you need better positioning (possibly) and certainly better reactions / awareness. Because line drives have less time in the air than fly balls I'd expect it would be harder to prevent line drives dropping safe in parks with more "cavernous" outfields.

by John Beamer @ Beyond the Box Score on Apr 4, 2006 4:16 PM EDT

Heteroskedasticity
Scareduck,

Heteroskedasticity shouldn't be a problem, I don't think, as I am using a non-parametric test for that part (chi-square).

by John Beamer @ Beyond the Box Score on Apr 4, 2006 5:13 PM EDT

BIP Rate Per Season
Hi Andy

Thanks for your comments. The method you outlined above was what I used (weighting for average BIP).

However, the average BIP (for BIP > 40) I found was ~300 - not 140.

I am calculating BIP essentially as BFP - BB - HBP - K - HR.

John

by John Beamer @ Beyond the Box Score on Apr 6, 2006 2:18 AM EDT

Averages
I'm pretty sure that something is not correct here.  There were 516 pitchers in 2004 with at least 40 balls hit into play.  On average, they had 254 balls hit into play.  So an average pitcher, one who had 254 balls hit into play, should expect his LDIP rate to include a random factor of +- sqrt(.193*.807/254) = .025.

If we're considering players who had at least 40 balls hit into play in both seasons, the average increases to 304 balls hit into play in 2004 for 375 pitchers, which by the same math would imply an expected random factor of +-.023.

But this is wrong.  If you're trying to compute the expected standard deviation of a sample of players, you need to average the random variance of all those players.  In other words, you take the average of .193*.807/NBIP, and then take the square root of this value.

So in simple terms, you're averaging 1/NBIP, not NBIP.  In this case, the average of 1/NBIP equals .0071 = 1/141 for the 516 pitchers with 40+ BIP in 2004, or .0056 = 1/179 for the 375 pitchers with 40+ in both seasons.  For these samples, the expected random variation is thus +-.033 and +-.029, respectively.
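The distinction between averaging NBIP and averaging 1/NBIP can be sketched in a few lines. The BIP counts below are invented purely for illustration (they are not the 2004 data), and the function names are hypothetical.

```python
from math import sqrt


def random_sd_from_mean_bip(mean_rate, bip_counts):
    """Luck-only standard deviation using the average BIP (the flawed shortcut)."""
    mean_bip = sum(bip_counts) / len(bip_counts)
    return sqrt(mean_rate * (1 - mean_rate) / mean_bip)


def random_sd_from_mean_inverse_bip(mean_rate, bip_counts):
    """Luck-only standard deviation using the average of 1/BIP, as described above."""
    mean_inverse = sum(1 / n for n in bip_counts) / len(bip_counts)
    return sqrt(mean_rate * (1 - mean_rate) * mean_inverse)


# Made-up sample with a mix of low- and high-workload pitchers.
example_bips = [40, 100, 300, 600]
print(round(random_sd_from_mean_bip(0.193, example_bips), 3))
print(round(random_sd_from_mean_inverse_bip(0.193, example_bips), 3))
```

Because low-BIP pitchers contribute far more noise than high-BIP pitchers remove, the second figure is always at least as large as the first whenever workloads differ.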

Now, I don't know if your final calculations were done for pitchers who had 40+ BIP in 2004, or if you required both years (2004 and 2005).  But regardless, the random standard deviation should be around .03, not the .022, and thus you have little (if any) evidence that some major league pitchers are particularly good or bad at preventing line drives.

by AED on Apr 6, 2006 6:09 PM EDT

Re: sensitive
So is this just a case of something that needs a great deal more data to work with in order to see if it is an identifiable skill?

by Marc Normandin on Apr 8, 2006 12:00 PM EDT

Welcome to the site.
Reading "The Book" right now - very interesting.

by salb918 on Apr 11, 2006 11:20 AM EDT

As am I
I really enjoy what I've read so far, Andy.

by Marc Normandin on Apr 11, 2006 7:06 PM EDT

My take on this ...
I still feel that preventing LDIP is a skill. The chi-squared analysis does indicate the possibility that a skill element exists. And the regression to the mean also supports some skill element on 2 years of data (which is not enough).

I am afraid it is a case of waiting and collecting more data in 2006/7. I might try and see if there is any other available data out there (some of the stuff used in fielding systems might work) and see if I can progress this over the coming months.

Any ideas will be gratefully received.

by John Beamer @ Beyond the Box Score on Apr 11, 2006 5:12 PM EDT
